System Performance Analysis Considering Human-related Factors
by
Ashkan Corey Kiassat
A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Graduate Department of Mechanical and Industrial Engineering
University of Toronto
© Copyright by Ashkan Corey Kiassat, 2013
System Performance Analysis Considering Human-related Factors
Ashkan Corey Kiassat
Doctor of Philosophy
Department of Mechanical and Industrial Engineering
University of Toronto
2013
ABSTRACT
All individuals are unique in their characteristics. As such, their positive and negative
contributions to system performance differ. In any system that is not fully automated, the
effect of the human participants has to be considered when one is interested in the
performance optimization of the system. Humans are intelligent, adaptive, and learn over
time. At the same time, humans are error-prone. Therefore, in situations where human and
hardware have to interact and complement each other, the system faces advantages and
disadvantages from the role the humans play. It is this role and its effect on performance
that is the focus of this dissertation.
When analyzing the role of people, one can focus on providing resources to enable the
human participants to produce more. Alternatively, one can strive to make errors less frequent and less impactful. The focus of the analysis in this dissertation is the latter.
Our analysis can be categorized into two parts. In the first part of our analysis, we
consider a short term planning horizon and focus directly on failure risk analysis. What can
be done about the risk stemming from the human participant? Any proactive steps that can be taken will have the advantage of reducing risk, but will also have costs associated with them. We develop a cost-benefit analysis to enable a decision-maker to choose the optimal
course of action for revenue maximization. We proceed to use this model to calculate the
minimum acceptable level of risk, and the associated skill level, to ensure system
profitability. The models developed are applied to a case study that comes from a
manufacturing company in Ontario, Canada.
In the second part of our analysis, we consider a longer planning horizon and are
focused on output maximization. Human learning, and its effect on output, is considered. In
the first model we develop, we use learning curves and production forecasting models to
optimally assign operators, in order to maximize system output. In the second model we
develop, we perform a failure risk analysis in combination with learning curves, to forecast
the total production of operators. Similar to the first part of our analysis, we apply the
output maximization models to the aforementioned case study to better demonstrate the
concepts.
To the one who inspired me to embark on this journey
ACKNOWLEDGEMENTS
The past four years have been a fantastic journey and certainly one of the best periods of
my life. Having been away from school for 11 years, there were many difficulties along the
way, academic and otherwise. But getting to the end is even sweeter because of all the
challenges. I made the decision to start this journey with much hesitation and uncertainty; I
look back and consider my decision to be one of the best things I have done in my life.
Professor Andrew K. S. Jardine, you encouraged me to join the lab and to pursue this
dream, right from the very first email I got from you. Thank you for having me at C-MORE
and the support you have provided along the way. I have very much enjoyed the freedom
you granted me along the way to pursue my own research and teaching interests.
Dr. Dragan Banjevic, your extraordinary support throughout the last four years needs an
enormous acknowledgement. You are amazingly sharp in many subject matters and this,
combined with your good heart and your great willingness to help, makes for the best
support a PhD student can ask for. Your assistance has been absolutely essential along the
way. Thank you for always being there.
My great friend, Dr. Nima Safaei, you are brilliant; thank you for sharing your knowledge
with me. You have guided me and helped me mature in performing research and writing
papers. We have become great friends over the last four years and I value this friendship
very much. I wish for this friendship and our research collaborations to continue for years to
come.
Professor Mark Chignell, I value your wisdom, your big-picture thinking, and the fact that
you have always challenged me to be a better scientist. You have always made time for me,
despite your extremely busy schedule. Your great support has been beyond just serving on
my committee. Thank you for the consulting opportunities and your continuous support in
my search for an academic position. Your kindness is much appreciated.
Professor Ann Armstrong, you have always made sure my research remains people-centric.
In the end, I want the human resources of an organization to make a positive contribution
to the success of the organization. You taught me to think of the people in a positive light,
to ensure my work contributes to the nurturing of the human resources, rather than
thinking of them as a nuisance and a source of error. Thank you for providing this alternate
mindset. You also provided me with my first teaching opportunity. Thank you for the trust.
You helped me find my calling in life.
Dr. Bob Platfoot, my external examiner, thank you for taking an in-depth look at my thesis
and providing great feedback. Your positive criticism and comments to reinforce my
findings and assumptions have strengthened my work.
Professor Roy Kwon, thank you for participating in the oral examination procedure. You
came in at the late stages and made time to become familiar with my work. Your comments
are much appreciated.
To my friends at C-MORE, in particular my new sisters Janet and Lorna, thank you for
helping me keep my sanity. I very much value our friendships and will always remember
your support and acts of kindness.
To my mom, dad, and brother, I feel blessed to have you. Thank you for your support; I am
glad to celebrate this achievement with you and to see your pride. My spirit seeks
challenges and I always aim for the top. I have these values because of you.
To my courageous aunt, Shahraz, so much of what I am and what I have is because of you.
You have always been there for me, no matter what. Thank you for all your kindness
throughout my entire life.
To Nirvana, eshgham, now and always, thank you for being you, thank you for pushing me
to do this, to make the best of myself. You were the inspiration that got me going on this
path. Having found my calling in life, and the subsequent happiness for years to come, is
because of you.
PREFACE
The following papers have resulted from the work discussed in this dissertation:
1. Kiassat, C., Safaei, N., (January 2012). Optimizing Maintenance Policies When Human-
Related Factors Are Included in the Proportional Hazards Model. European Journal of
Operational Research, second review.
2. Kiassat, C., Safaei, N., Banjevic, D. (September 2012). Effects of Operator Learning on
Production Output: A Markov Chain Approach. Journal of the Operational Research
Society, second review.
3. Kiassat, C., Konstandinidou, M., (May 2010). Recognizing Significant Human-Related
Factors Affecting System Reliability. Production & Operation Management Society
(POMS) conference, Vancouver, BC, Canada. Abstract # 015-0109.
4. Centrone, D., Kiassat, C., Garetti, M., Banjevic, D., Jardine, A., (May 2010). Proportional
Hazard Model: A Valuable Methodology for Sustainable Manufacturing. Proceedings of
Maintenance for Sustainable Manufacturing (M4SM) conference, Verona, Italy. 51-56.
5. Kiassat, C., Safaei, N., (September 2009). Integrating Human Reliability Analysis into a
Comprehensive Maintenance Optimization Strategy. Proceedings of World Congress on
Engineering Asset Management (WCEAM), Athens, Greece. 561-566.
6. Kiassat, C., Banjevic, D., Safaei, N. Using the Profitability Threshold Calculated from a
Risk of Failure Model to Determine the Minimum Level of Human-related Factors.
(pending submission, extension of paper under second review at European Journal of
Operational Research).
Contents
ABSTRACT ............................................................................................................................................... ii
ACKNOWLEDGEMENTS .......................................................................................................................... v
PREFACE .............................................................................................................................................. viii
LIST OF TABLES .................................................................................................................................. xi
LIST OF FIGURES ................................................................................................................................... xiv
ABBREVIATIONS ................................................................................................................................... xvi
1. INTRODUCTION .................................................................................................................................. 1
2. QUANTIFICATION OF HUMAN-RELATED FACTORS ............................................................................ 8
2.1. Literature review on quantification techniques .......................................................................... 8
2.2. Usage of Critical Incident Technique ......................................................................................... 15
2.2.1. Selection of Experts ............................................................................................................ 18
2.2.2 Data Collection Tools........................................................................................................... 20
2.3. Concluding remarks ................................................................................................................... 22
3. FAILURE RISK ANALYSIS .................................................................................................................... 23
3.1. Introduction .............................................................................................................................. 24
3.2. Literature Review ...................................................................................................................... 26
3.3. Evaluating Intervention Methods for Human-Related Risk ...................................................... 28
3.4. Developing an Evaluation Model for Intervention Methods .................................................... 30
3.5. Determining the minimum skill level ........................................................................................ 35
3.6. Failure Risk Analysis – An Empirical Study ................................................................................ 41
3.6.1. Quantification of skill and shift work ................................................................................. 46
3.6.2. Analyzing the risk of failure ................................................................................................ 52
3.6.3. Odds estimates ................................................................................................................... 60
3.6.4. Evaluation model for intervention methods ...................................................................... 62
3.6.5. Expanded data set, additional factors ................................................................................ 74
3.6.5.1. Discussion on the model’s terms ................................................................................ 77
3.6.5.2. Procedure for developing the model .......................................................................... 79
3.6.5.3. A more parsimonious PHM ......................................................................................... 82
3.6.5.4. Revenue Function and Discussion ............................................................................... 84
3.7. Concluding Remarks and Future Work ..................................................................................... 89
4. OPTIMAL OPERATOR ASSIGNMENT ................................................................................................. 92
4.1. Literature review ....................................................................................................................... 94
4.2. Model Development ................................................................................................................. 96
4.3. Optimization Model ................................................................................................................ 100
4.4. Empirical Study ........................................................................................................................ 101
4.4.1. Predicting Output in Terms of Human-related Factors .................................................... 102
4.4.2. Learning Curves ................................................................................................................ 103
4.4.3. Revenue Model Using Regression Equations and Learning Curves ................................. 107
4.4.4. Optimization Model and Discussion ................................................................................ 108
4.4.5. Optimal Solution Compared to Solutions of Other Methods .......................................... 108
4.4.6. Sensitivity Analysis ........................................................................................................... 110
4.5. Concluding Remarks and Future work .................................................................................... 116
5. EFFECTS OF OPERATOR LEARNING ON PRODUCTION OUTPUT ..................................................... 119
5.1. Literature Review .................................................................................................................... 122
5.2. Markov Chain Approach .......................................................................................................... 124
5.3. Empirical Study ........................................................................................................................ 131
5.3.1. Model Validation .............................................................................................................. 137
5.4. Concluding Remarks and Future Work ................................................................................... 141
6. CONCLUSION .................................................................................................................................. 145
6.1. Central Ideas and Contributions ............................................................................................. 145
6.2. Concluding Remarks and Recap .............................................................................................. 148
REFERENCES ....................................................................................................................................... 152
APPENDIX A1: Observational Study ................................................................................................... 167
APPENDIX A2: System Experts Assessing Operators .......................................................................... 170
APPENDIX A3: Self-assessment Questionnaire .................................................................................. 172
APPENDIX B: Discussion on Logistic Regression as a Validation Tool ................................................ 175
APPENDIX C: Discussion on Obtaining the PHM ................................................................................ 180
APPENDIX D: Goodness of Fit of Regression Models for Predicting Output ...................................... 188
APPENDIX E: Model Properties .......................................................................................................... 192
APPENDIX F: Model and Property Generalization with Initial Conditions ......................................... 194
APPENDIX G: Data Set Used for Empirical Studies ............................................................................. 198
LIST OF TABLES
3.1: Initial Shift-to-Shift and Operator-to-Operator Differences in Failure Occurrences 54
3.2: Summary of Estimated Parameters 55
3.3: Univariate and Bivariate Statistics on Skill Components 57
3.4: Summary of Estimated Parameters for Ring gear 58
3.5: Odds Estimates 61
3.6: Odds Estimates 61
3.7: Combinations of intervention methods to be considered 66
3.8: Values of and the corresponding function value 70
3.9: Value of for the four cases 72
3.10: original variables contained in covariates 73
3.11: Summary of Estimated Parameters 76
3.12: Model Selection Method and Results 81
3.13: Summary of Estimated Parameters 82
3.14: Summary of Estimated Parameters 83
3.15: Variables represented by the covariates in the function 87
4.1: Regression Equation Coefficients, significant at p < 0.01 103
4.2: Learning Curves of Operators’ Skill Components 107
4.3: Operators’ Expected Quarterly Revenues 108
4.4: Operators’ Skill Components 109
4.5: Comparing system revenue under various OA approaches 110
4.6: Operator Assignment for Different Gear Prices 112
4.7: System Revenue under various OA Scenarios 114
4.8: System Revenue under different production times 116
5.1: Regression Equation Coefficients, significant at p < 0.01 133
5.2: Learning curves of operators’ skill components 134
5.3: Expected production output for each operator on each machine 135
5.4: Comparing Alpha’s revenue under various OA approaches 136
5.5: Comparing obtained revenue under approaches of Chapters 4 and 5 138
5.6: Comparing model results with actual production volumes 140
B1: Summary of Estimated Parameters 176
B2: Summary of Estimated Parameters 178
C1: PHM parameter estimation, using all variables related to operator experience level 180
C2: PHM parameter estimation continued, variable “Experience” is eliminated 181
C3: PHM parameter estimation continued, variable “ExpAna” is eliminated 181
C4: PHM parameter estimation, final model for the “Experience” group 182
C5: PHM parameter estimation continued from Table C1, variable “ExpAna” is eliminated 182
C6: PHM parameter estimation continued, variable “Experience” is eliminated 183
C7: PHM parameter estimation continued from Table C1, variable “ExpV2” is eliminated 183
C8: PHM parameter estimation continued, variable “Experience” is eliminated 184
C9: PHM parameter estimation, final model for the “Social” group 184
C10: PHM parameter estimation, final model for the “Analytical” group 185
C11: PHM parameter estimation, final model for the “binary terms” group 186
C12: PHM parameter estimation, final model, with all four groups combined 187
D1: Indicators of Multicollinearity, Independence of Errors, and Influential Cases 191
G1: Usage of data set in the empirical works presented throughout the dissertation 201
G2: Data set sample for Chapter 3 analysis, using the EXAKT software 202
G3: Data set sample for Chapter 4 analysis, using the SPSS software 203
LIST OF FIGURES
1.1: General framework of topics in the dissertation 6, 145
3.1: General framework of approach discussed in this chapter 26
3.2: Relationship between function and expected net revenue 36
3.3: Relationship between and , the left and right side of Eq. (7) 38
3.4: Relationship between terms of Eq. (7) 39
3.5: Combinations of variables of interest, impacting expected net revenue 40
3.6: value of variable of interest to result in profit 41
3.7: Courses of action under different scenarios 67
3.8: Optimal Strategy: Stop Machine 68
3.9: Minimum skill set required to achieve positive net revenue 73
3.10: Optimal strategy under various conditions 86
3.11: Minimum skill set required to achieve positive net revenue 88
4.1: Discussions in chapter 3 focused on failure risk analysis given the machine and operator characteristics and operator assignments to machines 93
4.2: Discussions in chapter 4 to focus on optimal operator assignment, given machine and operator characteristics 94
4.3: Quarterly production of two operators, with different projected learning curves 99
4.4: Learning curve for aggregate analytical skill scores of all operators over all weeks 104
4.5: Learning curve for aggregate social skill scores of all operators over all weeks 106
5.1: Typical state space for 127
D1: Scatter Plot of Standardized Residuals for checking assumption of random errors and homoscedasticity 190
D2: Normal P-P Plot of Standardized Residuals for checking assumption of normality 190
E1: Relationship between and 192
E2: Depiction of a possible history of state where 193
F1: Relationship between and 195
F2: A possible depiction of history of state where 197
ABBREVIATIONS
In order of appearance:
HR: Human-Related
DM: Decision maker
PSF: Performance Shaping Factors
CPC: Common Performance Conditions
THERP: Technique for Human Error Rate Prediction
HRA: Human Reliability Analysis
CREAM: Cognitive Reliability Error Analysis Method
MMI: Man-Machine Interface
CIT: Critical Incident Technique
PHM: Proportional Hazards Model
MR: Machine-Related
MTBF: Mean Time Between Failure
MTTR: Mean Time To Repair
AIC: Akaike’s Information Criterion
OA: Operator Assignment
LP: Linear Programming
VIF: Variance Inflation Factor
1. INTRODUCTION
There are very few systems that can be operational without the need for human
interaction and decision making. This holds regardless of the industry, although the degree of human involvement may differ from one industry to another. There are advantages and
disadvantages associated with human interaction. Humans are versatile and adaptive;
during non-routine or emergency situations, they can improvise and choose the best course
of action. However, humans can experience boredom, fatigue, distraction, panic, or simply
make the wrong decision. When one is looking for ways to improve the performance of a
human-machine system, serving the needs of the humans, such as training or motivational
programs, can be just as effective as providing superior technology or additional hardware.
Alternatively, when one is analyzing the risk of failure, considering the factors that affect
the rate and extent of human error may be as significant as the failure of various hardware
components.
The effect of humans on the performance of a system can be present in any industry, as
long as a process and a unit of output can be defined. An analyst may consider the role of
nurses and the quality of patient care, the operators on an automotive manufacturing
assembly line, or the drivers of haul trucks in a mining operation. In each case, a skilled and
motivated person, working under the right conditions, is likely to have a positive
contribution to the overall system performance. The opposite is also true: a person who
lacks the necessary skills, is unmotivated, or is working under less-than-ideal working
conditions, may diminish the performance of the system. The discussions in this dissertation
are more focused on the effect of humans on failure risk analysis. As such, we focus on
some of the factors that can lead to human error, contributing to failure risk. We will
restrict the discussions to the role of those humans affecting the final system output, not
the managing decision makers or the support personnel. For instance, in the
aforementioned automotive manufacturing assembly line example, our focus is on the
machine operator, not the production supervisor, scheduler, or the maintenance
tradesman.
The work presented in this dissertation is better suited for industries such as
manufacturing, mining, and possibly certain areas of health care, where human error is
more frequent, but the consequences are not catastrophic. The same cannot be said for
certain aspects of the nuclear industry, where human error is quite infrequent but, upon occurrence, may cost lives or cause extreme environmental damage. In such cases, we may not be able to put a dollar value on the consequences, and they are to be avoided at all costs. But in the manufacturing example, where the error leads to the machine breaking down for an hour, or to fewer parts being produced over the next shift, the value of being proactive can be gauged by the amount of savings or additional revenue. A certain amount of risk is therefore tolerable, and it becomes part of a cost-benefit analysis to determine the maximum worth of eliminating that risk. Given the high-probability, low-consequence nature of the environments suited to our work, the cost-benefit analysis has a financial focus: we aim to maximize the net revenue of the system. A safety or accident-prevention focus would instead make sense in a low-probability, high-consequence environment.
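As a rough illustration of this cost-benefit logic (a sketch, not a model from this dissertation; the function name and all figures are hypothetical), an intervention is worth taking only when its cost is below the expected savings from the reduced failure risk:

```python
def expected_net_revenue(revenue, p_failure, failure_cost, intervention_cost=0.0):
    """Expected net revenue for one period, treating risk as a probability of failure."""
    return revenue - p_failure * failure_cost - intervention_cost

# Baseline: no intervention.
baseline = expected_net_revenue(revenue=10_000, p_failure=0.30, failure_cost=8_000)

# With an intervention (e.g. training) that lowers the failure probability.
with_training = expected_net_revenue(revenue=10_000, p_failure=0.10,
                                     failure_cost=8_000, intervention_cost=1_000)

# The intervention pays off: expected savings 0.20 * 8000 = 1600 > 1000 cost.
print(with_training > baseline)  # True
```

Under this framing, the maximum worth of eliminating the risk entirely is simply the product of the failure probability and the failure cost, which bounds what a decision-maker should be willing to pay.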
The aforementioned cost-benefit analysis is the focal point of the work we present in
Chapter 3. In Chapter 5, we consider machine production over a planning horizon as a
stochastic process and aim to forecast the production output over the period. In industries
such as the nuclear industry, where human error leading to hardware failure is extremely improbable or infrequent, a stochastic process may be used to predict risk, but not production: the system output over the planning horizon would largely be a matter of how many resources we allocate to the process, and we would most likely achieve the output we expect to get. Therefore, the work
we present here is suitable for an industry, such as manufacturing, where human errors
occur frequently, and each error has a non-catastrophic consequence, making the system
appropriately represented by a stochastic process.
In the discussion thus far, we have focused on human-machine systems where human
error is frequent, with non-catastrophic consequences. But our work is not solely about
human error. We can also consider a context where there are varying skill levels among the
human participants and we are interested in how well each individual performs their
assigned tasks. In the aforementioned healthcare example, we may have a group of nurses,
each of whom can provide a certain level of service and attend a number of patients. If skill
levels can be compared and there are some nurses who accomplish more tasks per shift
than others, then we may be able to employ a personnel scheduling technique that utilizes less manpower.
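A toy sketch of that assignment idea (hypothetical operators, machines, and output figures; the dissertation's actual assignment model is the subject of Chapter 4): enumerate the possible assignments of people to tasks and keep the one with the highest total output.

```python
from itertools import permutations

# Expected weekly output (units) of each operator on each machine; illustrative only.
output = {
    "op1": {"m1": 40, "m2": 55},
    "op2": {"m1": 50, "m2": 45},
}
operators = list(output)
machines = ["m1", "m2"]

# Brute force is fine for small crews; larger problems call for an
# assignment algorithm or a linear program.
best = max(permutations(machines),
           key=lambda assign: sum(output[op][m] for op, m in zip(operators, assign)))
print(dict(zip(operators, best)))  # {'op1': 'm2', 'op2': 'm1'}
```

Here the best assignment sends each operator to the machine where the crew's combined output is highest (105 units), even though op2 individually produces more on m1 than op1 does.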
Human error is a large realm to analyze, and many factors can affect its frequency and magnitude. The factors we have focused on in this dissertation are
expertise and working conditions, as pertaining to fatigue. Both of these factors, among
others, affect human performance. The challenges in considering human performance in
general, and expertise in particular, are twofold. The first difficulty is measuring the Human-
Related (HR) factors, such as expertise. For a variable to be used in a business decision
model, it must be quantifiable. Traditionally, HR factors have been described qualitatively.
But if a factor, such as expertise, results in additional production, or less equipment
downtime, that factor ought to be expressed quantitatively. The conversion of this
traditionally qualitative attribute to a quantitative value is a challenge. Even though we do not make a direct contribution to this quantification process in this dissertation, we dedicate Chapter 2 to it, as it plays an important role.
The second difficulty is in the use of HR factors in decision models. While a decision
maker, herein referred to as DM, should be familiar with measurements such as uptime,
availability, and throughput, the same cannot be said about the productivity of the human
participant (herein referred to as operator) interacting with the system. Even after an HR
factor is quantified, a DM may not know how to best use the factor in the decision making
process.
Main Contributions:
The most important finding in this dissertation is that human-related factors have a significant effect on the performance of a human-machine system. Performance cannot
be analyzed without considering the factors related to the humans interacting with the
machinery. Chapters 3, 4, and 5 cover the main contributions of this dissertation by
discussing performance models that incorporate HR factors to help a DM choose a path that
optimizes system performance. This optimization may be based on any number of factors,
such as revenue or system availability. Chapters 3, 4, and 5 provide further detail
supporting the following list of contributions:
1. We include HR factors as covariates in a proportional hazards model (Chapter 3).
2. We develop a model to enable the DM to perform a cost-benefit analysis to choose the
revenue-maximizing intervention method for reducing the probability of failure
stemming from the operators (Chapter 3).
3. We introduce a method to determine a threshold for system profitability based on the
probability of failure and the expected net revenue (Chapter 3).
4. We devise a method to calculate the minimum level for the HR factors included in the
proportional hazards model in order to ensure system profitability (Chapter 3).
5. We create a methodology to optimally assign operators to machines based on the
sensitivity of the machine to HR factors as well as the operators’ current and forecasted
characteristics (Chapter 4).
6. We use a Markov chain approach to forecast production output, considering operator
learning (Chapter 5).
A more detailed discussion of the list of contributions is provided in Chapter 6.
The general framework of the work presented in this dissertation appears in Figure 1.1:
Once the main HR factors are identified and quantified, they can be used as variables in
the decision-making process to optimize system performance. The first type of analysis
considered in Chapter 3 has a short-term scope. We develop a model to measure the failure
risk stemming from HR factors and then provide a cost-benefit analysis for choosing
amongst the various intervention methods to reduce risk. Chapters 4 and 5 examine a long
planning horizon. Chapter 4 is focused on the optimal assignment of the human participants
to the tasks for which they are best suited. The goal is to assign the personnel in order to maximize
total revenue. This is done while forecasting the output and learning curves of the various
human participants over the planning horizon. In Chapter 5, we use insights from Chapters 3 and 4 to perform an analysis that provides a decision maker with an output estimate for each human participant over a planning horizon. There are many uses for this analysis,
such as cost-benefit analysis of training programs, personnel assignment, inventory
management, and work scheduling.
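The production-forecasting idea just described can be sketched in miniature (a hypothetical three-state learning chain with illustrative numbers, not the model developed in Chapter 5): an operator's skill level is a Markov state, learning moves the operator toward higher states, and each state carries an expected output per period.

```python
# States: 0 = novice, 1 = intermediate, 2 = expert (illustrative).
P = [[0.6, 0.4, 0.0],            # per-period transition probabilities (learning)
     [0.0, 0.7, 0.3],
     [0.0, 0.0, 1.0]]
output_per_state = [20, 35, 50]  # expected units produced per period in each state

def forecast_total_output(start_state, periods):
    """Expected total production over the planning horizon."""
    dist = [0.0, 0.0, 0.0]
    dist[start_state] = 1.0
    total = 0.0
    for _ in range(periods):
        # Expected output this period, then advance the state distribution.
        total += sum(p * o for p, o in zip(dist, output_per_state))
        dist = [sum(dist[i] * P[i][j] for i in range(3)) for j in range(3)]
    return total

print(round(forecast_total_output(start_state=0, periods=4), 1))  # 113.3
```

An estimate of this kind, one per operator, is exactly the input a decision maker would need for the training, assignment, inventory, and scheduling decisions listed above.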
Figure 1.1: General framework of topics in the dissertation
[Diagram: human-related factors, once quantified, feed two streams of analysis of the performance of human-machine systems: short-term analysis of failure risk and intervention, and long-term planning through optimal operator assignment and production forecasting based on failure risk and operator learning.]
To end this introductory chapter, we define several terms which we have used in this
chapter and will continue to use throughout this dissertation. These terms may be generally
understood but need to be defined for our specific context. They are as follows:
Failure: The condition when the machine is unavailable for production due to an
unplanned event. Within the context of failures caused by the operators, failure is the
consequence of any operator-related mistake that takes the machine out of production.
Risk of failure: In many cases, risk is thought of as consequence multiplied by the
probability of occurrence. But in our case, we simply mean it to be the probability of
failure.
Error: an operator's incorrect performance of a required task.
Intervention: a decision-maker stepping in proactively to reduce the probability of machine failure.
Operator expertise: 1) the ability to operate the machine; 2) the ability to troubleshoot when a problem is at the operator level, and the presence of mind to ask for a maintenance tradesperson's assistance when the problem needs further technical expertise.
System: short for human-machine system; consists of human resources as well as physical assets and hardware, working together to produce the output that is the objective of the business establishment.
2. QUANTIFICATION OF HUMAN-RELATED FACTORS
Many human-related factors are subjective. A machine is either running or not. But a
person’s ability to perform a task has various levels. In addition, no two people are exactly
the same. However, if we are to use human-related factors in mathematical models for
performance evaluation of systems, we need to turn the qualitative factors into
quantitative ones. Where experts are reliable and readily available, it is quite prevalent in
the literature to find performance evaluation and modeling using expert judgement (Kariuki
and Lowe, 2007; Landy and Trumbo, 1975). However, this is certainly not the only method.
In the next section, we present an overview on some existing and common methods.
2.1. Literature review on quantification techniques
Many factors influence human performance in a complex human-machine system
(Cacciabue, 2000); these factors can be both internal and external to the human operator. A
global view of these factors must include a wide range of elements, such as:
1) The human characteristics, such as physical, psychological and mental conditions,
stress and fatigue levels;
2) Working environment and the equipment state, such as operational conditions,
design, maintenance, and availability and reliability of equipment;
3) Managerial and organizational factors, including the safety culture and policy, management commitment, procedures and training, risk assessment, and incident analysis. Other important factors relate to the cultural environment, such as the national culture and societal values.
The field of human reliability analysis originated in probabilistic safety assessment. It was developed as an approach that considers all possible accident
scenarios in order to probabilistically evaluate the overall system safety (Kim and Jung,
2003). To perform a human reliability analysis, an analyst must identify those factors that
are the most relevant and influential in the jobs studied. The labels used for these factors
differ among different approaches; they are most often called Performance Shaping Factors
(PSF), or other close variations, such as Performance Influencing Factors or Performance
Influencing Context (Kim et al., 2004). In some recent methodologies, the name changes completely to Common Performance Conditions (CPC) or Error Forcing Context. The basic difference among these methodologies lies in the way the factors are used: they characterize the circumstances under which human actions or tasks take place, and they appear very early in the quantification process, when the PSFs that affect the final outcome of the Human Reliability Analysis are identified.
Considerable effort has gone into defining potential PSFs in the different approaches. The Technique
for Human Error Rate Prediction (THERP) is one of the most widely used approaches in
Human Reliability Analysis (HRA), (Swain and Guttman, 1983). It was conceived mainly for
the nuclear industry but it has been applied in other industrial contexts as well. The list of
PSFs in THERP is quite exhaustive. However, in the quantification step of THERP, and depending on the application, only a very limited number of PSFs is actually used, the most common of them being the available time, the level of stress, the task type, and the level of experience. Since THERP is the most widely used method and, in many cases, the one that served as the foundation for many other methods, its categorization is described in detail below.
The categorization of PSFs in THERP includes external and internal factors. External PSFs
are characterized as follows:
1. The situational characteristics:
architectural features,
quality of working environment,
work hours and work breaks,
shift rotation and night work,
availability/adequacy of special equipment/tools,
manning parameters,
organizational structure and actions by others,
rewards, recognition and benefits.
2. The task and equipment characteristics:
perceptual requirements,
motor requirements,
control-display relationships,
anticipatory requirements,
interpretation requirements,
decision making,
complexity,
frequency and repetitiveness,
task criticality,
long and short term memory requirements,
calculation requirements,
feedback,
dynamic versus step by step activities,
team structure,
man-machine interface.
3. The job and task instructions:
written and non-written procedures,
written or oral communications,
cautions and warnings,
work methods,
plant policies.
4. The psychological stressors:
suddenness,
duration of stress,
task speed,
task load,
high jeopardy risk,
threats,
monotonous work,
conflicts,
distractions.
5. The physiological stressors:
duration of stress,
fatigue,
pain,
hunger,
temperature extremes,
radiation,
G-force extremes,
oxygen insufficiency,
vibration,
disruption of circadian rhythm.
Internal PSFs are categorized into the following factors: previous training and
experience, state of current practice or skill, personality and intelligence variables,
motivation and attitudes, knowledge of required performance standards, gender
differences, physical condition, influence of family, and group identification.
In addition to the external and internal PSFs, there may be some ergonomic problems, such as poor design and layout of controls and displays, poor labeling of controls and displays in the control room, inadequate indication of plant status, presentation of non-essential information, and inadequate labeling outside the control room.
Other Human Reliability methods use frameworks based on the assessment of
interactions and the quantification of their impact on operators’ actions and performance.
A representative methodology of that type is Systematic Human Action Reliability
Procedure, by Hannaman and Spurgin (1984). The objective of this framework is to help in
defining the types of interactions that are important to risk or performance analysis and to
enable the analyst to incorporate them into the system analysis task.
Another family of methods is based on behavioural science and the Rasmussen model
that classifies operators’ behaviour into Skill-, Rule- and Knowledge-based (Rasmussen,
1982). Representative methodologies are Generic Error Modeling System, (Reason 1987),
and Systematic Human Error Reduction and Prediction Approach, (Embrey 1986), which
offer a behavioural classification of human errors. These techniques are based on a series of simple questions, usually presented in flowchart form, and aim to determine whether behaviour is likely to be skill-, rule-, or knowledge-based.
Of the many methods that have been developed over the years, the one that is best
suited for our type of analysis is Cognitive Reliability Error Analysis Method (CREAM),
(Hollnagel, 1998). Unlike first-generation human reliability techniques, which focused on error analysis, CREAM's focus is on performance prediction. This method can be specifically
tailored to the contextual situation (Stanton et al., 2005).
In the place of PSFs, the CREAM method uses a group of factors, or CPCs, to define sets
of possible error modes and probable error causes. The CPCs provide a comprehensive and
well-structured basis for characterizing the conditions under which the performance is
expected to take place. By using CREAM with its nine CPC families, one can focus more on
the plant-related and people-related factors. The performance prediction process continues
with the use of the taxonomy of Kim and Jung (2003). This work helps the analyst to
recognize important parameters that should be taken into account in each CPC category.
The taxonomy of Kim and Jung allows the analyst to take into account all the possible
performance shaping factors known from the literature and to combine them into a single
and global list.
The CPCs include factors that relate to the person (adequacy of training and experience,
time of the day – Circadian rhythm), to the working context (working conditions, adequacy
of Man-Machine Interface), to the company (adequacy of organization, availability of
procedures and plans), to the specific task (number of simultaneous goals, available time)
and to teamwork (crew collaboration quality). The CPCs are briefly presented as follows:
Adequacy of organization: defines the quality of the roles and responsibilities of
team members, additional support, organization communication systems, safety
management system, instructions and guidelines for externally oriented activities,
role of external agencies.
Working conditions: describes the nature of the physical working conditions, such as ambient lighting, glare on screens, noise from alarms, and interruptions from the task.
Adequacy of Man-Machine Interface (MMI) and operational support: defines the
MMI in general, including the information available on MMI and control panels,
computerized workstations, and operational support provided by decision aids.
Availability of procedures and plans: describes procedures and plans and includes
operating and emergency procedures, familiar patterns of response heuristics, and
routines.
Number of simultaneous goals: enumerates the number of tasks a person is
required to pursue or attend to at the same time (i.e., evaluating the effects of
actions, sampling new information, assessing multiple goals).
Available time: depicts the time available to carry out a task and corresponds to how
well the task execution is aligned with the process dynamics.
Time of day: denotes the time of day (or night) and describes the time at which the
task is carried out, in particular whether or not the person is adjusted to the current
time (circadian rhythm).
Adequacy of training and experience: describes the level and quality of training
provided to operators to teach new technology, or refresh old skills. It also refers to
the level of operational experience.
Crew collaboration quality: describes the quality of the collaboration between crew members, including the overlap between the official and unofficial structure, the level of trust, and the general social climate among crew members.
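To make the screening step concrete, here is a rough sketch of how ratings of the nine CPCs could be tallied into a CREAM-style control mode. The rating levels, count thresholds, and the mapping to modes are simplified illustrations, not the actual tables from Hollnagel (1998).

```python
# Illustrative CREAM-style screening: tally the nine CPC ratings and map the
# counts to a control mode. Thresholds and mode names are simplified examples.

CPC_NAMES = [
    "adequacy of organization", "working conditions",
    "adequacy of MMI and operational support", "availability of procedures and plans",
    "number of simultaneous goals", "available time", "time of day",
    "adequacy of training and experience", "crew collaboration quality",
]

def control_mode(ratings):
    """ratings maps each CPC name to 'improved', 'not significant', or 'reduced'."""
    reduced = sum(1 for r in ratings.values() if r == "reduced")
    improved = sum(1 for r in ratings.values() if r == "improved")
    if reduced >= 6:           # most conditions degrade performance
        return "scrambled"
    if reduced >= 3:
        return "opportunistic"
    if improved > reduced:     # conditions mostly support performance
        return "strategic"
    return "tactical"

ratings = {name: "not significant" for name in CPC_NAMES}
ratings["available time"] = "reduced"
ratings["adequacy of training and experience"] = "improved"
print(control_mode(ratings))  # tactical
```

In the full CREAM method the resulting control mode bounds an error-probability interval; here it serves only to illustrate how qualitative CPC judgements become a quantitative screen.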
2.2. Usage of Critical Incident Technique
Despite the abundance of literature on HRA methods, the scope of their application is
mostly in high-risk systems, such as nuclear power plants (Elmaraghi et al., 2008). In
addition, most HRA methods were originally developed and used for probabilistic safety
assessment of accidents. There is a lack of research that applies HRA methods in assessing
the probability of errors for direct workers in a manufacturing context (Bubb, 2005). This
gap in the manufacturing context, along with the common practice in the realm of
performance measurement to depend on judgmental indices, leads us to rely on expert
knowledge in assessing the performance of operators. It should be noted that the performance measurement techniques being discussed apply to continuous variables, such as operator skill. There are other human-related factors, such as being on a night shift (or not)
or certain days of the week, that do not require quantification and can simply be
represented by using indicator variables. At this point, we should define the term “expert”
as we use it in this chapter and throughout the rest of this dissertation. Experts are people
within the system who are aware of the aims and objectives of a given job and who see
people perform the job on a frequent basis. They are in a position to accurately judge the
performance of others.
The HRA methods discussed can provide an analyst with the probability of an event, such
as an explosion or a breakdown. This can be equivalent to what we are pursuing: the failure
of a machine. But this is machine-centric, not operator-centric. The method would provide
the probability of the breakdown of the machine, given the context. The operators working
on the machine are considered as part of the context, along with procedures, MMI, and
many other factors previously discussed. An expert measures the characteristics of the
company, such as “working conditions” or “availability of procedures”, to get an estimate
for the probability of error of a typical operator. These methods may not be sufficient for
analyzing the risk of failure of an individual operator.
We are not always interested in the risk of failure. At times, we may wish to forecast production output, in which case we would be interested in quantifying how well the operator can run the machine. We may be interested in the relative ability of one
operator compared to another. Therefore, the HRA methods provide us with a value for the
probability of the machine failure, whereas we are interested in measuring the ability of the
operator in performing the task.
Lastly, it should be mentioned that, even with the HRA methods, experts are involved.
Experts may be used to develop the standard tables, which lead to the probability figures.
Experts develop anchor points within the categories to be assessed, to reduce subjectivity.
And experts are used to perform the actual assessments of system characteristics.
In the discussions throughout this dissertation, we use an approach based on the Critical
Incident Technique (CIT) to quantify the technical and behavioural components of operator-
related factors. CIT consists of a set of procedures that are used to collect observations of
the human behaviour deemed to have a critical significance on the phenomenon being
analyzed. Each observed incident must meet certain pre-defined criteria. The CIT analyst
uses the set of observed incidents to solve practical problems (Davis, 2006).
In our analysis, CIT is an appropriate technique to use because it enables us to quantify
technical and behavioural components of factors related to the operators (Levine et al.,
1980; Levine et al., 1983). For us, a critical incident can be defined as a task that determines the success or failure of machine operation. Each feature of the technical and behavioural
component may affect the phenomenon differently. We wish to observe each of these features, and we need procedures in place to standardize how we perform the observations. We also need pre-defined criteria for assessing how well the operator performs the tasks
expected of him/her.
CIT has various stages: determining what constitutes a critical incident, making
observations from the participants, identifying the significant issues, decision-making on
possible solutions, and evaluation. In our analysis, we only make use of the first three
stages. We use statistics-based models, such as the Proportional Hazards Model, logistic
regression, and Markov chains, to perform the analysis and propose solutions for the
system, replacing the latter stages of CIT. Flanagan, the founder of CIT, mentions having
expert observers as a mode of collecting data (Butterfield et al., 2005; Flanagan, 1954). He
also mentions supervisors and experts in the field as possible observers. One of the ways
Flanagan advocated for data collection is questionnaires filled out by experts. The tools we
use when applying the CIT are questionnaires, observational studies, and technical tests.
The system experts we use, supervisors and manufacturing engineers, complete
questionnaires on technical and behavioural questions about the operators. System experts
also perform observational studies on the operators. Finally, the operators are also involved
as they complete a test that evaluates their technical knowledge. These tools are discussed
next in more detail.
2.2.1. Selection of Experts
Flanagan states that CIT requires observers who are aware of the aims and objectives of a given job and who see people perform the job on a frequent basis (Butterfield et al., 2005). System experts are interviewed about their
observations of the critical requirements of the job.
If the analyst decides to include the operators during the application of the CIT, the role
of the operators should be limited to technical tests. Studies have found that self-ratings
were more lenient than supervisory ratings (Conway and Huffcutt, 1997) or external
observations (Davis et al., 2006). Research using self-reports also runs into the problem of
social desirability, so labelled because questionnaire items may prompt responses that will
present the person in a favourable light. As a result, it may be advisable to limit any
assessment type questions to a factual/technical type only. This will limit the tests taken by
the operators to those with right-or-wrong or factual answers and will not contain any
behavioural or performance-prediction questions. Furthermore, similar to self-ratings, peer
ratings also appear to be more lenient than supervisory ratings (Borman, 1974). Thus, it
may be inadvisable to ask operators to rate the performance of their colleagues.
Individuals with more knowledge of the particular job’s details have been found to more
validly predict future performance (Wagner and Hoover, 1974). The management level
closest to the operators can be considered to be experts intimately familiar with the
human-machine system.
Since different raters may have different perspectives on performance that influence
their ratings (Borman, 1974), it may be better to use multiple experts, thereby reducing the
problem of same source variance. Problems arise when measures of two or more variables
are collected from the same respondents and the attempt is made to interpret any
correlation among them. This is the well-known problem of common method variance.
Because both measures come from the same source, any defect in that source
contaminates both measures, presumably in the same fashion and in the same direction.
Podsakoff and Organ (1986) discuss a method for dealing with common method variance: separation of measurement, that is, collecting the data at different times, at different locations, or through different media. Thus, the data collection reported later in this
document was repeated twice within a month, performed at different times of the shift.
The assessment process begins with consulting the system experts to identify the top HR
factors affecting system performance. We do not seek an all-inclusive list; Parker et al. (2001) argue that a universal list of factors may be infinite. But we can start with an initial selection according to the system experts and treat it as an open-ended
process where additional factors may be identified, prioritized and analyzed for a
progressively more comprehensive analysis. However, it would be possible to identify
categories of variables to be adapted and applied differently according to context.
The top HR factors identified can be checked against the list of “key human-centered
factors affecting worker performance” in the comprehensive review performed by Baines et
al. (2005).
2.2.2. Data Collection Tools
Similar to a study by Glick et al. (1986), we use three tools in our job analysis method.
These tools are: observational studies, questionnaires, and self-assessments. Examples of
these tools, as applied to our empirical study, appear in Appendices A1, A2, and A3,
respectively. We use multiple experts to answer questionnaires on performance and
behaviour prediction. We use a different set of experts to perform observational studies on
operators’ technical skills.
Whether one uses questionnaires, observational studies, or any other data collection
tools, there are a number of guidelines that should be followed in writing the individual questions/items. Statements in the body of the questionnaires and the self-test are to be kept simple, as short as possible, and in language familiar to the target respondents.
Keeping the questionnaire short is an effective means of minimizing response biases caused
by boredom or fatigue. We should avoid leading questions, as well as those that are
negatively worded, as they may result in biased responses. Issues related to the
homogeneity of items within each variable of measurement (Harvey et al., 1985), as well as
the number of response categories (Ornstein, 1998; Jenkins and Taber, 1977; Lissitz and
Green, 1975), have to be carefully considered in the design.
When the experts rate an operator, a forced-choice format is utilized. A study by Ornstein (1998) refers to other works and states that closed-form questions are superior to open-form ones. In this format, the rater is required to choose, from among a set of alternative descriptors (normally four items), the subset that is most characteristic of the ratee.
To provide the same frame of reference for the various assessors, we use anchoring.
Using the forced-choice format for the responses, we provide a numerical value to complement the subjective statement in each response category. This removes ambiguity for the respondent when a response could otherwise be interpreted in different ways. Rather than just stating "low production level" as a response category, one can state "low production level, less than 100 pieces per hour".
scale reliability improves with increased anchoring (Lam and Klockars, 1982; Bendig, 1952a,
1952b, 1953).
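As a small illustration of such anchored response categories, the sketch below maps hypothetical forced-choice descriptors, each carrying an explicit numeric anchor, to scores. The category labels and values are invented for this example.

```python
# Hypothetical anchored forced-choice item for rating an operator's
# production level. Each verbal descriptor carries a numeric anchor so that
# every rater scores against the same frame of reference.

PRODUCTION_ANCHORS = {
    "low (fewer than 100 pieces per hour)": 1,
    "moderate (100 to 150 pieces per hour)": 2,
    "high (150 to 200 pieces per hour)": 3,
    "very high (more than 200 pieces per hour)": 4,
}

def score_item(response, anchors=PRODUCTION_ANCHORS):
    """Convert a rater's forced choice into a numeric score."""
    if response not in anchors:
        raise ValueError(f"not an anchored category: {response!r}")
    return anchors[response]

print(score_item("low (fewer than 100 pieces per hour)"))  # 1
```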
The test/retest method is used to estimate reliability (each section of the analysis is repeated twice). For the various sub-sections, namely operator self-tests, expert questionnaires, and observational studies, we compare the two separate measurements and compute a correlation coefficient.
2.3. Concluding remarks
We describe a general method to quantify HR factors, many of which are often
subjective characteristics. The approach, based on the Critical Incident Technique, is
certainly not the only method available, but it is one that works well in our application of
quantifying the technical and behavioural aspects of human performance when reliable
experts are readily available.
There are many issues at play when using the proposed method, all of which may have a major influence on the results. It is important to select the appropriate subjects as the experts who will perform the assessments. The type of tool used can also play a major role. Examples of tools introduced in this chapter are questionnaires and observational studies. The method of delivery and data collection can be quite important. However, one of the most important considerations is the design of the items used within the tools, such as the actual questions (and answers) on the questionnaires. Careful design is necessary for
the wording of the questions, such as avoiding negatively-worded or leading questions. The
design of the answers remains important whether we want to ask open-ended questions or
use a closed-answer design to force the experts to select one of the pre-determined
answers. Lastly, there are techniques to use in the design of the tools to reduce common
method bias. Great care should be exercised when devising the method behind the
quantification process.
3. FAILURE RISK ANALYSIS
Consider the scenario where a set of operators with varying personal characteristics
have been assigned to a set of machines. In this chapter, we model changes in the performance of these machines, given the effects of skill and of fatigue caused by shift work and different days of the week. We analyze the risk of failure of the machines using the
Proportional Hazards Model (PHM), a common tool in maintenance optimization. The most
common usage of the PHM involves the presence of machine-related factors only. Once we
enhance the usage of this model to include human-related factors as covariates, it would be
natural to wonder whether the model may have any significant managerial impact. In cases
where the human element is responsible for a significant part of the overall risk, we discuss examples of proactive intervention measures a DM may take to mitigate that risk. In
addition, we develop a revenue model that provides a cost-benefit analysis for each
intervention measure considered.
Connection to Previous Chapter
Many HR factors, such as skill, are easily described qualitatively. But if an analyst is to
utilize a mathematical model, such as the PHM, to calculate the risk of failure, the HR factor
needs to be expressed quantitatively. The methods described in Chapter 2 can help the analyst achieve this quantification. This is a necessary first step before HR factors can be
used in mathematical and statistical models to help the DM improve system performance.
Main Contributions of this Chapter
1. We include HR factors as covariates in a proportional hazards model.
2. We develop a model to enable the DM to perform a cost-benefit analysis to choose the
revenue-maximizing intervention method for reducing the probability of failure
stemming from the operators.
3. We introduce a method to determine a threshold for system profitability based on the
probability of failure and the expected net revenue.
4. We devise a method to calculate the minimum level for the HR factors included in the
proportional hazards model in order to ensure system profitability.
3.1. Introduction
The performance of a human-machine system is a function of the performance of the hardware, as well as of the correct operation of the hardware by the operators. One indicator of performance is reliability, and to assess reliability, we can perform a failure risk analysis.
There are many reliability and failure risk analysis models that deal with machinery.
However, there are few that incorporate the effect of the human operators on uptime and overall performance.
The first step in our analysis is to have a model to incorporate both Machine-Related
(MR) and HR factors. This can provide us with an all-encompassing assessment of failure
risk. To achieve this, we use the PHM, a commonly used tool to model the time of failure of
equipment (Jardine et al., 1989; Vlok et al., 2002).
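To fix ideas, a Weibull-baseline PHM hazard can be sketched as below. This is only an illustration of the model form h(t | z) = (beta/eta)(t/eta)^(beta-1) * exp(sum of gamma_i z_i); the shape and scale parameters, the covariate coefficient, and the operator skill scores are made-up values, not estimates from this dissertation's data.

```python
import math

def weibull_phm_hazard(t, beta, eta, gammas, covariates):
    """Weibull-baseline PHM hazard: (beta/eta)*(t/eta)**(beta-1) * exp(sum(g*z))."""
    baseline = (beta / eta) * (t / eta) ** (beta - 1)
    return baseline * math.exp(sum(g * z for g, z in zip(gammas, covariates)))

# Same machine at working age t = 500 h; the single covariate is a quantified
# operator-skill score with a (hypothetical) negative coefficient, so higher
# skill lowers the hazard. MR covariates could be appended in the same way.
skilled = weibull_phm_hazard(500, beta=2.0, eta=1000.0, gammas=[-0.5], covariates=[1.8])
unskilled = weibull_phm_hazard(500, beta=2.0, eta=1000.0, gammas=[-0.5], covariates=[0.6])
print(skilled < unskilled)  # True
```

The multiplicative covariate term is what lets HR factors, once quantified as in Chapter 2, scale the machine's baseline hazard up or down.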
Once we evaluate the risk facing the system, we may choose to intervene and reduce or
eliminate it. Any intervention method pursued should have the benefit of reducing failure
risk; it would also have the disadvantage of incurring a direct cost. Given the trade-off
between risk reduction and direct cost, we develop a revenue model to perform a cost-
benefit analysis for choosing the best intervention method. Given the machine factors, such
as working age; operator factors, such as skill level; and the direct cost of the intervention
method as well as its risk reduction factor, the best intervention method is selected as the
one that results in the highest system revenue. In the absence of an analytical method for
choosing among intervention methods, subjectivity and personal biases enter into the
decision-making process, distancing it from being an evidence-based process. Providing a
systematic tool to a DM to choose the optimal intervention method is the main
contribution of this chapter.
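The trade-off can be sketched as a simple expected-revenue comparison across candidate interventions. The linear revenue form, the risk-reduction factors, and all monetary figures below are illustrative assumptions, not the chapter's actual revenue model.

```python
# Cost-benefit sketch: each intervention scales down the baseline probability
# of failure but incurs a direct cost; pick the one with the highest expected
# net revenue. All numbers are illustrative.

def expected_net_revenue(p_fail, revenue_if_up, loss_if_fail, cost):
    return (1 - p_fail) * revenue_if_up - p_fail * loss_if_fail - cost

p0 = 0.20  # baseline probability of failure over the planning horizon
interventions = {
    "do nothing": {"risk_reduction": 0.0, "cost": 0},
    "refresher training": {"risk_reduction": 0.5, "cost": 800},
    "add a senior operator": {"risk_reduction": 0.8, "cost": 2500},
}

results = {
    name: expected_net_revenue(p0 * (1 - m["risk_reduction"]),
                               revenue_if_up=10_000, loss_if_fail=5_000,
                               cost=m["cost"])
    for name, m in interventions.items()
}
best = max(results, key=results.get)
print(best, round(results[best]))  # refresher training 7700
```

With these numbers, the moderate intervention wins: the strongest risk reduction is not worth its cost, and doing nothing leaves too much failure risk in place.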
In addition to the cost-benefit analysis, the revenue model can serve in determining the
minimum requirements of various factors. The threshold for making a profit can be
determined and a minimum set of values for the factors exceeding this threshold can be
calculated. For example, the minimum skill level for a new operator can be calculated prior
to certifying him/her to operate a machine. Not achieving this minimum skill level may put
the DM in a situation where, in the absence of a better operator, shutting the machine
down would be the best option.
Figure 3.1 summarizes the aims of this chapter, discussed in this introductory section.
Figure 3.1: General framework of approach discussed in this chapter. [Figure: quantify operator skill; enhance the PHM to include HR covariates and interactions between covariates; develop a revenue model, using the PHM, to trade off risk reduction against cost; compare the outcomes of various decisions and choose the intervention method with the highest net revenue; calculate the minimum level of HR factors that keeps risk below the threshold and the system profitable.]
3.2. Literature Review
There has been much research on assessing human reliability and incorporating it into
the overall risk analysis. But the literature is sparse when it comes to performance
measurement models that incorporate human-related risk assessments. Barroso and
Wilson (1999) consider a manufacturing environment and focus on estimating the overall
effect of human reliability. However, their approach is not risk-based, but rather focuses on
identifying sources of human error and reducing them. Horberry et al. (2010) discuss
human factors and their effects on operations and maintenance in a mining context but do
not attempt failure prediction. Similarly, Kolarik et al. (2004) develop a model to monitor
and predict an operator’s performance using a fuzzy logic-based assessment. But the
purpose of their work is solely to provide a human reliability assessment, without providing
any methods for risk reduction. Blanks (2007) discusses the need for improving reliability
prediction, paying special attention to human error causes and prevention, but does not
mention any predictive techniques for human reliability.
Zimolong and Trimpop (1994), and Dhillon and Liu (2006) focus on the maintenance
workforce performing repair work at times when machines are not being used for
production purposes. Reer (1994) discusses human reliability in emergency situations. Our
discussion focuses on the production workforce during the operation of the machines. Our
emphasis is not on decreasing the Mean Time to Repair but on improving the Mean Time to
Failure. A further distinguishing feature of our work is its proposal for managerial
interventions, or proactive measures, to deal with the risk stemming from the operators in
the human-machine system. Our approach is novel and we have not found any similar and
previously published work in the literature. Vrignat et al., (2012), discuss an approach,
where they draw observations from the process to generate an availability indicator to be
used by a DM to plan actions dynamically. The authors also mention the PHM as a tool. Our
work is also helping the DM to plan actions dynamically. A major difference is that, in our
case, the observations from the process include HR factors. Castanier et al., (2003), discuss
a continuously deteriorating machine where different maintenance operations make sense at various stages. The DM can choose to run as normal, perform a preventive repair, or perform a preventive replacement. Each option has the benefit of improving the system by a certain amount and the cost of taking the system out of production for a certain duration. There is a
certain element of cost-benefit analysis in this work that is similar to ours. But there is no
Failure Risk Analysis
28
mention of human-related factors either. Neither with the work of Vrignat et al., (2012) nor
with Castanier et al., (2003) is there a deeper focus on the specific causes of degradation.
All causes leading to machine degradation are combined together. The cost benefit analysis
is not related to the specific risk culprits.
There are approaches, such as that of Stewart and Grout (2001), with a focus on error-
proofing techniques and physical devices to prevent human mistakes. Burkolter et al. (2009)
propose personnel selection criteria to minimize risk during the operation. Karaulova and
Pribytkova (2009) acknowledge the human role within the overall reliability analysis but
simply make generalized comments, such as better ergonomic design or improved human-
machine interface, as means of risk reduction. Finally, while Blumenfeld and Inman (2009)
note the impact of inferior operator skill on quality and performance, their scope is limited
to the comparison of systems with or without production management devices. Our scope
is different from all of these in that we propose intervention methods to deal with the HR
risk in the short-term planning horizon. In addition, we supply the DM with a cost-benefit
analysis of intervention methods. There is no other work in the literature that enables the
DM to choose the optimal course of action in maximizing revenue by minimizing failure risk
stemming from the operators in a human-machine system.
3.3. Evaluating Intervention Methods for Human-Related Risk
The PHM is a common tool for failure risk analysis and this is especially true when the
PHM is parametrized using the Weibull baseline (Jardine and Buzacott, 1985). One of the
reasons for the frequent usage of the Weibull form of the PHM in this context is the versatility
of the hazard function obtained by varying the scale and the shape parameters (Jardine et
al., 1989). The PHM relates the time of an event, such as failure, to a number of explanatory
variables known as covariates (Lugtigheid, 2004). Several factors, including the equipment
age or specific system characteristics, may influence the equipment’s hazard rate, which is
the rate of transition from a non-failed state to a failed state.
The hazard rate can be affected by factors specific to the machine, the environment, or
the human operators within the human-machine system. Traditionally, the PHM has been
used with quantifiable MR covariates. However, this usage can be expanded by including
non-MR type factors (Centrone et al., 2010; Kiassat and Safaei, 2009).
The general form of the Weibull PHM is defined as follows:

\[ h(t; Z) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta-1} \exp\!\left(\sum_{i=1}^{m}\gamma_i z_i\right). \tag{1} \]
The hazard rate, h(t), is proportional to the (instantaneous) conditional probability of
failure at time t. The first part of the equation, called the baseline hazard function, is
sensitive to the age of the equipment. It contains two parameters, β and η, the shape
parameter and the scale parameter, respectively. In the absence of covariates, scale
parameter provides the characteristic life of approximately the 63rd percentile of failure
data. When the model includes covariates, η becomes a balancing figure.
The second part includes the explanatory variables, z_i, which influence the hazard rate. Each explanatory variable, also called a covariate, represents a monitored condition datum at the time of inspection, t, such as parts per million of iron in the oil sample taken on a particular day, or the skill score of an operator as measured on a certain date. The coefficient γ_i of covariate z_i determines the covariate’s degree of influence on the
overall hazard rate. The fact that the model accommodates the inclusion of HR covariates,
in addition to MR factors, makes the PHM an excellent model for our analysis.
We develop a PHM that includes HR factors as covariates so that we can calculate their
effect on failure risk. This can aid us in the calculation of expected uptime as well as the
probability of failure. Both are essential in developing the revenue model that determines
the optimal intervention method for mitigating risk stemming from the operators.
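To illustrate how Eq. (1) combines MR and HR information, the sketch below computes a Weibull PHM hazard rate for one inspection point. All covariate values and coefficients are hypothetical, chosen only for illustration:

```python
import math

def phm_hazard(t, beta, eta, gammas, covariates):
    """Weibull PHM: h(t;Z) = (beta/eta) * (t/eta)**(beta-1) * exp(sum(gamma_i * z_i))."""
    baseline = (beta / eta) * (t / eta) ** (beta - 1)
    return baseline * math.exp(sum(g * z for g, z in zip(gammas, covariates)))

# Hypothetical values: iron ppm in an oil sample (MR) and an operator skill score (HR).
iron_ppm, skill = 40.0, 70.0
z = [iron_ppm, skill, iron_ppm * skill]   # third covariate: an interaction term
g = [0.02, -0.01, 0.0001]                 # illustrative coefficients only
print(phm_hazard(500.0, 2.1, 1200.0, g, z))
```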
The PHM’s general form in Eq. (1) can be expressed differently (Eq. 1a). The PHM covariates have a general form and can be the original variables of a measurement, or a function of the original variables, including their interactions:

\[ h(t; Z) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta-1}\exp(\boldsymbol{\gamma}\cdot\mathbf{Z}), \tag{1a} \]

where

\[ \boldsymbol{\gamma}\cdot\mathbf{Z} = \sum_{i=1}^{m}\gamma_i z_i, \qquad z_i = f_i(x_1,\ldots,x_n); \]

X = (x_1, …, x_n) represents the vector of the original variables and γ = (γ_1, …, γ_m) represents the covariate coefficients. The vector dimension, n, may be different from the number of covariates, m, because there may be additional interaction terms. In some cases, a covariate may represent an original variable on its own, such as z_1 = x_1. In other cases, a covariate may represent an interaction term, such as z_3 = x_1 x_2.
3.4. Developing an Evaluation Model for Intervention Methods
We can now proceed to develop a general mathematical model, showing expected risks,
costs, and revenues for the next planning horizon, and express it as a revenue function. It
can be maximized to provide the optimal course of action, resulting in maximum
profitability of the system. The PHM discussed in Section 3.3, and in particular, its covariate
portion, will be a fundamental part of the revenue function to be developed.
There are a few boundary conditions for our approach. We have a number of people
operating several machines; examples are operators of machines in a manufacturing
environment or drivers of haul trucks in a mining operation. The machines are not 100%
automated and random failures may stem from mechanical/electrical issues as well as
human error. A machine’s probability of failure due to MR and HR factors can be measured,
using the PHM or other appropriate failure analysis methods. Failure cause and downtime
duration can be captured, as well as people’s specific involvement with each machine.
Following the failures caused by the operators, repairs are expected to bring the machine
back to an as-good-as-new state (Gasmi et al., 2003). This assumption serves the purposes of this chapter and is made to simplify the discussion. Furthermore, there is a DM in the
system who can intervene by using our proposed method to find the optimal trade-off
between risk and revenue.
The general form of the net revenue function associated with the proposed method is as follows:

\[ NR_q(t,s;Z,A) = \text{Revenue} - \text{Cost} = r_q\,U - C_F\,\delta - c_q(A), \]

where

NR_q(t, s; Z, A): net revenue over the time interval [t, t+s], where s represents the length of the interval and t represents the machine age since the most recent repair; s is the length of the planning horizon;

r_q: revenue scheme, per unit time, specific to intervention method q, for q = 1, …, p;

p: number of intervention methods;

U: uptime over the interval, U = min(T − t, s);

T: random variable representing the failure time, measured from the most recent repair;

δ: failure indicator, with δ = 1 if t < T ≤ t + s and δ = 0 otherwise;

C_F: repair and maintenance cost associated with failure;

c_q(A): cost function of intervention method q.
The net revenue function depends on three factors: (1) the machine’s survival up to the
beginning of the planning horizon; (2) the set of PHM covariates, Z; and (3) the matrix of
decision variables, A, resulting from the available intervention methods. The dependency
on these three factors will be explained next.
The various decision variables affect one or more covariates. Their effects can be expressed as the following matrix:

\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{k1} & a_{k2} & \cdots & a_{km} \end{bmatrix}, \]
where the entity a_ij is the effect of the ith decision variable on the jth covariate. The number
of decision variables depends on the number of intervention methods available to the DM
and is independent of the number of original variables.
We assume that the overall effect of the different decision variables on each covariate is additive. For each covariate, z_j, the cumulative effect of the various decision variables is represented by Δz_j = a_1j + a_2j + … + a_kj.
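The additivity assumption can be sketched with a small, hypothetical effect matrix; the cumulative effect on each covariate is simply a column sum:

```python
# Hypothetical effect matrix A: rows = decision variables, columns = covariates.
# Entry a[i][j] is the effect of decision variable i on covariate j; the
# cumulative effect on covariate j is the column sum, assumed additive.
A = [
    [-5.0, 0.0, 2.0],   # e.g. extra training: lowers covariate 1, raises covariate 3
    [ 0.0, 1.5, 0.0],   # e.g. schedule change: affects covariate 2 only
]
delta_z = [sum(row[j] for row in A) for j in range(len(A[0]))]
print(delta_z)  # [-5.0, 1.5, 2.0]
```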
The function, ψ, is an update to the covariate part of Eq. (1a). It is a function of the covariates and the decision variables and takes the following general form:

\[ \psi(Z, A) = \sum_{j=1}^{m}\gamma_j\,(z_j + \Delta z_j) + a_0, \tag{2} \]

where a_0 is the type of decision variable that affects the hazard rate as a whole, not any individual covariate(s) directly.

After replacing the covariate part of Eq. (1a) with the extended form of ψ(Z, A), the final form of Eq. (1a) is as follows:

\[ h(t; Z, A) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta-1}\exp\!\left(\psi(Z, A)\right). \tag{3} \]
We can now calculate the conditional expected value of the net revenue function:

\[ E\!\left[NR_q(t,s;Z,A)\mid T>t\right] = r_q\,E\!\left[\min(T-t,\,s)\mid T>t\right] - C_F\,P\!\left(T\le t+s\mid T>t\right) - c_q(A). \]

In general, the expected value of a non-negative random variable X is calculated as \( E[X]=\int_0^{\infty} x f(x)\,dx \); but in the case of \( \min(X,a) \), the value has an upper bound of a. Therefore, \( E[\min(X,a)] \) can be considered in two parts: a continuous part when \( X<a \), giving us \( \int_0^a x f(x)\,dx \), and a discrete part when \( X\ge a \), giving us \( a\,P(X\ge a) \). As a result, \( E[\min(X,a)] = \int_0^a x f(x)\,dx + a\,P(X\ge a) \). By applying integration by parts to the previous formula, we arrive at \( E[\min(X,a)] = \int_0^a R(x)\,dx \), where R is the reliability (survival) function. We apply this concept to \( \min(T-t,\,s) \) to express the previous equation as the following:

\[ E\!\left[NR_q(t,s;Z,A)\mid T>t\right] = r_q\int_t^{t+s} R(u\mid t)\,du - C_F\left(1-R(t+s\mid t)\right) - c_q(A), \tag{4} \]

where \( R(u\mid t) \) is the conditional reliability:

\[ R(u\mid t) = P(T>u\mid T>t) = \exp\!\left(-\int_t^{u} h(x;Z,A)\,dx\right). \]
The variable u is introduced to represent the time of failure and to emphasize that R(u|t) is a function of the upper bound of the interval [t, u].
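The identity E[min(X, a)] = ∫₀ᵃ R(x) dx used above can be verified numerically. The sketch below, with illustrative Weibull parameters, compares a midpoint-rule evaluation of the integral against a Monte Carlo estimate of E[min(X, a)]:

```python
import math, random

beta, eta, a = 2.0, 10.0, 8.0
R = lambda x: math.exp(-((x / eta) ** beta))  # Weibull survival function

# Right-hand side: integral of R over [0, a] via the midpoint rule.
n = 100_000
integral = sum(R((k + 0.5) * a / n) for k in range(n)) * a / n

# Left-hand side: Monte Carlo estimate of E[min(X, a)] by inverse-transform sampling.
random.seed(1)
draws = (eta * (-math.log(1.0 - random.random())) ** (1.0 / beta) for _ in range(200_000))
mc = sum(min(x, a) for x in draws) / 200_000

print(round(integral, 3), round(mc, 3))  # the two estimates agree closely
```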
We make a simplifying assumption here to help us with the calculations. Each z_i changes over time; but since we consider short intervals, we can treat each covariate as a step function, thus constant over the short interval [t, t+s]:

\[ z_i(u) = z_i(t), \quad \forall\, u \in [t, t+s]. \]

For the baseline cumulative hazard,

\[ H_0(t) = \int_0^t h_0(x)\,dx = \left(\frac{t}{\eta}\right)^{\beta}, \]

and

\[ R(u\mid t) = \exp\!\left(-\left[H_0(u)-H_0(t)\right]e^{\psi(Z,A)}\right) = \exp\!\left(-\left[\left(\frac{u}{\eta}\right)^{\beta}-\left(\frac{t}{\eta}\right)^{\beta}\right]e^{\psi(Z,A)}\right), \]

and, given ψ(Z, A) and R(u|t) above, Eq. (4) is expressed as follows:

\[ E\!\left[NR_q(t,s;Z,A)\mid T>t\right] = r_q\int_t^{t+s}\exp\!\left(-\left[\left(\frac{u}{\eta}\right)^{\beta}-\left(\frac{t}{\eta}\right)^{\beta}\right]e^{\psi}\right)du - C_F\left(1-\exp\!\left(-\left[\left(\frac{t+s}{\eta}\right)^{\beta}-\left(\frac{t}{\eta}\right)^{\beta}\right]e^{\psi}\right)\right) - c_q(A). \tag{5} \]
The expected uptime and the probability of failure are functions of Z and A.
Eq. (5) is the expected net revenue function and includes the revenue scheme, as well as
the cost, associated with each intervention method. A DM can use this function as a cost-
benefit analysis to choose the intervention method that results in the highest net revenue
for the system.
It should be noted that in cases where the PHM only uses MR factors, we can calculate
an optimal threshold for the hazard rate based on the expected cost of preventive
replacement, expected cost of failure replacement, and expected cycle length (Vlok et al.,
2002). When the hazard rate exceeds this pre-determined threshold, it alerts the
maintenance decision-maker to take proactive steps, as there may be a high probability of a
functional failure. The nature of these proactive steps depends on whether the risk is
related to the machine hardware or the person operating it. In the decision model we
propose in this chapter, we do not use a threshold. We use a cost benefit analysis that
includes a “do nothing” decision, among others. If the DM’s intervention is beneficial to the
system, it gets implemented. If the risk is low and insignificant, the “do-nothing” approach
is warranted.
3.5. Determining the Minimum Skill Level
Working with Eq. (5), we can determine the minimum skill levels necessary for the
operators if we are to avoid planning horizons with non-positive net revenues.
From Eq. (5), we can see that when ψ(Z, A) increases, the expected net revenue decreases. More risk results in a greater number of failures, in turn leading to less net revenue. There should be a threshold risk level, ψ_0, above which we are no longer profitable (Figure 3.2): if ψ(Z, A) > ψ_0, then E[NR] < 0. The first challenge is to find ψ_0.
Figure 3.2: Relationship between the function ψ(Z, A) and expected net revenue
First challenge: finding the profit threshold

We consider the case E[NR] = 0 and set Eq. (5) to zero to calculate ψ_0. Since we are considering the expected profit for a fixed set of decision variables, we denote the revenue and the direct cost functions as r and c, respectively. We use the original definition of H_0 for the baseline cumulative hazard:

\[ r\int_t^{t+s}\exp\!\left(-\left[H_0(u)-H_0(t)\right]e^{\psi_0}\right)du - C_F\left(1-\exp\!\left(-\left[H_0(t+s)-H_0(t)\right]e^{\psi_0}\right)\right) - c = 0. \tag{6} \]

This analysis can take two different scenarios. We first consider the scenario where there are no direct costs for the decision variables, c = 0. The non-zero scenario will follow. Setting c = 0 in Eq. (6) and dividing through by r gives

\[ \int_t^{t+s}\exp\!\left(-\left[H_0(u)-H_0(t)\right]e^{\psi_0}\right)du = \frac{C_F}{r}\left(1-\exp\!\left(-\left[H_0(t+s)-H_0(t)\right]e^{\psi_0}\right)\right). \tag{7} \]
Let

\[ g_1(\psi) = \int_t^{t+s}\exp\!\left(-\left[H_0(u)-H_0(t)\right]e^{\psi}\right)du, \]

and

\[ g_2(\psi) = \frac{C_F}{r}\left(1-\exp\!\left(-\left[H_0(t+s)-H_0(t)\right]e^{\psi}\right)\right). \]

One can think of g_1 as the normalized benefit of an initiative. Similarly, one can think of g_2 as the normalized cost of an initiative.

The term e^ψ can be treated as a variable; it is the risk value for a given scenario. The other terms, t, s, C_F, and r, are constants.

The function g_1 starts at s when ψ → −∞, that is, e^ψ → 0, and has an asymptotic lower bound at zero when ψ → ∞, that is, e^ψ → ∞. Negative values of e^ψ are not considered, as the function represents risk. g_1 is a monotonically decreasing function as e^ψ increases.

g_2 starts at zero when e^ψ → 0 and has an asymptotic upper bound at C_F/r as e^ψ → ∞. It is a monotonically increasing function as e^ψ increases.

The monotone nature of the two functions, and the fact that the lower bound of one function is below the upper bound of the other, ensure a unique solution for ψ_0. Figure 3.3 shows the relationship between g_1 and g_2. Risk values less than ψ_0 result in positive net revenue.
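Because one side of Eq. (7) is monotonically decreasing and the other monotonically increasing, ψ_0 can be located by simple bisection. The following is a minimal numerical sketch with illustrative parameter values and no direct costs (c = 0):

```python
import math

def psi_threshold(t, s, beta, eta, r, c_fail, lo=-10.0, hi=10.0, n=4000):
    """Bisection for psi_0 solving g1(psi) = g2(psi), i.e. zero expected net revenue (c = 0)."""
    H = lambda u: (u / eta) ** beta
    def g1(psi):  # normalized benefit: expected uptime over [t, t+s]
        e = math.exp(psi)
        return sum(math.exp(-(H(t + (k + 0.5) * s / n) - H(t)) * e) for k in range(n)) * s / n
    def g2(psi):  # normalized cost: (C_F / r) * P(failure in the interval)
        return (c_fail / r) * (1.0 - math.exp(-(H(t + s) - H(t)) * math.exp(psi)))
    f = lambda psi: g1(psi) - g2(psi)
    for _ in range(60):  # f is monotone decreasing: positive at lo, negative at hi
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
    return 0.5 * (lo + hi)

psi0 = psi_threshold(t=400.0, s=120.0, beta=1.8, eta=900.0, r=300.0, c_fail=15_000.0)
print(round(psi0, 3))  # risk values below psi0 keep the horizon profitable
```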
Figure 3.3: Relationship between g_1(ψ) and g_2(ψ), the left and right sides of Eq. (7)
We now consider the scenario of having direct costs associated with the decision variables, c > 0.

The function g_1 is unchanged from scenario 1. The function g_2, when c > 0, is equal to

\[ g_2(\psi) = \frac{C_F}{r}\left(1-\exp\!\left(-\left[H_0(t+s)-H_0(t)\right]e^{\psi}\right)\right) + \frac{c}{r}. \]

This leads to two cases, depending on the relative size of the two terms. As e^ψ → ∞, g_2 approaches its asymptotic upper bound (C_F + c)/r; this limit does not play a role in finding a unique solution for ψ_0. What matters is the starting value of g_2, namely c/r, relative to the starting value of g_1, namely s.

o Case 1: c/r ≥ s. This results in no solution for ψ_0, as shown in Figure 3.4a. We can interpret this as a very large direct cost of intervention methods, one that overshadows the benefits. As these intervention methods will be very costly, they will not be adopted by the DM.

o Case 2: c/r < s. This ensures a unique solution for ψ_0, as displayed in Figure 3.4b.
3.4a: Case 1: no solution exists for ψ_0        3.4b: Case 2: unique solution exists for ψ_0
Figure 3.4: Relationship between the terms of Eq. (7)
Second challenge: finding the minimum skill level
Thus far, we have determined the threshold ψ_0 below which the expected net revenue function is positive. Going a step further, we now determine the minimum skill level that ensures we do not exceed that threshold and that the system is profitable.
If we require E[NR] > 0, we need to have ψ(Z, A) < ψ_0, or

\[ \sum_{j=1}^{m}\gamma_j\,(z_j+\Delta z_j) < \psi_0 - a_0. \tag{8} \]

It should be noted that even though a_0 is related to the decision variables, it does not, by definition, directly impact any variables or covariates. Since we are interested in solving inequality 8 for a specific variable, such as analytical skill, we keep a_0 on the right side.

Consider an example where we are only interested in the variables x2 and x3. Let us also assume there is a total of four significant covariates in the PHM and that only the first three contain our variables of interest, x2 and x3. Inequality 8 then reduces to an inequality that is linear in x2 and x3.
If we express the above inequality in the form c2 x2 + c3 x3 ≤ b, collecting the constants into c2, c3, and b, then Figure 3.5 shows the influence of x2 and x3 on the expected net revenue. Points falling on the line c2 x2 + c3 x3 = b represent zero expected profit.
Figure 3.5: Combinations of variables of interest, impacting expected net revenue
If we assume x2 and x3 to be the type of variables whose improvement affects risk negatively, for example, analytical skill, then the coefficients of x2 and x3 in the reduced inequality are both negative. This will result in the swapping of the positive and negative regions in Figure 3.5.
If we are interested in only one variable, the above diagram changes from a region of
interest to a single value (Figure 3.6). Consider the above example, but let us assume we are
only interested in x2.
If we express this inequality as c2 x2 ≤ b, collecting the constants into c2 and b, then if c2 > 0, we require x2 ≤ b/c2; if c2 < 0, we require x2 ≥ b/c2.
Figure 3.6: Value of the variable of interest required to result in profit
Therefore, whether we are interested in a single variable, or several variables, we can
use the mathematical model we have developed to determine a minimum level for the
factor(s) of interest to ensure the system would always be in a profitable state.
3.6. Failure Risk Analysis – An Empirical Study
The facts of the case study as they pertain to the quantification process are discussed
next. There is an automotive manufacturing company in Ontario, Canada, which has been in
business for at least the last six decades. This manufacturing plant has traditionally
produced either automobile engines or components for engines or transmissions. As a
result, the hourly workforce, hereafter referred to as operators, have either worked on the
engine assembly lines or have been involved with machining processes. The work on the
engine assembly line is manual, fast, and repetitive. Once the operator has mastered the
sequence of tasks, there is very little cognitive work or decision making involved. In
contrast, machining work is slower and much more cognitive. The majority of the operator’s
tasks involve periodically gauging the product at various stages of the process and making
offsets to the machines accordingly. The operator interfaces with the machine where tasks
such as calculating the amount of offset and entering the value can be highly cognitive.
Within this manufacturing plant, a new department has started its operations. We shall
refer to this department as Alpha and it will be the focal point of our analysis. This
department produces four types of precision gears, Driven, Drive, Pinion, and Ring, on four
independent machining lines. The flow of one machining line does not affect the other
three. The products are shipped to another plant to be assembled into vehicle
transmissions. The four gears produced at Alpha are among the most complicated gears of
the entire transmission; the manufacturing tolerances are extremely tight. To date, the
manufacturing plant’s bulk of activities have been in engine manufacturing; it has had very
little experience with gear manufacturing. Therefore, most operators have not had any
exposure to the general area of gear manufacturing in their previous years of experience
with the company.
All the machines in Alpha are new, recent purchases. The focal point of our analysis is one type of machine at Alpha, which we shall refer to as Kappa. There are four
Kappa machines, one for each product line. The Kappa machines are used to grind and hone
the surface of the gear teeth to the right dimensions, with extremely tight tolerances. The
four Kappa machines are almost identical; the only difference is the external tooling for the
gear they produce. These machines are far more complex than all other machines used in
the department. This complexity, along with the nearly 100% utilization rate of the Kappa
machines, translates to a much higher occurrence of breakdowns, compared to the other
types of machines used in Alpha. The Mean Time Between Failure (MTBF) of Kappa
machines is low, and their Mean Time To Repair (MTTR) is high. For example, for the Ring
machine, the MTBF and the MTTR are 132.3 and 7.3 hours, respectively. Given the fact that
the normal production week consists of 120 hours, with the Kappa machine running 24 hours per day Monday through Friday, and that each shift is 8 hours, the breakdown figures are almost as bad as losing one shift each week.
Once the installation process was complete, staffing was started. The jobs were posted
and all operators within the plant were eligible to apply. No personnel selection process
was used due to union rules. The staffing was done gradually over several months as the
production output was ramping up. The plant currently employs about 3,000 hourly
employees. Due to layoffs over the last few years, the operator with the least amount of
seniority has about 18 years of experience with the company.
For the hourly workforce, the final staffing roster at Alpha consists of all-male personnel,
between 45 and 60 years old; all with at least 20 years with the company, but none with
any previous experience with the specific machines used in Alpha. The specifics of their
experiences within the plant differ: one or two operators have had some experience with gear manufacturing; others have had no experience with gear manufacturing but have experience with general machining processes; and finally, some have no machining experience at all, having worked only on engine assembly lines.
When the operators transferred in to Alpha from other departments within the
company, none received any machine-specific training at Alpha. There is an improperly implemented buddy system (Swanson and Sawzin, 1975) in place, whereby each new operator spends two weeks paired with one of the more experienced operators and observes his activities. There are no guidelines for the trainer and there is no
certification for the trainee. After the initial two weeks, the novice operator is assigned his
own machine and is expected to learn the rest of the job on his own. This two-week
duration is constant for all new operators, regardless of their aptitude and learning rate.
There is no internal check on reaching a certain skill level prior to working on the machine
independently. This is an important point that will be discussed later in Section 3.6.6,
scenario 5.
In general, training in the manufacturing environment is more effective when a separate
training facility is available (Bluhm, 2001). In such a setting, the operators can learn by trial
and error without being afraid of causing damage or disruptions. At Alpha, the operators
get trained right on the production machines where the product demand is never reduced.
The operator performing the training is also operating the machine; this means the training is a second priority for the trainer, as he is still expected to meet the production quota. Furthermore, a “buddy system” is a weak form of training, as there is no guarantee that the trainer has a sufficient skill set and is good at transferring knowledge.
These reasons make the training at Alpha inferior to structured on-the-job training (Sisson, 2001). Simply having the knowledge does not qualify one to properly train another.
The products are in high demand and the weekly production quota is high. This is good
news for Alpha as it has led to the department being fully staffed to run three shifts per day,
around the clock. On each shift, there are 4 operators to run the Kappa machines on each of
the four production lines. The operators are assigned to product lines; they do not switch
lines for the duration of our study. This set-up, along with the very low personnel turnover in
the department, means that as time goes on, each operator becomes more familiar with
the product he produces as well as the particular Kappa machine to which he is assigned.
This is likely to lead to an increase in expertise.
One simplifying assumption we make in the case study analysis in this chapter, as well as in the subsequent ones, is that all parts produced can be sold. Therefore, if improved operator
expertise results in lower machine breakdown and a higher production output, the system
gains additional revenue. We take a binary perspective where if a part is produced, it is of
acceptable quality. Parts that are of low quality are considered non-salable and are not
included in the production volume count. Therefore, in this sense, acceptable quality of
parts is incorporated into our data modeling. In some future work, we may consider
different counts for acceptable versus scrap parts produced by an operator and include this
as a factor in risk analysis or production forecasting.
Unlike the assignment to the product lines, the operators are not assigned to shifts.
There is a shift rotation on a weekly basis. Therefore, an operator who is on “Days” this
week will be on “Nights” next week and “Afternoons” the week after, before returning to
“Days” in three weeks. When the operators are on the afternoon or night shift, they may
experience inefficiencies due to disruptions in their Circadian rhythms. Circadian rhythm
refers to the fluctuations of one’s physiological conditions governed by the Earth’s day-and-
night cycle (Wickens et al., 2004).
3.6.1. Quantification of Skill and Shift Work
System experts at Alpha identify the following HR factors as the most significant: operator skill and the effect of shift work, that is, the disruption of the operator’s Circadian rhythm.
1. Machine operator skill: This is the ability to operate the machine and to troubleshoot. We have decided to assess operator skill in terms of three components. To look
at expertise or skill as one variable may be an over-simplification, resulting in the loss
of some information. The study by Blau and Kahn (1996) expresses “skill” in terms of
two components, experience and education. Following their lead in breaking skill
down into components, we express operator skill using the following three
components:
a) Experience level: Similar to the model developed by Blau and Kahn, we adopt the
usage of “experience level” in our analysis. Familiarity with machining processes, in
general, and knowledge in gear manufacturing, in particular, are likely to assist an
operator with his daily activities.
b) Analytical skill: We indirectly adopt another term used by Blau and Kahn:
education. All operators in Alpha have a high-school diploma as the highest level of
formal education. Therefore, it is pointless to have education level as a
distinguishing factor. However, we adopt the idea behind formal education. A
positive result of education is the acquisition of technical or analytical skills. As
these are normally gained through formal education or on-the-job training and learning, we adopt “analytical skill” as the second component of skill.
While experience level and analytical skill are slightly related, analytical skill refers to expertise on specific machines, whereas experience measures familiarity with the general gear manufacturing environment. Argote et al. (1995) mention an interesting study at Lockheed where, even though 2,000 “green” employees were put through a job-specific 4-week training program, productivity suffered greatly as a result of the inexperience of these employees. The authors go on to mention the importance of experience in the general area, in addition to job-specific training, when the tasks are complex.
c) Social interaction: Based on the study of Soller (2001), we can see a correlation
between overall skill and social skills in the form of interaction with peers and/or
supervisors. The interaction is referring to the willingness to ask for help from
others and to offer help to others. Operators can benefit when they share ideas
and give and receive help; help-giving can benefit even high achievers (Blumenfeld et al., 1996). The correlation between operating the machine well and the
operator’s social skills makes intuitive sense. In a hypothetical scenario, two novice
operators have similar experience levels and initial analytical abilities. During the
first few weeks, they are both overwhelmed by their work and are unsure of many
of their machine operation tasks. One operator is shy and would try to
troubleshoot on his own. As such, the machine would likely remain down for a
longer period of time. The second operator is outgoing and can easily ask his more
skilled colleagues for help. Effective collaboration with peers has proven itself a
successful and uniquely powerful learning method (Brown and Palincsar, 1989). It
is likely that the performance of the latter will be better, at least in the initial
stages.
2. Shift work: Various studies have established links between common shift-work decision variables and physiological factors related to fatigue (Knauth, 1996; Kostreva et al., 2002; and Hsie et al., 2009). The shift-work decision variables found to be the
most important in influencing fatigue and Circadian rhythm disruptions are shift
duration, starting time, direction of shift rotation, and distribution of days off.
Disrupted Circadian rhythm may affect operating and/or troubleshooting abilities of
machine operators, thus diminishing performance, increasing the risk of error, and
reducing the detection of anomalies.
It is interesting to note that in addition to the general expectation of the detrimental
effects of shift-work, when the operators of Alpha are interviewed, we find that many
actually enjoy working the night shift. This is because there is only one person around
from management; that person is the production supervisor. The work environment is
relaxed, and there are no interruptions of any kind. Therefore, in the specific context of
Alpha, there are advantages and disadvantages to shift work. The overall effects should
certainly be investigated.
We think of machine operator expertise as a score ranging between 0 and 100, and quantify it using the CIT. Shift work, on the other hand, can be represented by binary variables.
We can enquire whether or not an operator is working on a certain shift, and assign a “1”
for the affirmative answer, and “0” otherwise. In such cases, to differentiate between n
categories, we need n-1 binary variables. In the case of shift work, to capture the presence
of an operator on the day, afternoon, or night shift, we need two binary variables, x1 and x2, where one combination can be set as the baseline; x1 = x2 = 0 represents the day shift, where no adverse effects of Circadian rhythm disruptions are felt by the operators.
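The dummy coding described above can be sketched as a small lookup (the helper function and names are ours, not part of the case study):

```python
# n categories need n-1 binary variables; the day shift is the baseline (0, 0).
SHIFT_DUMMIES = {"day": (0, 0), "afternoon": (1, 0), "night": (0, 1)}

def encode_shift(shift):
    """Return the (x1, x2) dummy variables for an operator's current shift."""
    return SHIFT_DUMMIES[shift]

print(encode_shift("day"))    # (0, 0): baseline, no Circadian disruption term
print(encode_shift("night"))  # (0, 1)
```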
In quantifying operator expertise, we use three tools, namely questionnaires,
observational studies, and self-tests. This method is similar to a study done by Glick et al.
(1986), on the relationship between job characteristics and three attitudinal outcomes:
effort, general satisfaction, and challenge satisfaction. They accomplished this by obtaining
reports from three separate data sources: interviews with job incumbents, card sorts by job incumbents, and observations by trained observers.
As our system experts, we used production supervisors and manufacturing engineers.
The CIT requires observers who are aware of the aims and objectives of a given job and who
see people perform the job on a frequent basis (Butterfield, 2005; Flanagan, 1954).
Production supervisors and manufacturing engineers make up the management level
closest to the operators and can, therefore, be considered to be experts intimately familiar
with this human-machine system.
A production supervisor and a manufacturing engineer filled out questionnaires covering
questions on performance and behaviour prediction. We used another production
supervisor and another manufacturing engineer to perform observational studies on
operators’ technical skills as they relate to performance. During the observational study,
the expert spends two hours with each operator to observe his every move. The study takes
place when the machine is running and when there is an expectation of a tool change on
the machine. In this case, the expert is able to observe the operator during machine
operation, tool change, and product measurement. We use two different sources of
experts, manufacturing engineers and production supervisors, to gain different perspectives on performance, and also to reduce the problem of same-source variance. To avoid, or at least reduce, the problem of common method bias, we have used two sources, and our data collection is repeated twice within a month, performed at different times and locations.
Lastly, we asked the operators to fill out self-tests that cover three areas. The first one is
related to their experience level. It is important to know the knowledge background of the
individuals in terms of what they know about gear manufacturing. The second area of the self-test consists of technical questions, asking very specific questions on the operation of the machines, part quality, or tool change. The final part of the self-test covers some questions
regarding social interaction. These are meant to gauge the likelihood of the operator
seeking advice from others when stuck in an unfamiliar situation.
Operator assessment data include two experts filling out questionnaires, two experts
performing observational studies, and the operator himself taking a technical test. Each of
the expert assessments is repeated twice. Therefore, at each quarter, there are four
questionnaires, four observational studies, and one technical test for each operator. There
are three quarterly assessments done in the duration of our empirical study, making it a
total of 27 assessment points per operator over the nine-month duration of the empirical
study. Given the total number of 12 operators who are assessed in the nine months, we
have a total of 324 operator assessment data points.
Questionnaires, observational studies and self-tests for this case study are designed
specifically for this application but we have followed all the guidelines discussed in Section
2.2. These include the questionnaire/study length, design of the questions, and the types of
responses.
We recognize that a higher skill level does not necessarily translate into a higher
productivity level (Bendoly and Prietula, 2008). Operators may be unmotivated to apply
themselves to the limits of their ability, since working close to the highest skill level may
not be preferable from the operator's perspective. However, skill level does affect
productivity in two ways: 1) by
defining the maximum performance possible, thus moderating the relationship between
effort levels and actual performance, and 2) by moderating the relationship between
intrinsic motivation factors and effort level. Work such as Hancock (1986) in human
factors and Bendoly and Prietula (2008) in operations management discusses an
inverted-U relationship between the level of effort and the desire to apply that effort. The
inverted U shape is warranted by the combination of two effects: 1) stimulation provided by
a higher effort, avoiding boredom and monotony; and 2) consequence of a higher level of
effort, such as discomfort and fatigue. This combination of positive and negative effects
enables us to find the most preferred level of effort at the top of the inverted U shape.
Bendoly and Prietula (2008) state that higher skill levels can affect the inverted U
relationship by shifting up the most desirable effort level. As an operator’s skill is improved,
the most desirable effort level increases.
We do not analyze skill in combination with the important factor of motivation; in
Section 3.7, we identify motivation as an HR factor to consider in future work.
Nevertheless, since we have established that higher skill positively affects operator
productivity, our discussion remains valid: the emphasis on training and higher skill makes
a positive contribution to the productivity of a human-machine system.
3.6.2. Analyzing the risk of failure
A set of operators with varying skill levels and physical tolerances to the disruptions
caused by shift work has been assigned to a set of machines. We aim to model the
performance of these machines, given the effects of skill and shift work. We perform a failure risk analysis
on the machines using the PHM. The first step is to determine if any human-related
covariates are significant and are added to the PHM. If there are significant HR covariates,
we proceed to ascertain the managerial impact. We consider two intervention measures
the manager may take in this context and develop a revenue model that provides a cost-
benefit analysis for each intervention measure.
In developing the PHM, we use a software tool, called EXAKT, designed specifically for
proportional hazards modeling in industrial settings (Jardine et al., 1997; Jardine and
Banjevic, 2005). This software is developed by the Centre for Maintenance Optimization
and Reliability Engineering (C-MORE) to model the data and determine the significance of
the covariates in the development of a PHM.
We want to investigate the important question of whether the HR factors identified by
the experts play a major role in Alpha’s equipment MTBF. The initial observations seem to
indicate a significant shift-to-shift difference in terms of the frequency and duration of
machine failures, resulting in a difference in production output. In addition, some operators
seem to operate the machines better than others. These operators are more in tune with
the machine and can detect abnormalities through sound and/or product measurement
trends. The results of our preliminary analysis on the failure data for the Ring line Kappa
machine and the three operators assigned to it during the first three months of operation
are shown in Table 3.1. There seems to be shift-to-shift and operator-to-operator
differences when we compare the number of failures to the number of shifts worked. There
is quite a difference among the three operators working on this machine. The worst
operator has had three times as many failures as the best one (9 compared to 3). In terms
of shift-to-shift differences, two shifts are identical and much higher than the third shift (7
failures compared to 4). The sample size is small at this point; however, there is evidence to
warrant a closer investigation using more data.
             Shift 1    Shift 2    Shift 3    Total     Percentage
Operator 1   22, 1*     20, 0      20, 2      62, 3     4.8
Operator 2   19, 3      22, 1      21, 2      62, 6     9.7
Operator 3   21, 3      20, 3      22, 3      63, 9     14.3
Total        62, 7      62, 4      63, 7
Percentage   11.3       6.4        11.1
* Total number of shifts, followed by failure occurrences
Table 3.1: Initial Shift-to-Shift and Operator-to-Operator Differences in Failure Occurrences
Based on the historical data and failure times, we take the initial step of obtaining an
age-based hazard function for the Kappa machines. We analyze the first three months of
data, where there are 187 shifts, 18 failures, and 2 calendar (also called administrative)
suspensions (one due to a production stoppage, and another one at the end of the data
collection period). The full dataset, including the segment used for this initial analysis, as
well as the upcoming analysis for the rest of Chapter 3, and subsequent Chapters, appear in
Appendix G. We also provide a description of the various usages of the data set for the
various parts of this dissertation.
Failure is a consequence of any operator-related mistake that takes the machine out of
production. The failure mode is not captured beyond this. We treat the product line as a
control variable. For the sake of simplicity, we are showing the model for only one of
Alpha’s four product lines, Ring gear. This will sufficiently demonstrate the points we are
making about the inclusion of HR factors in the PHM and the managerial impact
discussion that follows.
The initial model focuses on the age of the machine and contains no covariates:
h(t) = (0.9761 / 123.505) (t / 123.505)^(−0.0239), (9)
where time is measured in hours since the last restart. The working age of the machine is
reset to zero after each failure that takes the machine out of production.
The shape parameter, β, in Eq. (9) is not significantly different from 1. This is confirmed
by β’s p-value as shown in Table 3.2. This, in turn, tells us the machine is currently
experiencing a constant hazard rate. Therefore, there is no evidence that the failures are
age-dependent. We are analyzing a complex system as a whole, not its individual
components. Since many failure modes can play a role, and assuming none is dominant,
any single repair or component replacement will not affect the failure risk of the entire
system.
The other parameter of this age-based model, the scale parameter, η, provides the
characteristic life of about 123 hours for this machine. This is the time at which the
probability of failure is 63.2%.
Parameter Estimate Standard Error P-Value
Scale 123.505 31.54
Shape 0.9761 0.1763 0.89
Hypothesis: Shape parameter = 1 not rejected, based on 5% significance level
Table 3.2: Summary of Estimated Parameters
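As a quick numerical illustration (not part of the EXAKT output), the hazard and failure probability implied by Eq. (9) can be sketched as follows; the function names are ours:

```python
import math

BETA = 0.9761   # shape parameter estimate from Table 3.2
ETA = 123.505   # scale parameter estimate (hours)

def weibull_hazard(t, beta=BETA, eta=ETA):
    """Hazard rate h(t) = (beta/eta) * (t/eta)**(beta - 1)."""
    return (beta / eta) * (t / eta) ** (beta - 1)

def weibull_cdf(t, beta=BETA, eta=ETA):
    """Probability of failure by time t: F(t) = 1 - exp(-(t/eta)**beta)."""
    return 1.0 - math.exp(-((t / eta) ** beta))

# With beta close to 1, the hazard is nearly constant, and the probability
# of failure at the characteristic life eta is 1 - 1/e, i.e. about 63.2%.
print(round(weibull_cdf(ETA), 3))  # -> 0.632
```

The printed value reproduces the 63.2% failure probability at the characteristic life quoted above.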
The above analysis is repeated in EXAKT while fixing the value of the shape parameter β
to 1. In this case, the estimated scale parameter, in hours, represents the mean life of the
machine.
Since the failures are random and follow an exponential distribution, in the absence of
covariates, the model cannot help with decision making. We will enhance this analysis by
developing the hazard function into a PHM that fits the data (verified by a significant K-S
test) and with significant covariates. We will then be in a position to use the PHM to predict
failures, aiding the decision maker in enhancing maintenance activities.
We analyze the historical failure data captured for all shifts over the last nine months
on a per-machine basis. Data from various Kappa machines are not aggregated and are for
the Ring gear line only. However, to investigate the effect of HR factors, we aggregate
same-machine data for the assigned operators. The data set has 966 records, consisting of
53 failures and 4 calendar suspensions. We match the shift failure data, and whether or
not each failure was human-related, with operator attendance records. As a result, the data
include the working age of the machine, whether or not a failure occurred, a shift identifier,
and the operator's three skill component scores.
The first two input variables as potential covariates are the two indicator variables,
(X1,X2), representing the three shifts. Day shift, afternoon shift, and night shift are
represented by (0,0), (1,0), and (0,1) respectively. The next three input variables, X3, X4, and
X5, are the three skill components. Each is expressed as a score out of 100. Skill assessments
are repeated quarterly, or three times in our nine-month analysis of Alpha, resulting in nine
sets of skill components. For informational purposes, univariate and bivariate statistics on
the three skill components, including mean and standard deviation, as well as correlations,
are shown in Table 3.3. The values shown here are for the same three operators whose
failure data were shown in Table 3.1. However, the values in Table
3.3 are based on the entire nine months of analysis, and are therefore based on 9 sets of
data points for each operator.
                                                           Pearson Correlation Coefficient
Skill Component  Variable  Min    Max     Mean   Std. Dev.  Analytical  Experience  Social
Analytical       X3        19.76  95.85   66.15  23.65      1.00        0.85        -0.33
Experience       X4        14.29  100.00  69.72  27.50                  1.00        -0.71
Social           X5        53.13  88.54   66.05  12.33                              1.00
Table 3.3: Univariate and Bivariate Statistics on Skill Components
We make a simplifying assumption for the skill scores. We acknowledge that each skill
score is unlikely to remain static over time and that on-the-job-learning is likely to have a
positive effect on operator expertise. We assume this dynamic nature to occur in a step-
wise manner. The planning horizon we consider is short-term (one 8-hour shift) and as a
result, assuming a constant skill score throughout the planning horizon makes sense.
Therefore, the covariates in the PHM for this interval are time-independent.
In addition to the five aforementioned variables, we also consider the pair-wise
interactions. There are no established guidelines for the levels of interactions to consider;
in our case, considering pair-wise interactions is sufficient for interpreting the results. In
general, when it comes to considering interactions beyond the second level, the analyst
should make a judgement based on the context. If higher-level interactions make intuitive
sense, they should be considered as well.
To develop our model, we primarily use the backward selection method but complement
it with Akaike's Information Criterion (AIC) to help avoid bias in the model fitting process
(Burnham and Anderson, 2004). The AIC provides the analyst with a trade-off between
accuracy and complexity; the score for each model uses the maximum
likelihood estimator for the model as well as the number of variables included in the final
model. At each stage, we consider all variables with p-values above 0.1 to be potential
candidates for elimination. Initially, this results in a large decision tree. But many of the
possible branches end with the same set of variables. At the end of the process, of the six
unique models, we choose the one with the lowest AIC.
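The selection logic can be sketched as follows; the candidate variable sets and log-likelihood values below are hypothetical placeholders, since the actual values come from the EXAKT fitting runs:

```python
def aic(log_likelihood, n_params):
    """Akaike's Information Criterion: 2k - 2*ln(L)."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical final candidates from the backward-selection tree:
# (frozenset of retained variables, maximized log-likelihood).
candidates = [
    (frozenset({"z1", "z2", "z3", "z4", "z5", "z6"}), -210.4),
    (frozenset({"z1", "z2", "z3", "z4", "z5"}), -214.2),
    (frozenset({"z1", "z2", "z4", "z5", "z6"}), -213.9),
]

# Choose the model with the lowest AIC (best accuracy/complexity trade-off).
best_vars, best_ll = min(candidates, key=lambda m: aic(m[1], len(m[0])))
print(sorted(best_vars))  # -> ['z1', 'z2', 'z3', 'z4', 'z5', 'z6']
```

With these placeholder likelihoods, the full six-covariate model wins, mirroring the final model reported below.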
We obtain the following PHM:
h(t) = (1.043 / 119.3) (t / 119.3)^(0.043) exp(21.35 z1 + 0.0366 z2 − 0.0003 z3 + 0.222 z4 − 0.227 z5 − 0.291 z6), (10)
where t is measured in hours. Table 3.4 shows the EXAKT output resulting in the PHM
estimation.
Covariate     Parameter   Estimate   Standard Error   P-Value
Scale, η                  119.3      60.8
Shape, β                  1.043      0.106            0.69
z1 = X2       γ1          21.35      6.225            < 0.01
z2 = X1·X5    γ2          0.0366     0.0093           < 0.01
z3 = X3·X4    γ3          -0.0003    0.0001           < 0.01
z4 = X2·X3    γ4          0.222      0.0574           < 0.01
z5 = X2·X4    γ5          -0.227     0.060            < 0.01
z6 = X2·X5    γ6          -0.291     0.0845           < 0.01
Hypothesis: Shape = 1 tested, Gamma (cov) = 0 tested, based on 5% significance level
Table 3.4: Summary of Estimated Parameters for Ring gear
In working with the variables, we have not standardized them and therefore, they take
on values of differing magnitudes. For example, z1 represents the night shift and takes on a
value of 1, when we are analyzing the night shift. In contrast, z3 represents the interaction
of two skill components, both of which are expressed as a score out of 100. The coefficients
of z1 and z3 in turn reflect the different magnitudes and compensate for them.
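Using the estimates in Table 3.4, the fitted hazard for a given shift and skill profile can be evaluated with a short sketch (the function and constant names are ours, not EXAKT's):

```python
import math

ETA, BETA = 119.3, 1.043  # scale and shape from Table 3.4
GAMMA = [21.35, 0.0366, -0.0003, 0.222, -0.227, -0.291]  # coefficients of z1..z6

def covariates(x1, x2, x3, x4, x5):
    """Build z1..z6 from shift indicators (x1, x2) and skill scores (x3, x4, x5)."""
    return [x2, x1 * x5, x3 * x4, x2 * x3, x2 * x4, x2 * x5]

def phm_hazard(t, x1, x2, x3, x4, x5):
    """Proportional hazards model: Weibull baseline times exp(gamma . z)."""
    z = covariates(x1, x2, x3, x4, x5)
    baseline = (BETA / ETA) * (t / ETA) ** (BETA - 1)
    return baseline * math.exp(sum(g * zi for g, zi in zip(GAMMA, z)))

# Afternoon shift (x1=1, x2=0), average operator from Table 3.3, machine age 40 h:
rate = phm_hazard(40, 1, 0, 66.15, 69.72, 66.05)
print(rate > 0)  # the fitted hazard is positive
```

Note how the unstandardized covariates enter directly, with the coefficient magnitudes compensating as discussed above.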
A simple hazard function, such as the one expressed earlier in Eq. (9), fits the data set
(confirmed by a non-rejection of a Kolmogorov-Smirnov (K-S) test, p-value 0.41). But since
the failures are random and follow an exponential distribution, in the absence of covariates,
the model cannot help with decision making. We are able to enhance this analysis by using
the PHM to predict failures, aiding the decision maker in enhancing maintenance activities.
The PHM obtained has significant covariates and fits the data; this statement about
model fit is based on the non-rejection of a K-S test (p-value 0.08). This p-value is greater
than the rejection threshold of 5%, although we would prefer to see it higher. In the
discussions that follow in Sections 3.6.6 and 3.6.6.3, and in the equations obtained there
(equations 14, 15, and 16), the p-values are all 0.2 or greater. The same decision-making
process followed in this section is followed in those two sections, but based on a larger
data set and with models that fit better according to K-S tests.
As general commentary on the usage of the K-S test for model fit, this is the most
common test done for the PHM. In using EXAKT to build the model, the software uses the
Wald test to check whether the parameters are significant, and the K-S test to check the
fit of the entire model. This usage of the K-S test is combined with the Cox-Snell residuals
test. Another possible alternative to the K-S test is the Schoenfeld residuals test
(Crowder, 2012).
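As an illustration of the K-S check on Cox-Snell residuals, the statistic can be computed directly; the residuals below are simulated standard-exponential draws, not Alpha's data:

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated Cox-Snell residuals; under a correctly specified PHM these
# residuals are approximately Exp(1) distributed.
residuals = np.sort(rng.exponential(scale=1.0, size=53))

# One-sample K-S statistic against the Exp(1) CDF, F(x) = 1 - exp(-x).
n = residuals.size
cdf = 1.0 - np.exp(-residuals)
ecdf_hi = np.arange(1, n + 1) / n   # empirical CDF just after each point
ecdf_lo = np.arange(0, n) / n       # empirical CDF just before each point
ks_stat = max(np.max(ecdf_hi - cdf), np.max(cdf - ecdf_lo))
print(round(float(ks_stat), 3))
```

A small statistic (equivalently, a large p-value) indicates the fitted model is consistent with the data.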
The validity of the failure prediction by the obtained PHM can be confirmed by using
another tool, for example logistic regression, to achieve the same goal of predicting the risk
of machine failure. This is discussed in more detail in Appendix B; the highlight of this
discussion is that the two methods generally prompt the DM to take similar actions in the
face of operator-related risk.
3.6.3. Odds estimates
Odds estimates are calculated from the ratio of the probability of the occurrence of an
event to the probability of its non-occurrence, p/(1 − p). In our context, odds estimates
can provide a DM with an idea of the degree of labour intensiveness, or the level of
automation, of the system. A system that is highly automated will likely experience a low
impact of human-related risk; therefore, small odds ratios indicate that HR factors likely
do not play a significant role in system performance. In such an automated system, where
machine-related (MR) factors are
the dominant ones, financial resources may be better spent on capital expenditures than
human resource initiatives, such as training programs. The DM will not have to have a large
focus on mitigating the risk stemming from the operators. Therefore, the odds ratios can
serve as a good initial point.
The PHM developed for the system under analysis can be used to calculate the odds for
an operator, given a set of factors, such as a specific shift and a certain machine age. The
PHM is used to calculate the odds for the second operator using the same set of factors but
changing the specific HR factor being analyzed, such as the skill of the second operator. The
odds ratio can then be obtained by calculating the ratio of the odds of the two operators.
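The odds and odds-ratio calculations can be sketched as follows; the numeric example reuses the rounded odds values reported for the worst and best operators below:

```python
def odds(p):
    """Odds of an event: probability of occurrence over non-occurrence."""
    return p / (1.0 - p)

def odds_ratio(p_a, p_b):
    """Odds ratio comparing operator A to operator B."""
    return odds(p_a) / odds(p_b)

# The ratio of two already-computed odds values (as in Table 3.5):
print(round(0.063 / 0.003))  # -> 21
```

The same ratio-of-odds computation underlies the shift-to-shift comparisons as well.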
Odds estimates are calculated for Alpha to provide the DM with an idea on the role of
operator skill as well as the effect of disrupted Circadian Rhythm (shift-work). The odds
estimates are given in Tables 3.5 and 3.6. Considering the day shift and a machine age of
40 hours, we compare the operators with the highest and lowest skill components. The
corresponding odds ratio, comparing the odds of the two operators, is 21. This points to a
large difference: the worst operator is 21 times more likely to cause a failure than the
best operator. In turn, this points to the large role operator skill can play in this system. It
can also point to possible large gains that may be realized as a result of operator training.
Condition                                        Odds    Odds Ratio (worst/best)
Worst operator, skill components (20, 14, 73)    0.063   21
Best operator, skill components (96, 100, 53)    0.003   1
Table 3.5: Odds Estimates
Using the mean skill component values in Table 3.3, we consider the average operator
and calculate odds estimates across the three shifts, keeping the machine working age
constant at 40 hours. The odds ratio between the afternoon and day shifts is about 12.
This value drops to 5 and 2.5 when we compare the afternoon to the night shift, and the
night to the day shift, respectively. The odds and the odds ratios are shown in Table 3.6.
                                             Odds Ratio vs.
Condition                           Odds    Day shift  Afternoon shift  Night shift
Average operator, day shift         0.015   1
Average operator, afternoon shift   0.183   12.2       1
Average operator, night shift       0.038   2.5        0.21             1
Table 3.6: Odds Estimates
On a macro level, the information above is valuable to the DM. The first insight is the
potentially large gain from training programs. The large odds ratio of
0.063/0.003=21 between the worst and the best operator can present itself as an
opportunity to reduce the risk in the system significantly by simple training programs
targeted towards operators at the lower end of the skill spectrum. Another benefit a DM
may gain from the above information is for preliminary planning purposes. The DM may
decide to assign additional maintenance people, or the more skilled maintenance staff, to
the afternoon shift to deal with the additional risk that seems to be present on this shift
compared to the other two. Overall, the large odds ratios for both operator skill and the
effect of shift work tell the DM that HR factors may play a significant role in system
performance.
3.6.4. Evaluation model for intervention methods
We propose two intervention methods for Alpha: reduction of the production rate and
addition of a highly-knowledgeable person (hereafter referred to as a Guide) on shift. An
example of such a Guide would be a company representative for the machine
manufacturer. The two aforementioned intervention methods are not the only HR
intervention methods available to a DM. We use these two approaches as examples
because they may be applicable to many systems including the considered case study.
Given these two intervention methods, the possible scenarios for running the machine are:
(1) running the system at the regular production rate with no Guide; (2) running the system
at a fraction of the regular production rate to provide the operator with a longer decision
time; (3) adding a Guide to assist in proper task completion by the operator; and (4) using
both approaches together, adding a Guide and running the system at a reduced rate.
Each intervention method considered by the DM has a decision variable associated with
it in the mathematical formulation. As an example, the decision variable associated with
adding a Guide is a binary variable that will be equal to 1 when the Guide is present and
zero otherwise. Each value of the decision variable will affect expected uptime and
probability of failure differently.
We follow Eq. (5) to obtain the expected value of Alpha’s revenue function. The terms of
Eq. (5) are updated and defined as follows for Alpha:
a binary variable that determines whether or not we change the production rate, taking
its reference value when the production rate is unchanged;
g: a binary variable that determines whether or not a Guide is present;
r: the production rate, ranging between 0, for a stopped machine, and 1, for the machine
running at the regular rate of 100%;
the operator's production volume per hour;
P: the profit per part; and
the cost of having a Guide on a shift, incurred only when a Guide is present.
For the net revenue part of the function, the expected uptime hours are multiplied by
the production-per-hour of the operator, the profit-per-part, and the production rate.
Production-per-hour is calculated as a moving average of the hourly production of the last
M-shifts of a particular operator. For the second part of the function, an operator’s average
failure cost, CF, is calculated by taking the total cost of all HR failure modes that occur
during this operator’s last M shifts, divided by the total number of failure occurrences
during the same period. The third part of the function is the additional cost of a Guide per
shift.
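These moving-average calculations can be sketched as follows; the operator history below is hypothetical:

```python
def moving_average(values, m):
    """Average of the last m entries (e.g., hourly production per shift)."""
    window = values[-m:]
    return sum(window) / len(window)

def average_failure_cost(costs_last_m_shifts):
    """Average cost per failure: total HR failure cost over the last M shifts
    divided by the number of failure occurrences in the same period."""
    if not costs_last_m_shifts:
        return 0.0
    return sum(costs_last_m_shifts) / len(costs_last_m_shifts)

# Hypothetical operator history:
production = [31, 35, 33, 32, 34]     # hourly production in recent shifts
failures = [4200.0, 3800.0]           # cost of each HR failure in the window
print(moving_average(production, 20))  # window shorter than 20 -> plain mean, 33.0
print(average_failure_cost(failures))  # -> 4000.0
```

Both quantities feed directly into the revenue function's net-revenue and failure-cost terms.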
The ψ function associated with Alpha's revenue function is as follows:

ψ = r^α · exp[γ1 z1 + γ2 z2 + γ3 (1 + k g) z3 + γ4 (1 + k g) z4 + γ5 z5 + γ6 z6]. (11)
Explaining the terms related to the Guide: the covariate values are taken directly from
the decision variable matrix. Two of the terms capture the effect of the second decision
variable, the addition of a Guide, on the third and fourth covariates; the other covariates
are not affected, for reasons discussed shortly. Each of these two terms contains a
multiplier that is a function of two variables: the Guide indicator g, previously defined,
and k, which measures the effect of the presence of a Guide on an operator's skill. When
there is a Guide available for the operator to consult, the operator is likely to commit
fewer mistakes. This has the equivalent effect of the operator having a higher analytical
skill score. The presence of a Guide will not affect the experience level or social
interaction components of skill. As a result, only covariates z3 and z4, which are functions
of analytical skill, are multiplied by this factor. One way to determine the correct value of
k is to start from an initial value based on expert knowledge and then refine it with more
data: shifts with and without Guides can be compared to determine the reduction in
hazard rate, which can be equated to a gain in analytical skill if all else remains fixed.
Explaining the terms related to the production rate: the ψ-function in Eq. (11) also
depends on the first decision variable, which for Alpha is a function of the production
rate, r. The rate r ranges between zero, for a stopped machine, and 1, for running the
machine at the regular rate of 100%. When we consider Eq. (3), the form of this factor
ensures the overall hazard rate is zero when r = 0: a stopped machine has no risk of
failure. A parameter governing this factor determines the size of the reduction in hazard
rate when the production rate is reduced. Empirical evidence can be used to estimate this
parameter; to this end, various production rates can be implemented while keeping other
factors constant. As this approach may be impractical, expensive, or difficult, we may
instead use expert knowledge elicitation, similar to the approach proposed by Zuashkiani
et al. (2009). In this case, a rate is proposed, such as 50%, and the corresponding
reduction in hazard rate is estimated by the experts. The hazard reduction value is then
used to calculate the correct parameter value.
Alpha's general revenue function is expressed in terms of probabilities, expected values,
and the specifics of the PHM obtained. In the following numerical example, we analyze
the two modes of HR intervention; they consist of the combinations of presence or
absence of a Guide (g = 1 or g = 0, respectively) and running production at full speed,
partial speed, or not at all (r = 1, r = 0.5, and r = 0, respectively). This is presented in
Table 3.7. The scenario where we stop the machine and add a Guide is not considered, as
it would never be the optimal course of action. We calculate the revenue for each of the
remaining five cases and choose the one with maximum revenue as the optimal course of
action.
                          Full production rate,   Half production rate,   Stop machine,
                          r = 1                   r = 0.5                 r = 0
Add Guide, g = 1          √                       √                       X
Do not add Guide, g = 0   √                       √                       √
Table 3.7: Combinations of intervention methods to be considered
Various hypothetical sets of inputs are considered in order to show the possibility of
different optimal solutions. All scenarios take place on the afternoon shift, {X1=1, X2=0}. The
machine’s working age is taken to be t=40 hours, and the analysis is done for an 8-hour
shift. The effect of adding a Guide is equivalent to a 50% rise in analytical skill, i.e. k=0.5.
The operator’s production count and failure cost are calculated based on the last 20 shifts
worked.
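The evaluation of the five remaining (r, g) options can be sketched as follows. This is a simplified stand-in for the full PHM-based revenue model: the constant hazard rate and the fractional hazard reductions are hypothetical inputs, not values estimated for Alpha:

```python
import math

SHIFT_HOURS = 8

def expected_revenue(r, g, base_hazard, prod_per_hour, profit_per_part,
                     c_fail, c_guide, cut_rate=0.15, cut_guide=0.30):
    """Expected net revenue for one shift under decision (r, g).

    base_hazard: assumed constant hazard (per hour) at full rate, no Guide.
    cut_rate, cut_guide: assumed fractional hazard reductions (hypothetical).
    """
    if r == 0:
        return 0.0  # stopped machine: no revenue and no failure risk
    h = base_hazard * ((1 - cut_rate) if r < 1 else 1.0) \
                    * ((1 - cut_guide) if g else 1.0)
    p_fail = 1.0 - math.exp(-h * SHIFT_HOURS)
    expected_uptime = p_fail / h  # E[min(T, 8)] for an exponential failure time
    return (expected_uptime * prod_per_hour * profit_per_part * r
            - p_fail * c_fail - g * c_guide)

options = [(1, 0), (0.5, 0), (0, 0), (0.5, 1), (1, 1)]
revenues = {opt: expected_revenue(*opt, base_hazard=0.05, prod_per_hour=33,
                                  profit_per_part=40, c_fail=4000, c_guide=300)
            for opt in options}
best = max(revenues, key=revenues.get)
print(best)  # -> (1, 1): with these hypothetical inputs, adding a Guide wins
```

Changing the cost inputs moves the maximizer among the options, which is exactly the behaviour the scenarios below illustrate.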
Scenario 1: A novice operator with low analytical skill of 40 is scheduled to operate the
machine. Failure cost is calculated to be an average of $4,000 per incident; the operator’s
average hourly production is 33; the cost of adding a Guide is $300 for one shift; profit per
piece is $40; and a 50% reduction in production rate results in a 15% reduction in hazard
rate. As can be seen from Figure 3.7, the optimal course of action is to add a Guide to help
the operator.
Scenario 2: The optimal course of action changes when the cost associated with failure is
halved from $4,000 to $2,000 and the cost of adding a Guide is doubled to $600 for a shift.
All other factors are kept the same as in scenario 1. As can be seen in Figure 3.7, the
optimal course of action is to accept the risk and run at the normal rate, with no Guide.
Scenario 3: Most input parameters are kept the same as scenario 1. The difference is the
presence of an operator with the high analytical score of 90 on shift. His average production
volume is 38 pieces per hour. Contrary to scenario 1, the optimal course of action is to run
at normal rate and add no Guides (Figure 3.7). This scenario shows us the difference a
skilled operator makes. All conditions are kept constant except Analytical skill which has
changed from 40 to 90. In this case, we need no Guide, thus saving $300 for the shift.
Scenario 4: Differences from scenario 1 are two-fold. Firstly, the product has a low
profit-per-piece of $5. Secondly, a 10% production rate reduction results in a 40% reduction
of hazard rate. In this scenario, the optimal course of action is to run at the partial rate of
90% (Figure 3.7).
Figure 3.7: Courses of action under different scenarios. Revenue for each option (r, g):
Scenario 1 (optimal strategy: add a Guide): (1,0) 9199; (0.5,0) 4495; (0,0) 0; (0.5,1) 4512; (1,1) 9447.
Scenario 2 (optimal strategy: run as normal): (1,0) 9442; (0.5,0) 4702; (0,0) 0; (0.5,1) 4337; (1,1) 9294.
Scenario 3 (optimal strategy: run as normal): (1,0) 11064; (0.5,0) 5475; (0,0) 0; (0.5,1) 5288; (1,1) 10984.
Scenario 4 (optimal strategy: reduced production rate): (1,0) 724; (0.9,0) 790; (0,0) 0; (0.9,1) 651; (1,1) 661.
Scenario 5: This is an exaggerated case where the failure cost is set so high ($120,000) as
to make all options produce negative revenue. All other input factors remain the same as
in scenario 1. The optimal decision is to take the machine out of production for a shift
until a more skilled operator is available (Figure 3.8).
Figure 3.8: Optimal Strategy: Stop Machine. Revenue for each option (r, g): (1,0) −9777; (0.5,0) −11682; (0,0) 0; (0.5,1) −5201; (1,1) −2010.
This scenario complements the discussions in Section 3.5. We are in a situation where the
operator's skill is so low that the best course of action is to shut down the machine until a
more skilled operator becomes available, which may not happen until the next operator
reports for duty on the following shift. This translates into expensive capital equipment
sitting idle: fixed costs continue to accrue while no revenue is generated.
As mentioned in the case description, Alpha’s current departmental policy is for a new
operator to be trained for two weeks prior to independently operating a machine. This
arbitrary duration of two weeks has no scientific basis. It is unrealistic to expect one
particular training duration to be appropriate for everyone, regardless of the trainer or the
trainee. It may be more practical to expect the operator to possess a minimum skill set
before being assigned to a machine on his own. This minimum skill level may come from
our mathematical model developed in Section 3.5 so that the company will never run into a
situation such as scenario 5 discussed above. Operator risk should never be high enough to
justify a stopped machine as the best course of action.
Finding the profit threshold
The following analysis comes from Eq. (6), developed in Section 3.5. For the two cases
(r,g) = (1,0) and (r,g) = (0.5,0), the intervention measures only include production rate
changes but not the addition of a Guide; therefore, there are no direct intervention costs
in the revised equation. For the remaining two cases, (r,g) = (0.5,1) and (r,g) = (1,1), the
revised equation, Eq. (12), includes changes to the production rate as well as the cost of
adding a Guide.
We use the values provided in Scenario 5 to calculate the profit threshold for each of the
four cases:
Case 1: (r,g) = (1,0)
The integral has no closed form; as a result, we use numerical methods to solve for the
profit threshold. Due to the monotonic nature of the function, there is only one unique
solution that makes the function equal to zero. Table 3.8 displays the step-wise candidate
values and the resulting function values. We use a simple bracketing technique: a root
lies in the interval (a, b) if f(a) and f(b) have opposite signs. We refine the values until the
function is within three decimal places of zero. The end result is a threshold of 0.11080.
Candidate threshold value   Function value
0.9                         -35
0.1                         0.56
0.2                         -4.6
0.15                        -2
0.12                        -0.5
0.11                        0.04
0.112                       -0.06
0.111                       -0.01
0.1105                      0.016
0.1107                      0.005
0.1108                      0.0004
0.11079                     0.001
0.11081                     -0.00007
Table 3.8: Candidate threshold values and the corresponding function values
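The bracketing procedure in Table 3.8 is essentially bisection; a generic sketch follows, with a hypothetical monotonic function standing in for the profit function:

```python
def bisect_root(f, a, b, tol=1e-4):
    """Find a root of monotonic f in [a, b], where f(a), f(b) differ in sign."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > tol:
        mid = 0.5 * (a + b)
        if fa * f(mid) <= 0:
            b = mid            # root lies in [a, mid]
        else:
            a, fa = mid, f(mid)  # root lies in [mid, b]
    return 0.5 * (a + b)

# Hypothetical monotonically decreasing function with a single root at 0.112:
root = bisect_root(lambda x: 0.56 - 5.0 * x, 0.0, 1.0)
print(round(root, 3))  # -> 0.112
```

Each halving of the bracket mirrors the successive sign-based refinements shown in the table above.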
For the remaining three cases, the only terms that differ from Case 1 are the failure cost
and the intervention cost.
Case 2: (r,g) = (0.5,0). We follow the same numerical procedure described for Case 1 and
obtain a threshold of 0.05739.
Case 3: (r,g) = (0.5,1). Using numerical methods, the threshold is 0.02143.
Case 4: (r,g) = (1,1). Using numerical methods, the threshold is 0.04087.
Table 3.9 summarizes the threshold values for the four cases. The threshold represents
the maximum value the risk function can take before we enter a loss scenario. Therefore,
all else being equal, Case 1 offers the best prospect of running profitably, since it allows
the largest amount of risk to be present; in a way, Case 1 provides the most flexibility.
Case (r,g)   Threshold value
(1,0)        0.11080
(0.5,0)      0.05739
(0.5,1)      0.02143
(1,1)        0.04087
Table 3.9: Threshold values for the four cases
Second challenge: finding the minimum skill level
We have established in Section 3.5 that we can have a positive net revenue when the risk is below the threshold. In our context, we are interested in an operator's skill set, consisting of all three of his skill components, that satisfies the corresponding inequality, where the threshold values come from those calculated for the four cases above. In the context of our aim to determine the minimum operator skill set, only case 1 needs to be analyzed further, because case 1 has the highest threshold value. The minimum skill sets required under the other cases will have to be even higher in order to comply with the lower risk levels those cases allow.
We use the specific model developed for Alpha. The covariates of interest are those that contain the variables analytical skill, experience level, and social interaction. The covariates that contain these three variables are displayed in Table 3.10. In scenario 5, we consider the afternoon shift; the shift indicator variables are set accordingly.
Covariate Covariate definition
Table 3.10: Original variables contained in the covariates
Case 1: (r,g) = (1,0). Taking the natural logarithm of both sides yields inequality (13).
Only those combinations of skill components that satisfy the above inequality will result in positive net revenue. Otherwise, the DM should choose to stop the process to avoid an expected loss. This is displayed graphically in Figure 3.9.
Figure 3.9: Minimum skill set required to achieve positive net revenue
[Plot over the skill-component covariates X3, X4, and X5, showing a "Positive net revenue" region and a "No revenue" region.]
Consider the case of novice operators. When an operator first transfers into the
department, his “experience” score is beyond the control of the DM. It is only dependent
on how much previous exposure the operator has had in the general area of gear
manufacturing. This leaves the DM with two variables, Analytical skill and Social Interaction.
If this particular operator is introverted and is assessed low on his social interaction skill,
then the DM should make sure the operator gets additional technical training to ensure a
high Analytical skill score to offset the other two low scores. The training needs of this
operator would be prioritized over another novice operator who is assessed high on his
social interaction score.
This is a demonstration of the practical aspect of the models we have developed. In
this example, the inclusion of HR factors is shown to directly affect decisions to ensure
profitability of the system in the short term.
3.6.5. Expanded data set, additional factors
In the analysis thus far, we only looked at one of the machining lines within Alpha and
considered the operator skill components as well as shift work. Our data set came from one
machine and included 966 records and 57 events. We can increase the size of our data set
by combining three machining lines. This increases the data set to 3049 records and 130
events. The additional data records reduce the variability of the parameter estimates. Increasing the size of a dataset can, in theory, have disadvantages, but these are computational and apply only to extremely large datasets, which is not the case with ours.
Each machine is treated as a control variable as we use two binary variables to represent
the three machines. In addition, we consider an additional factor: day of the week. We
divide the week into three segments, Monday, Friday, and the rest, and represent these
three with two binary variables. Our thought process here is that the first day may be less
productive and more error-prone than the rest of the week because the operators are just
returning from the weekend and have not gotten into full rhythm yet. This is in line with the
work performed by Williams (2004) where he finds the average time worked per person is
very similar on Tuesdays, Wednesdays, and Thursdays, but lower on Mondays, and lowest
on Fridays. This may be especially true in the case of those operators whose turn it is to
work on the night shift. Whilst the first night can result in the greatest impairment in
performance (Lamond et al, 2004), adaptation of sleep and performance can occur as the
week progresses.
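The coding scheme described above (three categories captured by two binary variables, with the baseline category implied by (0, 0)) can be sketched as follows; treating Tuesday through Thursday as the baseline is our reading of the text.

```python
def encode_week_segment(day):
    """Map a weekday to the two binary variables (V1, V2).

    Baseline (mid-week: Tue/Wed/Thu) -> (0, 0)
    Monday  -> V1 = 1
    Friday  -> V2 = 1
    """
    if day == "Monday":
        return (1, 0)
    if day == "Friday":
        return (0, 1)
    return (0, 0)

# The three machines are handled the same way with a pair (Y1, Y2).
```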
Our thought process for considering the last day of the week on its own is that as the
week goes on, fatigue builds up and by the time an operator gets to the last day of the
week, he/she is more tired and consequently, less effective/productive/alert. The other
thought would have been that the operator on the night shift would get adjusted to the
night shift hours and perform better as the week goes on. Therefore, the nightshift
performance on Fridays would be better than on Mondays.
The combinations of the night shift and the first or last shift of the week are examples of
interesting two-way interaction terms that are considered in this expanded analysis. The
resulting model is as follows:
, (14)
where t is measured in hours. The variables X1 and X2 represent the three shifts; V1 and V2 represent the beginning, middle, and end of the week; and Y1 and Y2 represent the three machines. Table 3.11 adds further detail to Eq. (14).
Covariate                 Parameter  Estimate  Standard Error  P-Value
-                         Scale, η    0.063      0.14             -
-                         Shape, β    1.086      0.08             0.27
z1 = Social               γ1         -0.128      0.04           < 0.01
z2 = Analytical           γ2         -0.138      0.05           < 0.01
z3 = X1                   γ3          2.485      0.47           < 0.01
z4 = X2                   γ4          1.427      0.48           < 0.01
z5 = Y1                   γ5         -1.561      0.34           < 0.01
z6 = Y2                   γ6         -1.813      0.33           < 0.01
z7 = V1                   γ7          6.402      1.56           < 0.01
z8 = Social-Analytical    γ8          0.0017     0.0007           0.014
z9 = X2-V1                γ9          1.77       0.53           < 0.01
z10 = Social-V1           γ10        -0.1009     0.0257         < 0.01

Hypotheses tested at the 5% significance level: Shape = 1; γ (cov) = 0.

Table 3.11: Summary of Estimated Parameters
This model adequately represents the data set; a K-S test of the model fit yields a p-value of 0.2, so the fit is not rejected. This model also has the lowest AIC among the models obtained using the same set of main effects and two-way interactions.
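As a rough illustration of the K-S goodness-of-fit idea, the statistic is the largest vertical gap between the empirical CDF of the failure times and the fitted model's CDF. The data and the exponential fit below are made up for illustration; EXAKT's actual test procedure is not reproduced here.

```python
import math

def ks_statistic(samples, cdf):
    """One-sample K-S statistic: the largest vertical distance between
    the empirical CDF of the samples and the model CDF."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        fx = cdf(x)
        # compare the model CDF with the ECDF just before and at x
        d = max(d, abs(fx - i / n), abs(fx - (i + 1) / n))
    return d

# Illustrative times-to-failure (hours) tested against an exponential fit.
times = [0.4, 1.1, 1.9, 2.5, 3.8, 5.2, 7.0]
rate = len(times) / sum(times)                  # MLE of the exponential rate
d = ks_statistic(times, lambda t: 1 - math.exp(-rate * t))
```

A small value of d (relative to the critical value for the sample size) means the fitted model is not rejected.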
Given the fact that there may be high correlation between some of the variables, such as
that between experience and analytical skill, how do we ensure we do not have a
multicollinearity problem? Once again, as explained with the significance testing of the
individual variables, this is something that is dealt with by EXAKT during the model building
process. Two variables may be highly correlated so that, when both are included, neither appears significant, even though each one on its own is significant; this prompts a judgement call on which variable to exclude. This was the case with Eq. (14), where Experience and Analytical skill
could not be used in the model together as they were highly correlated. We had seen early
signs of this when we had looked at the data for the Ring line (Table 3.3). As we will discuss
next, only one of these two variables could remain in the PHM.
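The multicollinearity discussed above can be quantified with partial correlations: the correlation between two variables after the effect of a third is removed. A minimal NumPy sketch with made-up data (not the thesis's operator scores): regress each variable on the control and correlate the residuals.

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation between x and y, controlling for z:
    regress x on z and y on z, then correlate the residuals."""
    def residuals(a, b):
        # least-squares fit a = c0 + c1*b, return the residuals
        coeffs = np.polyfit(b, a, 1)
        return a - np.polyval(coeffs, b)
    rx, ry = residuals(x, z), residuals(y, z)
    return np.corrcoef(rx, ry)[0, 1]

# Made-up scores: x and y share a common component beyond z, so their
# partial correlation stays high even after controlling for z.
rng = np.random.default_rng(0)
z = rng.normal(size=200)                       # e.g. Social
common = rng.normal(size=200)
x = 0.6 * z + common + 0.3 * rng.normal(size=200)   # e.g. Experience
y = 0.6 * z + common + 0.3 * rng.normal(size=200)   # e.g. Analytical
r = partial_corr(x, y, z)
```

This mirrors the situation described next, where Experience and Analytical remain highly correlated even after controlling for Social.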
3.6.5.1. Discussion on the model’s terms
It is necessary to interpret the presence or absence of the terms in Eq. (14). Among
the nine main effects considered, seven are present and two are not:
o Experience: this variable is absent in the model. It is highly correlated with
Analytical. The partial correlation coefficient is 0.88; the effect of the third variable,
Social, is controlled for. The model can work by including either Experience or
Analytical, but not both. When both are included, neither is found to be significant.
Therefore, we use our expert judgement to include Analytical and exclude
Experience. We believe Analytical to be the more important variable. As an operator
becomes more experienced, he typically gains more knowledge, translating to
Analytical skill, or the ability to operate the machine. This is the general belief
according to the works in the literature (Ash and Levine, 1985; Quinones et al.,
1995). However, this is not always the case, and the effect may diminish over time; more experience does not always translate into higher ability. Some early studies conclude that work experience is not as strong a predictor of job performance
as previously thought (Fiedler, 1970). In a study by Hunter and Hunter (1984), a
correlation factor of 0.18 was found between work experience and job performance.
o Social: This variable is not highly correlated with Analytical or Experience. The partial
correlation coefficients are 0.572 and -0.498 with Analytical and Experience,
respectively. Therefore, its presence in the model is not surprising. The covariate has a negative coefficient, γ1, which makes intuitive sense: the higher the social skill, the lower the risk posed.
o Analytical: This variable is found to be significant, with a negative coefficient. This makes intuitive sense: the higher the analytical skill, the lower the risk posed. Therefore, training programs can reduce the risk posed by operator error or by an inability to operate the machine.
o X1 and X2: These are the two binary variables representing afternoon and night shift.
They are both significant and both have positive coefficients. This indicates both to
be worse (riskier) than day shift. Interestingly, afternoon shift poses more risk than
night shift. As mentioned in Section 3.6.1, when the operators of Alpha have been
interviewed, many actually enjoy working the night shift due to the fact that there
are hardly any members of management present. The work environment is quite
relaxed and there are no interruptions of any kind. The better working conditions
(from the perspective of the operators) may offset some of the adverse effects of
the disrupted circadian rhythm.
o Y1 and Y2: These are the two binary variables representing the Driven and the Drive
machines. They are both significant and both have negative coefficients. Therefore,
the Driven and Drive lines run better than the Ring line.
o V1: This is a binary variable representing “first day of the week or not”. It is found to
be significant, with a positive coefficient. The first day of the week is worse than the
rest of it. An operator’s productivity may be reduced on Mondays as he/she needs
to be reoriented after two days away from the work process. Another possibility is
that operators may lack motivation on a Monday because this day is furthest from
the next available day of rest or leisure (Bryson and Forth, 2007).
o V2: This binary variable, representing “last day of the week or not” is absent in the
model. The original thought behind including this variable was that as the week goes
on, fatigue accumulates and by the last day of the week, the operator is more tired
and consequently, less productive or alert. An opposing thought would have been
the operator on the night shift gets adjusted to the night shift hours and performs
better as the week goes on. Therefore, the nightshift performance on Fridays would
be better than Mondays. Neither one of these thoughts seems to apply to Alpha.
3.6.5.2. Procedure for developing the model
In working with EXAKT to select a final PHM, there are numerous ways one could go
about creating a model. Our first attempt was similar to the approach we took earlier in
Section 3.6.3. We started with 35 variables (9 main effects and 26 two-way interaction
terms) and used the backward selection method for the most part. This approach normally
involves an iterative process where, at each stage, we eliminate the variable with the
largest p-value for its coefficient. We continued this process until we were left with only
significant variables. However, we combined this backward selection process with AIC
where we considered a few different paths and selected the model with the lowest AIC. The
best model we obtained from this approach had 14 covariates and an AIC of 36.4.
In another attempt, we considered grouping the variables as our initial step. We started
analyzing each group on its own, and followed a backward selection process to determine
the significant variables within the group. We then took the significant variables from each
group and combined them into a large list. But this large list was smaller than the 35
variables that we had begun with in the previously described approach. The best model we
were able to obtain from this approach was an improvement over the last model; it has 13
covariates and an AIC of 33.4. Step-by-step details of this procedure are provided in
Appendix C.
The final approach used to obtain the model represented by Eq. (14) involved the
forward selection method. Each main effect was considered on its own first, and only one, V2, was found to be not significant. Experience was found to be significant; however, it is highly correlated with Analytical, and the two cannot be used in the model together. For the reasons mentioned above in Section 3.6.5.1, it was excluded from the model-building process. The interaction terms that did not involve the two eliminated main effects were then considered on their own, and 6 out of 9 were found to be significant. The ones found to be not significant were Analytical-X1, Analytical-X2, and Analytical-V1. We then used our expert
knowledge of the system to decide on the order of importance of these remaining
variables. We used a forward selection model to add the six interaction terms one at a time,
starting with the most important one, considered to be “Social-Analytical”. Three out of the
six interaction terms were found to be significant in the final model. The resulting model is
an improvement over the last model obtained; it has 10 covariates and an AIC of 24.3. The
aforementioned methods are summarized in Table 3.12.
Method 1 (14 covariates, AIC 36.4): Start with all main effects and pairwise interactions; use the backward selection method, complemented with AIC.

Method 2 (13 covariates, AIC 33.4): Place the variables into groups of similarity; within each group, perform backward selection, complemented with AIC; combine the selected variables from every group and apply backward selection, complemented with AIC, once more.

Method 3 (10 covariates, AIC 24.3): Consider each main effect on its own first; from the significant main effects, consider the pairwise interaction terms individually; prioritize the main effects and interaction terms and follow a forward selection method, starting with the most important main effect and ending with the least important interaction term.

Table 3.12: Model Selection Methods and Results
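The backward-selection step used in the first two methods can be sketched generically. `fit_model` below is a hypothetical callback standing in for an EXAKT model fit; it returns the coefficient p-values and the AIC for a given covariate subset, and the toy version here simply hard-codes plausible p-values.

```python
def backward_select(variables, fit_model, alpha=0.05):
    """Iteratively drop the variable with the largest p-value
    until every remaining coefficient is significant."""
    current = list(variables)
    while current:
        pvalues, aic = fit_model(current)
        worst = max(current, key=lambda v: pvalues[v])
        if pvalues[worst] <= alpha:        # all significant: stop
            return current, aic
        current.remove(worst)              # eliminate and refit
    return current, None

# Toy stand-in fit: pretend only 'Social' and 'X1' are significant,
# and report a fake AIC of 2 * (number of covariates).
def fit_model(subset):
    true_p = {"Social": 0.001, "X1": 0.002, "V2": 0.62, "Experience": 0.31}
    return {v: true_p[v] for v in subset}, 2 * len(subset)

selected, aic = backward_select(["Social", "X1", "V2", "Experience"], fit_model)
```

In practice, as the text describes, several elimination paths are explored and the path yielding the lowest AIC is kept.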
As can be seen, we have gone beyond the method described earlier in Section 3.6.2 and found ways to develop a better model. Although the model development in Section 3.6.2 is valid and attempts to drive out some bias, the methods described here result in a model with a lower AIC. Since AIC trades off accuracy against complexity, the lower AIC indicates an improvement over the method we previously used.
3.6.5.3. A more parsimonious PHM
The PHM in Eq. (14) contains 10 covariates. Given the fact that the data set contains 130
“events” which is not a large sample size, and in order to make this a more parsimonious
model, we repeat the analysis, this time ignoring the interactions. The result is the following
PHM, which now contains 7 covariates instead of the 10 appearing in Eq. (14):
, (15)
Table 3.13 adds further detail to this PHM.
Covariate        Parameter  Estimate   P-Value
-                Scale, η    3.029        -
-                Shape, β    1.09         0.2445
z1 = Social      γ1         -0.06766   < 0.01
z2 = Analytical  γ2         -0.02845   < 0.01
z3 = X1          γ3          2.442     < 0.01
z4 = X2          γ4          2.175     < 0.01
z5 = Y1          γ5         -0.8779    < 0.01
z6 = Y2          γ6         -1.374     < 0.01
z7 = V1          γ7          0.9752    < 0.01

Hypotheses tested at the 5% significance level: Shape = 1; γ (cov) = 0.

Table 3.13: Summary of Estimated Parameters
Given the sample size of N=130, one may question the testing of the variables for
significance. This is a difficult question in PH modeling and there is no standard answer to it.
The testing is certainly sensitive to the number of failures and suspensions among the total number of events. Fewer failures result in larger standard errors, which may lead to a variable not being found significant. Even with a low total number of events, or a low number of failures, modeling can still be done, but the low count affects the standard
error of each variable. The testing of the model and the rejection of variables as significant
depends on the sample size and is reflected in the model building process.
In this newly obtained PHM in Eq. (15), the shape parameter is not significantly different from 1. Therefore, we repeat the modeling with the shape parameter fixed at 1. In the resulting PHM, Eq. (16), machine age plays no role; the hazard rate depends only on the level of the covariates.
Table 3.14 adds further detail to the PHM in Eq. (16).
Covariate        Parameter  Estimate   P-Value
-                Scale, η    2.511        -
-                Shape, β    1 (fixed)    -
z1 = Social      γ1         -0.06503   < 0.01
z2 = Analytical  γ2         -0.02855   < 0.01
z3 = X1          γ3          2.49      < 0.01
z4 = X2          γ4          2.21      < 0.01
z5 = Y1          γ5         -0.8652    < 0.01
z6 = Y2          γ6         -1.301     < 0.01
z7 = V1          γ7          0.9767    < 0.01

Hypothesis tested at the 5% significance level: γ (cov) = 0.

Table 3.14: Summary of Estimated Parameters
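With the shape parameter fixed at 1, a Weibull PHM reduces to an age-independent hazard of the form h(z) = (1/η)·exp(Σ γi·zi). Assuming Eq. (16) takes this standard form (the equation itself is not reproduced above), the Table 3.14 estimates give, for example, the relative risk of the afternoon shift versus the day shift:

```python
import math

# Table 3.14 estimates (shape fixed at 1, scale eta = 2.511)
ETA = 2.511
GAMMA = {"Social": -0.06503, "Analytical": -0.02855,
         "X1": 2.49, "X2": 2.21, "Y1": -0.8652, "Y2": -1.301, "V1": 0.9767}

def hazard(z):
    """Age-independent hazard assuming the standard Weibull-PH form
    with shape 1: (1/eta) * exp(gamma . z)."""
    s = sum(GAMMA[k] * v for k, v in z.items())
    return (1.0 / ETA) * math.exp(s)

# Same operator (Social=62, Analytical=64) on the Ring line, mid-week:
day       = {"Social": 62, "Analytical": 64, "X1": 0, "X2": 0,
             "Y1": 0, "Y2": 0, "V1": 0}
afternoon = dict(day, X1=1)
ratio = hazard(afternoon) / hazard(day)   # = exp(2.49): roughly 12x the risk
```

Because the hazard is multiplicative in the covariates, the ratio depends only on the coefficient of the covariate that changes, here γ3 = 2.49.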
Of the original 9 main effects, 7 appear in Eq. (16). This is similar to Eq. (14) which
contained the same 7 main effect variables. Therefore, the discussion on model terms in
Section 3.6.5.1 is still valid. We test the goodness of fit of this model with a K-S test. The p-
value is 0.48 and as a result, the model fit is not rejected. Similar to our work in Section
3.6.2, we compare the results of this PHM with a logistic regression for validation purposes,
the details of which appear in Appendix B. The highlight of this discussion is that there is a
strong correlation between the two approaches and this serves us in our original purpose of
using one approach to validate the other.
3.6.5.4. Revenue Function and Discussion
Given the PHM in Eq. (16), the expected revenue function remains the same as in the preceding sections. However, the function associated with Alpha's revenue is now as follows:
The terms are taken directly from the decision variable matrix. One term represents the effect of the first decision variable, the addition of a Guide, on the second covariate. The presence of a Guide affects only the analytical skill component; as a result, only the covariate containing the original variable Analytical skill is multiplied by that term.
Similar to the preceding Section, we consider various hypothetical sets of inputs to show
the possibility of different optimal solutions. All scenarios take place on the first afternoon shift of the week, so X1 = 1 and V1 = 1. We analyze an eight-hour shift on the Ring machine (Y1 = Y2 = 0) at a given working age. Adding a Guide has an effect equivalent to a 25% rise in analytical skill. The operator's production count and failure cost are calculated based on his last 20 shifts.
Scenario 1: An operator with skill scores of {Analytical=64.39, Experience=72.49,
Social=62.44} is scheduled to operate the machine. Average failure cost is $2,000 per
incident; the operator’s hourly production average is 33; the cost of adding a Guide is
$800 per shift; profit per piece is $40; and a 50% reduction in production rate results in
a 15% reduction in hazard rate. As can be seen in Figure 3.10, the optimal course of action is to run at the regular rate but to add a Guide to help the operator.
Scenario 2: The optimal course of action changes if the profit per piece is $10. All other factors remain the same as in scenario 1. As can be seen in Figure 3.10, the optimal course of action is to accept the risk and run at the normal rate, with no Guide.
Scenario 3: All input parameters remain the same as scenario 2, except the failure cost
which is increased to $7,000 per incident. As Figure 3.10 shows, the optimal decision is
to take the machine out of production for a shift until a more skilled operator is
available.
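The decision logic of the scenarios can be sketched with a simplified stand-in for the thesis's revenue function: expected production profit, minus expected failure cost, minus the Guide cost. The constant hourly hazard and the Guide's assumed 25% hazard reduction below are illustrative numbers, not estimates from the model; the other inputs are Scenario 1's.

```python
import math

# Scenario 1 inputs
PROFIT_PER_PIECE = 40.0
PIECES_PER_HOUR = 33.0
SHIFT_HOURS = 8.0
FAILURE_COST = 2000.0
GUIDE_COST = 800.0

def expected_profit(r, g, base_hazard=0.02):
    """Simplified stand-in for the thesis's revenue function:
    production profit minus expected failure cost minus Guide cost.
    base_hazard is an assumed constant hourly hazard, not an estimate."""
    if r == 0:                     # machine stopped: no revenue, no risk
        return 0.0
    h = base_hazard
    if r == 0.5:
        h *= 0.85                  # 50% rate cut -> 15% hazard reduction
    if g == 1:
        h *= 0.75                  # Guide assumed to cut the hazard by 25%
    p_fail = 1.0 - math.exp(-h * SHIFT_HOURS)
    revenue = r * PIECES_PER_HOUR * SHIFT_HOURS * PROFIT_PER_PIECE
    return revenue - p_fail * FAILURE_COST - g * GUIDE_COST

options = [(1, 0), (0.5, 0), (0, 0), (0.5, 1), (1, 1)]
best = max(options, key=lambda rg: expected_profit(*rg))
```

With these stand-in numbers, the no-intervention option (1, 0) comes out best; the thesis's full revenue function ranks (1, 1) highest for Scenario 1, so this sketch illustrates only the mechanics of the comparison, not the reported result.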
Figure 3.10: Optimal strategy under various conditions. [Three bar charts of profit against the intervention options (r,g) = (1,0), (0.5,0), (0,0), (0.5,1), (1,1); panel titles: "Scenario 1: run at rate, add a Guide", "Scenario 2: Run at rate, no Guide", "Scenario 3: Stop Machine".]
In scenario 3, we are in a situation where the best course of action is to shut down
the machine until a more skilled operator becomes available. This is obviously an
undesirable situation and the following discussion helps us avoid being in such a loss
scenario.
For the two cases (r, g) = (1,0) and (r, g) = (0.5,0), intervention methods include production rate changes but not the addition of a Guide. Therefore, there are no direct intervention costs. For the remaining two cases, (r, g) = (0.5,1) and (r, g) = (1,1), the presence of a Guide introduces an additional cost.
Using Eq. (6) and the values in scenario 3, we calculate the profit threshold for each case:
Case 1: (r, g) = (1,0) results in an integral that has no closed form; as a result, we use numerical methods to solve for the threshold. We perform a step-wise numerical analysis, as before, until the function value is within three decimal points of zero.
Case 2: (r, g) = (0.5,0): Following the same numerical methods procedure described for case 1, we obtain the threshold.
Case 3: (r, g) = (0.5,1): There is no solution for the threshold. This is an example of the no-answer scenario, described in conjunction with Eq. (7) and displayed in Figure 3.3a. Therefore, given the circumstances of the scenario, such as the failure cost and the particular effects of r and g, this shift cannot be profitable, regardless of the operator skill set.
Case 4: (r, g) = (1,1): As in case 3, there is no solution for the threshold.
As established in Section 3.5, we can have a positive net revenue when the risk is below the threshold. We are interested in an operator's skill set (Experience, Analytical, and Social) that satisfies the corresponding inequality, where the threshold values come from those calculated for the four cases above. Case 1 has a higher threshold value than case 2; therefore, the minimum skill set required for case 2 will have to be higher than for case 1. Since it is the minimum skill set that we are interested in calculating, we will not make an effort to calculate it for case 2. Similarly, we ignore cases 3 and 4, since we established that no skill set can yield profitability under the conditions of those cases.
In scenario 3, we consider the first afternoon shift of the week, on the Ring machine; therefore, X1 = 1, V1 = 1, and Y1 = Y2 = 0. The function applicable to the current analysis is as follows, with further details provided in Table 3.15:
Covariates and the variables they represent: Social, Analytical.

Table 3.15: Variables represented by the covariates in the function
Given (r, g) = (1,0) and the case 1 threshold, we reach the following inequality.
Only those combinations of skill components that satisfy the above inequality will result in positive net revenue. Otherwise, the DM will choose to stop the process to avoid an expected loss. An example of a skill set resulting in profit is (80, 80); skill set (80, 70), however, does not result in a profit and would prompt the DM to stop the machine. This is displayed graphically in Figure 3.11. It is also interesting to note that when the analytical score takes on its maximum value of 100, the social score required for the system to be profitable is calculated to be 71. The graph tells us that, regardless of the analytical score, we will not be profitable when the social interaction score is below 71. Similarly, when the social score is at its maximum of 100, the required analytical score is calculated to be 31. This tells us the analytical skill score has to be greater than 31 for the system to be profitable, regardless of the social skill score.
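Because the logged form of the inequality is linear in the skill scores, the profitability boundary in Figure 3.11 is a straight line. Using only the two boundary points reported in the text, (Analytical, Social) = (100, 71) and (31, 100), a feasibility check can be sketched as:

```python
def profitable(analytical, social):
    """Feasibility check against the straight-line boundary through the
    two boundary points reported in the text:
    (Analytical, Social) = (100, 71) and (31, 100)."""
    a1, s1 = 100.0, 71.0
    a2, s2 = 31.0, 100.0
    slope = (s2 - s1) / (a2 - a1)            # roughly -0.42
    min_social = s1 + slope * (analytical - a1)
    return social >= min_social
```

This reproduces the examples in the text: (80, 80) falls in the profitable region and (80, 70) does not.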
Figure 3.11: Minimum skill set required to achieve positive net revenue (regions shown: profitable, "no profit", and infeasible)
The cost of failure in scenario 3 is quite high compared to the profit per part. Therefore,
a high operator skill set is required to make a profit. Given this calculated minimum skill
level, and if none of the other factors, such as the average failure cost, can be improved, the
management at Alpha must follow a regimented training procedure. Upon an initial
assessment of his social interaction, the operator must receive machine-specific training to
have a sufficiently high analytical skill score to enable him to be in the positive-revenue
region. Otherwise, there may be shifts when the machine sits idle due to an expected net
loss.
3.7. Concluding Remarks and Future Work
We have discussed a more comprehensive approach towards the usage of the PHM by
including human-related factors. At times, the human part of the human-machine system
may be a significant source of risk. As such, factors associated with the human operator
must be considered in the reliability analysis. One challenge in this process may be the
quantification of human-related factors. Thus far, the factors we have considered are the
operator’s experience level, social interaction abilities, and analytical skills, as well as the
effect of shift work. We have also analyzed the effect of the day of the week and its
interaction with the shifts.
The bulk of the discussion in chapter 3 focuses on developing a PHM with HR covariates. Once this PHM is successfully obtained, we use it to propose a model that can provide a
decision-maker with a cost-benefit analysis to choose among various intervention methods
to reduce operator-related risk. The proposed revenue model makes use of the
proportional hazards model to estimate the expected machine uptime and the probability
of failure. With this model, the DM can also calculate the risk threshold, below which the
system is profitable, as well as the minimum levels for various human-related factors.
We provide a case study of a manufacturing company and use it to demonstrate the
usage of the revenue model. The factors found to be significant in our model are experience
level, social interaction, analytical skill, shift work, and days of the week. We demonstrate
the decision-making process resulting in the highest profit. Given machine-specific factors,
we calculate minimum levels of operator factors that result in system profitability.
The list of factors we have considered thus far is by no means an all-inclusive list. There
are many other human-related factors that can affect the performance of the operator, and
thereby the system. The inclusion of additional factors, beyond skill, shift work, and day of the week, can be considered as future work; examples are motivation, seasonality, reward systems, and management-employee relationships.
Another valuable future work can be to analyze the combined effects of machine-related
as well as human-related factors in a PHM. The effects may be compared, and the predictive power of the model can be analyzed after the addition of each category of factors. There are no previous works in the literature that have implemented such an analysis. If we do
obtain a PHM that contains both MR- and HR-covariates, the methods of intervention
would be different depending on whether the major source of risk stems from the machine
or the operator. Therefore, there has to be an additional step for risk source identification.
We currently do not distinguish among the failure modes. Performing a certain wrong
task may require a simple reset, causing a ten-minute downtime, whereas performing a
certain wrong sequence may result in physical damage, requiring a lengthy maintenance
and hours of downtime. Therefore, considering different consequences for the various
failure modes would make our analysis more realistic. Another factor that will make our
discussions more realistic is the consideration of production quality. We have taken a black-
and-white perspective on quality: if the operator is slow or causes a machine downtime, a
part is not produced. But if the part is produced, it is of acceptable quality and can be sold.
Our work will be improved if we can incorporate means to distinguish between rapid, error-
prone work and immaculate, quality-first mentality.
Another future work directly following our PHM work discussed in Section 3.6.5 may
include identifying additional HR intervention methods. We have identified two such
intervention methods, adding a Guide and reducing the production rate. These are not the
only intervention methods and many others may be possible, depending on the context. A
further future work may be to develop a deeper focus on improved estimation of the
effects of the intervention methods. This work would include a focus on accurately
predicting the relationship between the production rate change and the hazard rate
reduction. This work can also include discussions on calculating the effect of a Guide on the skill of the operator.
4. OPTIMAL OPERATOR ASSIGNMENT
The human resources of an organization are among its most valuable assets. Optimal
human resource management can make a major contribution to improving the potential
productivity of the organization. In the specific context of operator assignment in skill-
based environments, expertise can be included as a factor whose improvement can
positively affect system performance.
As mentioned in the introductory chapter, even when quantified skill scores are derived
for the operators, a DM may not know how to best use these scores. This chapter provides
a framework for a DM to use operator skill as a decision variable in optimizing operator
assignment. There are four elements to this framework:
1. The first element is to develop the means to forecast the production output, in terms
of HR factors. The method we use is regression analysis.
2. The next element deals with the fact that skill scores are unlikely to be static over the
planning horizon being considered. A particular operator may be the best candidate
for assignment to a certain machine based on current conditions, but not over the
entire planning horizon. This dynamic nature needs to be captured and considered in
the analysis. To do so, we develop learning curves for the operators, based on
historic data.
3. We develop a revenue model and incorporate the learning curves into the
production output forecasting models.
4. We use the results of the revenue model in step 3 in the objective function of a
mathematical programming model which we use for optimal operator assignment.
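The four elements above can be combined into a small end-to-end sketch: forecast each operator's output on each machine over the horizon (a made-up linear forecast with a hypothetical learning curve), then search the assignments for the revenue-maximizing one. All names, numbers, and functional forms are illustrative; the thesis derives them from historic data and a regression model.

```python
from itertools import permutations

# Illustrative skill scores and machine sensitivities (not the thesis data).
operators = {"O1": 60.0, "O2": 75.0, "O3": 90.0}
machine_sensitivity = {"M1": 0.8, "M2": 1.0, "M3": 1.3}  # output per skill point

def learning_curve(skill, months):
    """Hypothetical learning curve: skill grows toward 100 over time."""
    return skill + (100.0 - skill) * (1 - 0.9 ** months)

def forecast_revenue(skill, sensitivity, horizon_months=3):
    """Sum of forecast monthly output over the planning horizon."""
    return sum(sensitivity * learning_curve(skill, m)
               for m in range(1, horizon_months + 1))

def best_assignment():
    """Brute-force search over one-operator-per-machine assignments."""
    machines = list(machine_sensitivity)
    best, best_rev = None, float("-inf")
    for perm in permutations(operators):
        rev = sum(forecast_revenue(operators[o], machine_sensitivity[m])
                  for o, m in zip(perm, machines))
        if rev > best_rev:
            best, best_rev = dict(zip(perm, machines)), rev
    return best, best_rev

assignment, revenue = best_assignment()
```

Brute force is fine for a handful of machines; for larger problems, an assignment solver such as the Hungarian algorithm would replace the exhaustive search inside the same framework.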
Connection to Previous Chapters
In Chapter 3, we analyzed the performance of a system where operators have been
assigned to machines. We know the characteristics of the machines, including their
sensitivity to the various HR-factors. We also know the characteristics of each operator. The
question explored in chapter 3 was the following: given the operator assignments, the
sensitivity of each machine to HR factors, and the characteristics of each operator, how do
we best mitigate the risk of failure, stemming from the operators? This is portrayed in
Figure 4.1 where the arrows represent the assignments of the operators to the machines.
Figure 4.1: Discussions in chapter 3 focused on failure risk analysis, given the machine and operator characteristics and the operator assignments to machines. [Diagram: operators O1…On assigned to machines M1…Mn; machines are the same in operational procedures but differ in technical characteristics; operators have different skills, tolerance to shift-work, and other characteristics.]
Our work in this chapter is similar to Chapter 3 in that we know the characteristics of the
machines, including their sensitivity to the various HR-factors. In this chapter, we want to
assign a group of operators to various machines in order to achieve the best system
performance in terms of maximized revenue. The question explored in this Chapter is in a
way the opposite of the one in Chapter 3: given the sensitivity of each machine to HR
factors, and the characteristics of each operator, how do we best assign the operators to the machines in order to maximize system revenue? This is portrayed in Figure 4.2.
Furthermore, the work presented in Chapter 3 is short-term in nature for decision-making purposes. In its application to a case study, the planning horizon considered is an eight-hour shift. The work in this chapter is for long-term decision-making, and the empirical study that follows at the end of the chapter has three months as its planning horizon.
Figure 4.2: Discussions in chapter 4 focus on optimal operator assignment, given machine and operator characteristics
Main Contribution of this Chapter
We create a methodology to optimally assign operators to machines based on the
sensitivity of the machine to HR factors as well as the operators’ current and forecasted
characteristics.
4.1. Literature review
A literature review shows an abundance of previous work in personnel assignment.
Much of this work focuses on forecasting the human resource requirements in order to
produce a certain amount of output (Kao and Lee, 1996; Philipose, 1993). Yang et al. (2003) focus on determining the level of engineering expertise required to ensure the desired production output. Li and Li (2000) consider staff skill as a factor in their analysis. Service
quality and cost minimization are two objectives in their goal programming approach.
However, both Yang et al. (2003) and Li and Li (2000) have a manpower planning scope,
where they aim to forecast the required number of personnel. Our scope is the optimal assignment of the existing workforce: we focus on maximizing system revenue by using the expertise of the people to optimally assign them to the various jobs.
Wang (2005) divides the operations research techniques applied in workforce planning
into four major categories: Markov chain models, computer simulation models,
optimization models and supply chain management through System Dynamics. He breaks
down the optimization model category into linear programming, goal programming,
dynamic programming, and integer programming. Our approach makes use of the first of these, linear programming. Studies such as Haas et al. (2000) and Feiring (1993) use linear
or integer programming models for optimization in operator assignment problems. Zeng et
al. (2011) consider a manufacturing environment and optimize the operator assignment
using a Pareto utility discrete differential evolution algorithm. However, none has a scope
similar to ours in terms of using learning curves of skill components of the individual
operators. The aforementioned study by Yang et al. (2003) does acknowledge the role of learning curves but assumes their effect to be negligible.
Some studies, such as Malhotra et al. (1993), and Li and Cheng (1994), discuss the role of
learning on system performance. Others, such as Teyarachakul et al. (2011), consider the
role of learning on production scheduling. When it comes to considering the role of learning
on the process of workforce assignment, the literature is sparse. We were unable to find
similar work that directly uses learning curves during an optimization process for workforce
assignment. There are works, for example Nembhard and Norman (2002), followed by
Leopairote (2003), and Vidic (2008), that discuss the effects of work-sharing and job-
rotation on operator learning and forgetting. These studies come closest to our ideas and
approach. But these models aim to optimize operator learning and operator forgetting as a
result of work-sharing and job-rotation. Unlike our scope, the aforementioned works are
not motivated by revenue maximization. Nor do they use learning curves for multiple skill
components of individual operators. Therefore, our proposed methodology is unique in its
scope to enable the DM to optimally assign operators to machines based on current and
forecasted HR factors, with the aim of maximizing system revenue.
4.2. Model Development
We forecast machine output based on the factors affecting the operator working on the
machine. Learning curves are developed for the characteristics of the operator. Operator
factors as well as learning curves are then incorporated into a revenue model that
calculates the expected revenue of an operator, on a given machine, over a planning
horizon. Unlike the work presented in Chapter 3, the discussions in this Chapter are
deterministic. The possibility of machine failure is not explicitly considered but rather
incorporated in the average output. The operator assignment is solely performed based on
the machine sensitivity to HR factors, as well as the current and forecasted HR factors.
Predicting Output in Terms of Human-related Factors
Our objective is to maximize total revenue over the planning horizon. Revenue is directly
proportional to the profit per unit as well as the output of each production unit within the
system. In a manufacturing environment, we assume each machine to be a production unit
and we predict its output as a function of HR factors. The sensitivity of each machine to HR factors, and the characteristics of the operator working on the machine, affect the output, and therefore the revenue.
The first step in building a mathematical model to optimize operator assignment is to
develop a model to forecast the production output. One such model can be obtained using
regression analysis. In the regression equations we obtain, the dependent variable is the
hourly production output.
Operator characteristics, the independent variables in the regression equations, are considered on their own as main effects. But their interactions must also be considered if they make intuitive sense in the context of analysis. Therefore, the regression equations have a first part for variables appearing on their own and a second part for pairwise
interactions. The regression equation has the following general form:
\hat{y}_i = \beta_{i,0} + \sum_{j=1}^{l} \beta_{i,j} x_j + \sum_{j=1}^{l-1} \sum_{k=j+1}^{l} \beta_{i,jk} x_j x_k ,   (1)

where
\hat{y}_i : estimated output of machine i, per unit time, where i = 1, \dots, n ; n : number of machines,
x_j : operator characteristic j,
\beta_{i,j} : coefficients of the main effects, as applicable to machine i,
\beta_{i,jk} : coefficients of the interaction terms, as applicable to machine i,
j and k : indices iterating through the various operator characteristics considered,
j, k = 1, \dots, l, where l is the total number of operator characteristics considered.
It is likely that a model will not have interaction terms past the second order. But if the usage of third-order, or higher, interaction terms is justified, they can be added to the model accordingly. In a case where third-order interactions need to be considered, Eq. (1) is expressed as follows:

\hat{y}_i = \beta_{i,0} + \sum_{j} \beta_{i,j} x_j + \sum_{j<k} \beta_{i,jk} x_j x_k + \sum_{j<k<p} \beta_{i,jkp} x_j x_k x_p .
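As an illustration, a regression of this form can be fitted by ordinary least squares on a design matrix containing the main effects and their pairwise products. The following is a minimal sketch on synthetic data; the coefficient values and variable layout are illustrative assumptions, not the case-study estimates.

```python
import numpy as np
from itertools import combinations

def design_matrix(X):
    """Columns: intercept, main effects, and all pairwise interaction terms."""
    n, l = X.shape
    cols = [np.ones(n)] + [X[:, j] for j in range(l)]
    cols += [X[:, j] * X[:, k] for j, k in combinations(range(l), 2)]
    return np.column_stack(cols)

rng = np.random.default_rng(0)
X = rng.uniform(40, 100, size=(200, 3))        # three operator characteristics
true_beta = np.array([29.9, 0.0, 0.05, 0.09, 0.001, 0.0, 0.0])
y = design_matrix(X) @ true_beta + rng.normal(0, 0.01, 200)  # hourly output

# Least-squares estimates of the main-effect and interaction coefficients
beta_hat, *_ = np.linalg.lstsq(design_matrix(X), y, rcond=None)
```

In practice, terms not significantly different from zero would then be pruned, e.g. by the backward selection used later in the chapter.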
Learning Curves
Since operators learn, their skills can change over time. The differences in learning
among the operators need to be considered in the work assignments. Consider operator A,
who has slightly higher initial skill at time zero, and operator B, who has lower initial skill
but a steeper learning curve. Over the planning horizon, total production by operator B
could be higher than by operator A (Figure 4.3). The area under the curve represents total
production over the planning horizon; therefore, operator B should be assigned to the
higher priority machine for the system to gain the additional production output.
Figure 4.3: Quarterly production of two operators, with different projected learning curves
Revenue Model Using Regression Equations and Learning Curves
Over the course of the planning horizon, the revenue model for each machine must
include the expected hourly production of each operator over the period. This is achieved
by incorporating the learning curves into the regression equations. When using Eq. (1), rather than one input value for variable x_j, we substitute the learning curve for that particular characteristic. By calculating the integral of the resulting equation over the planning horizon, with the learning curves built in as previously described, we effectively sum the production output over the planning horizon.
Output is forecasted over the planning horizon using regression equations which are
linear. But the learning curves embedded in these linear equations are non-linear. Skill
evolves and the evolution is represented by the learning curve equations. But regardless of
the form of the learning curves, they serve to provide the regression equation with one
value of skill at a particular instant in time. Therefore, there is no issue in embedding a non-
linear model within an analysis that is linear as a whole.
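The construction above can be sketched numerically: substitute a power-form learning curve for one characteristic into a hypothetical output equation, sum the hourly output over the horizon, and multiply by the unit price. All coefficients, curve parameters, and skill values below are illustrative assumptions, not the case-study figures.

```python
import numpy as np

def hourly_output(E, S, A):
    # Illustrative regression for one machine (not the case-study equation)
    return 29.9 + 0.086 * A + 0.001 * E * S

def analytical(t, a=10.4, b=0.28):
    # Power-form learning curve for the analytical skill, capped at 100
    return np.minimum(a * t ** b, 100.0)

price = 90.0                 # $ per unit
t = np.arange(1440, 1920)    # one quarter: hours 1440..1919, 1-hour steps
E, S = 85.7, 57.3            # held constant here for simplicity

# Revenue = price times the sum of hourly output over the planning horizon,
# with the analytical skill evolving along its learning curve
revenue = price * hourly_output(E, S, analytical(t)).sum()
```

Replacing the sum with a proper integral, and letting every skill component follow its own curve, gives the full revenue model developed next.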
Over the next planning horizon, [t_0, t_1], the expected revenue, U_{s,i}, of operator s working on machine i is represented by the following:

U_{s,i} = P_i \int_{t_0}^{t_1} \hat{y}_i(t) \, dt ,   (2)

where \hat{y}_i(t) is Eq. (1) evaluated with x_j = L_{s,j}(t),
P_i : sale price of units produced by machine i,
t : time variable within the planning horizon [t_0, t_1],
L_{s,j}(t) : learning curve equation of operator s for characteristic j, a function of t.
4.3. Optimization Model
Revenue is maximized over the next planning horizon by using the revenue model
expressed in Eq. (2) to assign operators to each machine. We take on a mathematical
programming approach within the context of assigning a number of operators to a set of
machines, given a set of criteria and an objective function. We define this objective function
as the total revenue over the period and our aim is to maximize it. As previously stated, we
make the assumption that every unit produced can be sold. The binary decision variable x_{s,i} equals one if operator s is assigned to machine i, and is zero otherwise. This model is a simple assignment problem where m jobs are assigned to m individuals (Emrouznejad et al., 2012). In our case, each operator is assigned to one machine, and each machine i gets assigned the appropriate number of operators, r_i.
max \sum_{s} \sum_{i} U_{s,i} \, x_{s,i}

s.t. \sum_{i} x_{s,i} = 1 for every operator s,

\sum_{s} x_{s,i} = r_i for every machine i,

x_{s,i} \in \{0, 1\}.
By solving the above linear programming problem, we can assign individual operators to
specific machines to optimize the departmental revenue over the planning horizon.
4.4. Empirical Study
Once again, we use the case study of Alpha, introduced in Section 3.6. In addition to the
case study details previously provided, the following facts are pertinent to the discussions in
this chapter:
▫ Product demand is high and all gears produced can be sold. Therefore, the decision
maker should aim to produce the highest number of gears, while considering the sale
price of each gear.
▫ The machine operators are assigned to the machines on a “random” basis. The operators
have transferred into Alpha at different times and each one was assigned to a free
machine when they first joined. There was no particular policy in the assignment. They
do not switch machines in the duration of our study. There is no personnel turn-over in
Alpha during our analysis. As an example, the operator who is assigned to the Ring gear
machine works on the Ring gear machine every week.
▫ There are nine operators on the three shifts and the three Kappa machines. Operators
are assigned to machines, but not to shifts. They go through a weekly shift rotation.
▫ The Kappa machines are almost identical, the only difference being the external tooling for the gear they produce. The operational procedures are the same and, if a DM decides to transfer an operator from one line to the next, the adjustment time for the operator is negligible.
▫ There are no union rules prohibiting management from transferring operators from line
to line and the operators have no preference for a particular Kappa machine.
4.4.1. Predicting Output in Terms of Human-related Factors
The production data for one of the machines, Pinion, is incomplete. Therefore, it is
omitted from the output analysis. Its data is complete for the operator skill assessments;
therefore, it is included in the analysis of operator skills where overall averages are
calculated. Table 4.1 displays the coefficients of the variables that form the regression
equations for each of the machines. Based on Eq. (1), the general form of the regression
equation for Alpha is the following:
\hat{y}_i = \beta_{i,0} + \beta_{i,E} E + \beta_{i,S} S + \beta_{i,A} A + \beta_{i,ES} E S + \beta_{i,EA} E A + \beta_{i,SA} S A .
Coefficient estimates not significantly different from zero for all machines have been
omitted. Similar to our model building approach in Chapter 3, we primarily use a backward
selection model and complement it with AIC. Components of skill (Experience, E; Social, S;
Analytical, A) and their interactions are the only ones appearing in the regression models;
shift indicators are not found to be significant. To validate the obtained equations, we have
looked at R² values for assessing model fit and checked for linearity of residuals, homoscedasticity, independence of errors, influential cases, and perfect multicollinearity. We also split the data to cross-validate the models. The analysis appears
in Appendix D. Details of the data set appear in Appendix G.
Machine   Intercept   E       S       A       E·S     E·A     S·A
Driven    18.664      0       0.051   0.214   0       0       0
Drive     29.935      0       0       0.086   0.001   0       0
Ring      23.721      0.090   0       0       0       0       0.001

Table 4.1: Regression Equation Coefficients, significant at p < 0.01
4.4.2. Learning Curves
The operators at Alpha have skill assessments on a quarterly basis in the nine-month
duration of our study. Based on these assessments, learning curves for each operator’s skill
component are developed by fitting curves to the historical skill scores. At the time of this
analysis, each operator has been assessed three times. Each assessment round consists of
two expert evaluations which are done one to three weeks apart. Therefore, for each
individual, we can fit a curve using six points.
For the analytical skill, we expect everyone to learn and improve their technical skills as
they spend more time working on the machines. There are a few operators who have already achieved the highest level of expertise, as measured by our data collection tools. For these operators, the curve is replaced by the constant score of 100. The power form is the form we have chosen for the learning curves as it is the most common one in the literature for groups (Argote et al., 1995).
In general, for an environment like that of Alpha, where all operators are between 40
and 60, it is best to consider the effect of aging. Learning may not occur as rapidly as it
might in an environment with a much younger workforce. There may also be quicker
forgetting effects, especially if an operator is on an extended leave for medical reasons.
However, in our analysis, we assume forgetting is negligible because the operators are
working on the machines every week. We also assume the trend in learning is continuous
along the path of a power curve. It may stabilize and plateau, but this is the case with
operators of all age groups.
As mentioned in the skill assessment process in Section 3.6.1, the twelve operators are
assessed twice within a month during each of the three previous quarterly rounds of skill
evaluation. The validity of the power form of the learning curves for analytical skill is reinforced by fitting a curve to the average score of all operators in each week. Each point on the graph presented in Figure 4.4 represents the average of all operators who were assessed in that particular week.
We also show the curve fitted to these points, as well as the resulting R2.
Figure 4.4: Learning curve for aggregate analytical skill scores of all operators over all weeks
[Figure 4.4 fitted curve: z = 10.368 m^{0.2803}, R² = 0.7709; axes: Analytical Skill Score versus hours of production.]
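Because the power form z = a·m^b is linear in log–log space, its parameters can be recovered with an ordinary least-squares fit on the logarithms. A sketch using points generated from the aggregate curve of Figure 4.4 (the sample m values here are arbitrary):

```python
import numpy as np

m = np.array([120.0, 240.0, 480.0, 720.0, 960.0, 1200.0])  # production hours
z = 10.368 * m ** 0.2803         # scores lying on the fitted aggregate curve

# log z = log a + b * log m, so a linear fit in log-log space recovers (a, b)
b, log_a = np.polyfit(np.log(m), np.log(z), 1)
a = np.exp(log_a)
print(round(a, 3), round(b, 4))  # → 10.368 0.2803
```

With real assessment scores, the residual scatter around the line gives the R² reported in the figure.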
The non-negative nature of the learning curves for the analytical skill is not the case with
social interactions. In the case of this skill component, some operators experience a
downward trend. It seems that an operator may get to a certain knowledge level where he
believes no fellow operator can be of assistance in his trouble-shooting efforts. In this case,
his social interaction score will actually be lower than when he had a lower analytical
and/or experience score. The literature is very sparse on works that look at the relationship
between work-related communication and operator skill. There are prior works, such as
Allen (1977), that describe the diminishing of work-related communication with increased
physical distance of the operators. The closest works we found in the literature on the relationship between work-related communication and operator skill are by Woodman et al. (1993) and Singh and Fleming (2010). In their analyses, they describe the relationship
between team performance and time as an inverted U. Initially, the operators learn from
each other and the creativity and expertise of the whole group is increased as a result. After
a point though, the operators become clones of each other. A downward trend begins due
to a loss of diversity (same views) and the fact that group identity rejects outside views. In
our case, we state that as time goes on, the operators gain expertise. Therefore, the
relationship described in the two abovementioned works is indirectly applicable to us. Even
though this description of the effect of communication on performance is at the team level,
it may be transferable to the social abilities of the individual operator as well.
When the data for all operators is grouped, the pattern is that of a nearly-zero-sloped
straight line (Figure 4.5). As time on the job progresses, there are individuals who interact
more. But the social skills of these operators seem to be counteracted by other operators
whose social interaction score diminishes as they spend more time on the job. Unlike the
curves for the analytical skill, we do not have a common form we can use for the social
interaction scores. This makes intuitive sense as social interaction has its roots in the
personality of the operator and all operators are different. In addition, the quality of our curve fitting is not as high as for the analytical skill, with lower R² values for the curves.
Figure 4.5: Learning curve for aggregate social skill scores of all operators over all weeks (Social Interaction Score versus weeks of production)
The learning curves obtained for Alpha’s fourth quarter appear in Table 4.2. As the
experience level can be obtained directly from employee records of the operator’s duration
in Alpha, as well as his overall exposure to gear manufacturing processes, no curve is
associated with this component. The independent variable, m, is in terms of hours of
production since the start of the study.
[Table 4.2: Learning Curves of Operators' Skill Components. Operator experience levels: 1: 100.00, 2: 71.43, 3: 57.14, 4: 100.00, 5: 71.43, 6: 71.43, 7: 100.00, 8: 85.71, 9: 57.14. The fitted social interaction and analytical skill curves, and their R² values, are not reproduced here; operator 1's analytical score is the constant 100.]
4.4.3. Revenue Model Using Regression Equations and Learning Curves
To further clarify our method of incorporating learning curves and production forecasting
models into a revenue model, we consider operator 8 working on the Drive line and
determine his expected revenue over the fourth quarter.
▫ Per Table 4.1, the regression equation for the Drive line is as follows:
\hat{y} = 29.935 + 0.086 A + 0.001 E S, where E : experience level; S : social interaction; and A : analytical skill.
▫ We are interested in the production hours over the fourth quarter. In each quarter, the hours are: 8 hours/shift × 5 days/week × 12 weeks = 480 hours/quarter. Therefore, by the end of the third quarter, we have gone through 1440 hours.
▫ Using the learning curves for operator 8 from Table 4.2, and the sale price P_{Drive} = $90, we arrive at the expected revenue for the fourth quarter: U_{8,Drive} = $1,816,660, the value appearing in Table 4.3.
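As a rough cross-check of the order of magnitude, one can hold operator 8's skills fixed at their start-of-quarter values (Table 4.4) and skip the learning curves entirely. Note the interaction term is assumed here to multiply E and S; the ordering of the interaction columns in Table 4.1 is a reconstruction.

```python
E, S, A = 85.71, 57.29, 80.15                # operator 8, Table 4.4
hourly = 29.935 + 0.086 * A + 0.001 * E * S  # Drive regression, Table 4.1
static_revenue = 90 * 480 * hourly           # $90/unit over 480 hours
print(round(static_revenue))                 # → 1803091
```

This static estimate of roughly $1.80M sits just below the $1,816,660 obtained with the learning curves included, consistent with learning adding output over the quarter.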
4.4.4. Optimization Model and Discussion
Each machine runs three eight-hour shifts per day and thus requires three operators.
There are nine operators to be assigned to the three machines. Using Eq. (2), the operators' expected revenue on each machine forms the objective function coefficients in the mathematical programming model. We have also categorized the operators into three groups, "High" (H), "Medium" (M), and "Low" (L), based on the ranking of their total production output. This is expressed in Table 4.3, along with the categorization of the operators.
Operator Driven Drive Ring Total Rank Category
1 2,300,460 1,961,640 1,710,470 5,972,570 2 High
2 1,932,940 1,758,290 1,599,590 5,290,820 7 Low
3 1,662,920 1,581,020 1,421,930 4,665,870 9 Low
4 2,305,860 1,965,710 1,714,320 5,985,890 1 High
5 2,248,420 1,854,290 1,577,260 5,679,960 5 Medium
6 1,900,750 1,687,530 1,432,420 5,020,700 8 Low
7 2,235,670 1,881,730 1,633,030 5,750,430 4 Medium
8 2,082,570 1,816,660 1,565,930 5,465,170 6 Medium
9 2,340,990 1,881,160 1,629,640 5,851,780 3 High
Table 4.3: Operators’ Expected Quarterly Revenues
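At this scale, the assignment can also be verified by brute force: choose three operators for each machine and take the partition with the largest total of the Table 4.3 revenues. A sketch in pure Python (machine order in each revenue tuple is Driven, Drive, Ring):

```python
from itertools import combinations

# Expected quarterly revenue per operator on (Driven, Drive, Ring), Table 4.3
U = {
    1: (2300460, 1961640, 1710470), 2: (1932940, 1758290, 1599590),
    3: (1662920, 1581020, 1421930), 4: (2305860, 1965710, 1714320),
    5: (2248420, 1854290, 1577260), 6: (1900750, 1687530, 1432420),
    7: (2235670, 1881730, 1633030), 8: (2082570, 1816660, 1565930),
    9: (2340990, 1881160, 1629640),
}

ops = set(U)
best_total, best_plan = 0, None
for driven in combinations(sorted(ops), 3):          # three operators per machine
    for drive in combinations(sorted(ops - set(driven)), 3):
        ring = tuple(sorted(ops - set(driven) - set(drive)))
        total = (sum(U[o][0] for o in driven) + sum(U[o][1] for o in drive)
                 + sum(U[o][2] for o in ring))
        if total > best_total:
            best_total, best_plan = total, (driven, drive, ring)
print(best_total, best_plan)
```

The maximum found this way is $17,027,410, matching the roughly $17.03M optimum reported in Table 4.5 up to rounding of the tabulated values.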
4.4.5. Optimal Solution Compared to Solutions of Other Methods
The hourly production output calculated from each machine’s regression equation is
directly proportional to the skill of the operators assigned to that particular machine. Three
operators are assigned to each line to cover the three shifts. The scores for the skill
components of each operator appear in Table 4.4.
Line Operator Experience Social Analytical
Driven gear
1 100 67.71 100
2 71.43 70.83 63.64
3 57.14 40.10 48.87
Drive gear
4 100 67.71 97.78
5 71.43 66.67 93.80
6 71.43 41.67 60.27
Ring gear
7 100 53.13 95.85
8 85.71 57.29 80.15
9 57.14 84.38 82.35
Table 4.4: Operators’ Skill Components
To date, the department has not used any decision criteria for its Operator Assignment
(OA). As previously described in Section 4.4, positions have been filled on a random basis.
The department’s quarterly revenue, based on production on 5-day weeks in 12-week
quarters, is calculated using the unit sale prices of $110, $90, and $90 for the Driven, Drive,
and Ring gears, respectively.
The revenue amount resulting from the random assignment is compared to other
assignment policies, including the figure we obtain through our optimization procedure. In
one possible approach, the DM at Alpha can use the current skill levels of the operators to
assign the operators. Under this “simple” assignment policy based on current skill rankings,
if operator A's skill scores are higher than operator B's, the DM assigns operator A to machine 1, which has a higher sale price than machine 2. Using the learning curves of the
operators is unique to our approach and would not play a role in the assignment policy by
the DM in this hypothetical scenario. It is, however, used in our calculation of quarterly
revenue. In obtaining the result of our approach, we use a Linear Programming (LP)
technique to optimize the OA for the goal of revenue maximization. We use the branch-and-bound method, embedded in LINGO, an optimization software application. Table 4.5 displays the results of the various approaches and their comparison to our optimal OA. As can be seen, our OA results in a significant improvement in quarterly revenue.
Assignment Policy Quarterly Revenue ($) Revenue difference with optimal OA
($) (%)
Random 16,232,450 794,950 4.9
Simple skill ranking 16,743,970 283,440 1.7
Optimal OA 17,027,400 - -
Worst case 16,077,900 949,500 5.9
Table 4.5: Comparing system revenue under various OA approaches
We re-run the assignment model, without considering the effect of learning. Skill scores
start at the assessed level at the start of the fourth quarter and remain flat throughout the
entire planning horizon. System revenue is calculated to be $16,542,060. When we
compare this figure to the revenue obtained from the optimal assignment scenario,
$17,027,400, we can see a difference of $485,340, or 2.9% less revenue. This difference in revenue is relatively important and shows that we achieve better results when we are able to include more information in the model. The additional information included in the model in this case is the learning effects.
4.4.6. Sensitivity Analysis
We perform a sensitivity analysis on our model to verify its performance. There are
certain changes we can apply to the various resources used by our model. The resources to
alter are product price, skill score, and available time. Before making each change, we can
have a certain expectation of how the model should behave. We change the resource level
and check the model performance against expectations.
Sensitivity Analysis: Product Price
In our analysis thus far, we have used the sale prices of $110 for the Driven gears, and
$90 for Drive and Ring gears. We perform a sensitivity analysis on our model by changing
the prices and observing the behavior of our model. A low and a high price are selected for each gear: ($90, $120), ($80, $100), and ($70, $90) for the Driven, Drive, and Ring gears, respectively. Table 4.6 shows the resulting OA for the eight scenarios.
There are only two assignment scenarios in which the Ring machine is assigned better operators than the Drive machine, namely when Ring's selling price is higher. Looking at
the regression equations for these two lines, we can see very similar coefficients and,
therefore, similar sensitivity to skill level. But Table 4.3 shows us that Drive’s performance
in general is higher than that of Ring. In most cases, Ring price is lowest among the three
products and, as a result, this machine is assigned the three lowest skilled operators. The
only exceptions to this rule are the two cases where Ring price is higher than Drive price. In
those two cases, the operators commonly assigned to Drive and Ring get their assignment
reversed. The Ring sale price is never higher than the Driven price and, as such, Ring is never assigned the operators assigned to the Driven machine. Therefore, in the case of Ring, the model assignments are aligned with what we may expect.
Driven price  Drive price  Ring price  Driven assignment  Drive assignment  Ring assignment  System Revenue ($)
90            80           70          M-M-H              H-H-M             L-L-L            14,154,100
90            80           90          M-M-H              L-L-L             H-H-M            15,178,500
90            100          70          M-M-H              H-H-M             L-L-L            14,721,400
90            100          90          M-M-H              H-H-M             L-L-L            15,718,500
120           80           70          M-H-H              H-M-M             L-L-L            16,017,400
120           80           90          M-M-H              L-L-L             H-H-M            17,041,600
120           100          70          M-M-H              H-H-M             L-L-L            16,582,800
120           100          90          M-M-H              H-H-M             L-L-L            17,579,900

Table 4.6: Operator Assignment for Different Gear Prices
Furthermore, one might expect the Driven line to get assigned the three “high”
operators when its selling price is at the high end of $120 and Drive and Ring are at their
low end of $80 and $70, respectively. If we proceed with this intuition and assign the three
best operators to the Driven machine, system revenue would be $15,838,467. But this
would not account for the sensitivity of each machine to the different skill components.
Once we apply our model and consider all aspects, we get the assignment presented in
Table 4.6. With our optimal assignment scenario, system revenue is $16,017,400. This is a
higher revenue amount and therefore, the model has properly maximized system revenue.
Sensitivity Analysis: Skill Scores
The values appearing in Table 4.3 are calculated based on the current skill levels and
learning curves (shown in Tables 4.4 and 4.2, respectively). Using the original gear prices,
we change the social interaction score of the operators to observe the differences, if any,
on the optimal OA and system output. There are some operators who have already reached
the maximum score (based on the current criteria) on the other skill components of
experience level and analytical skill. Therefore, the social interaction component is a
suitable one to alter as it can go in either direction at the desired step size of 5% or 10% for
all operators.
The social skill of the three operators in the “high” category is reduced by 10% and that
of the three “medium” operators is raised by 5%. We recalculate the aggregate production
figures for the six operators using the updated Social skill values. Despite the change in the
individual values, the changes are not sufficiently large to modify the ranking of the six
operators, and as a result, the medium and high categorizations of the six operators remain
unchanged. As such, one may expect the operator assignment to also remain unchanged.
As scenario 1, we assume no change in categorization should result in no change in OA.
We recalculate the system revenue using the new values of social skill, but without
changing the operator assignments. But operator categorization is not the only factor that
can affect operator assignment. We have to be cognizant of the other factors, including
production output based on current and forecasted skill, product sale price, and the
sensitivity of the machine to the skill components. Therefore, as scenario 2, we run our
model and we arrive at a modified OA, resulting in higher revenue compared to scenario 1
(Table 4.7). Therefore, the model works well in providing us the assignment that results in
the highest possible revenue. The model has captured and considered all the factors that
should determine the optimal OA.
Original Scenario Scenario 1 Scenario 2
Driven OA 5-7-9 5-7-9 4-5-9
Drive OA 2-4-6 2-4-6 1-2-6
Ring OA 1-3-8 1-3-8 3-7-8
Total Revenue ($) 17,027,400 16,800,960 16,802,360
Table 4.7: System Revenue under various OA Scenarios
It is interesting to note that the particular sensitivity analysis described above can be used in determining the value of various programs for the operators. Examples of such programs can be machine-related technical training to enhance analytical skill, or motivational programs and team-building exercises targeting improved social interaction among the operators. By estimating the effect of any program on the skill component(s) of the operator(s), the initial cost can be compared with the eventual addition to quarterly revenue for a cost-benefit analysis.
Consider the example where Alpha enrolls its nine operators in a team-building exercise
that is estimated to improve social interaction skill of each person by 5%. The total cost of
the exercise is $50,000. Based on the figure of $17,027,400 presented in Table 4.5, if our
model calculates the departmental revenue to be greater than $17,077,400 using the new
social skill scores, the exercise as a project has a payback of less than three months.
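The payback logic can be written down directly; the post-exercise revenue below is an assumed illustrative value, not an output of the case-study model.

```python
base_revenue = 17_027_400  # optimal quarterly revenue, Table 4.5
program_cost = 50_000      # team-building exercise
new_revenue = 17_090_000   # hypothetical model result with improved social scores

uplift = new_revenue - base_revenue
payback_within_quarter = uplift > program_cost  # True → payback < 3 months
print(uplift, payback_within_quarter)           # → 62600 True
```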
Sensitivity Analysis: Available Production Time
When calculating the results of the optimal OA, we used the same duration of five 8-hour
shifts per week, for 12 weeks for all three machines, resulting in a total of 480 hours over
the planning horizon. In a hypothetical scenario, this criterion is changed to 360, 480, and
600 hours for the Driven, Drive, and the Ring gear line, respectively. The Driven gear line is
the one with the highest product sale price. We allow it to run the least number of hours
and observe the effect on OA.
Our model provides a result that makes sense; assignment is updated to provide the
highest system revenue. Under this new scenario, the Ring line has the highest production
time available. Therefore, the worst operator previously assigned to it (from the “low”
category) is replaced by an operator from the “high” category. The operator swap is done
with the Drive line rather than the Driven line because the Driven gear is sold at a higher
price ($110), compared to the Drive line ($90).
This particular sensitivity analysis may be beneficial to the DM for situations when there
is a resource shortage. Examples of such resources may be production raw material or
human operators. In such cases, the DM may want to dedicate the scarce resources to two
machines and shut down the third machine. The decision as to which machine to shut down
can come from an analysis similar to what we have done in this sensitivity analysis.
Consider an example where there is a mandate to reduce manpower by one-third for two weeks. The DM can reduce the production time of one machine by 80 hours, cutting its total quarterly running time from 480 to 400 hours. The total production time for the other two machines remains at 480 hours over the quarter.
Our model provides the system revenue for each of the three scenarios of Driven, Drive, or
Ring running for 400 hours. The DM would then choose the best option accordingly. This is
shown in Table 4.8:
Scenario | System revenue ($)
Driven at 400 hours, Drive and Ring at 480 hours | 15,868,900
Drive at 400 hours, Driven and Ring at 480 hours | 16,112,300
Ring at 400 hours, Driven and Drive at 480 hours | 16,127,500
Table 4.8: System Revenue under different production times
As can be seen, in this case, it makes economic sense for the DM to shut down the Ring
machine for two weeks (80 hours) and keep Driven and Drive for the entire duration.
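The choice among the three curtailment scenarios reduces to picking the maximum of the revenues in Table 4.8; a minimal sketch:

```python
# System revenues from Table 4.8: each machine run at 400 hours, the others at 480.
scenarios = {
    "Driven at 400 h": 15_868_900,
    "Drive at 400 h": 16_112_300,
    "Ring at 400 h": 16_127_500,
}

# The economically best machine to curtail is the one whose scenario keeps revenue highest.
best = max(scenarios, key=scenarios.get)
print(best, scenarios[best])  # Ring at 400 h 16127500
```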
4.5. Concluding Remarks and Future work
In this chapter, we have discussed the personnel assignment problem and developed a new methodology to solve it using a linear programming approach. The result of the mathematical programming model is superior to the current first-come-first-assigned practice. In developing the mathematical model, we use output
forecasting models in the form of regression equations stemming from our case study’s
data set. In these equations, output is expressed as a function of operator skill. Skill is
categorized into various components and quantified by consulting system experts in
questionnaires and through observational studies. We initially considered another human-
related factor, the effect of shift-work on the production output, in the regression analysis.
However, this effect was not found to be significant in any of the three product lines.
In addition to forecasting the production output in terms of skill, we use operators’
learning curves for each of the skill components. The learning curves are projected over the
planning horizon and incorporated into the production forecasting model. This combination
is used in the objective function of our linear programming model. The result achieved by
our optimal OA is compared to the random assignment currently in practice and a simple
linear programming technique, not considering the forecasted skill levels. Our method is
found to result in higher revenues than the other two methods. For validation purposes, a sensitivity analysis is performed in which three inputs, product selling price, skill score, and production time, are varied. In each case, the model adjusts the OA to maximize system revenue. While the main focus of the sensitivity analysis is model validation, we also discuss ways the proposed analyses can be used for planning and decision making.
Other than the general future work of considering additional HR factors, a possible extension is to apply the approach of this chapter to other case studies in different environments. These can be non-manufacturing
environments, such as project management, where tasks are non-repetitive. In addition,
one can consider manufacturing environments where there are different machine types.
This can lead to the additional complexity of the personnel having different learning curves
on different machines. The combination of non-repetitive tasks with the presence of
different task types can lead to our consideration of operator forgetting in addition to
operator learning.
In addition, the same methodology may be pursued with objectives other than revenue maximization, such as total maintenance cost. Machine failure rates that differ based on the human operator would lead to differing maintenance costs, and this can be a part of the objective function. As further future work, one may consider a social experiment where the
operators are assigned to the machines based on their preference rankings. Over the course of the planning horizon, the results of the preference-based assignment can then be compared to the optimal assignment obtained using the model developed in this chapter.
Lastly, an important future work can be the determination of the length of the planning
horizon to consider. In the work presented in this chapter, we have assumed the DM can
determine the length of the planning horizon. In the empirical study, the duration is taken
to be three months so that it is aligned with the quarterly operator assessment cycles.
However, there can be further work to provide a systematic tool for choosing an optimal
length for the planning horizon. Factors related to machine performance or operator
learning can be used to determine this optimal length.
5. EFFECTS OF OPERATOR LEARNING ON PRODUCTION OUTPUT
In chapter 3, we analyzed a system where HR factors can affect the failure risk of the
system, and developed a model for a DM to perform a cost-benefit analysis to choose the
best intervention method for mitigating failure risk. In this chapter, we consider the same
type of human-machine system where failures can occur as a result of operator error as
well as the machine’s physical components. The purpose of our work is to aid the DM with
planning activities by providing him/her with the expected production output of each
operator. Examples of such planning activities are operator assignment, calculating the
upper bounds on production materials required, estimating maintenance resources, or
establishing operator training programs.
For a human-machine system, as operators learn, there may be two benefits: 1)
improved production output rate, and 2) reduced human error rate. Both of these factors
result in higher performance of the system. The study of the effects of human learning on
the performance of the human-machine system has received much attention; studies such
as Yelle (1979) and Dutton and Thomas (1984) state that the time to produce a single unit
continuously decreases with the processing of additional units. Operator learning certainly plays a role in manufacturing environments, and learning effects have been demonstrated in many empirical studies (e.g. Venezia, 1985; Webb, 1994). Considering the learning of the
individual operators and using it as a variable in performance optimization can have many
benefits for an organization. Onkham et al. (2012) discuss the benefits an organization may
realize by providing training to the employees, resulting in increased skill and knowledge as
well as reducing human error. This negative relationship between knowledge and failure
rate is also discussed by Fritzsche (2012).
We aim to forecast the production output of an operator over a planning horizon,
considering the effects of learning, among other factors. The operator gains in expertise
based on the amount of time he has spent working on the machine, while the machine is
operational. If the machine is down, he is not working and consequently not gaining
experience. Therefore, his learning curve is proportional to machine uptime. Furthermore,
as the operator gains in expertise, he is less likely to make mistakes; therefore, the
probability of success (or failure) at various stages along the planning horizon is not
constant. This probability can be calculated at each stage based on the appropriate value of
operator-related factors, including those affected by learning. The analysis is performed in
intervals and the probability of failure at each interval depends on the values of the MR-
and HR-factors at the previous interval. This is the reason we choose a Markov chain
approach. A Markov chain is a stochastic process that possesses the Markov property: when
we know the present state of the process, the future development is independent of
anything that has occurred in the past (Rausand and Hoyland, 2004). In our case, we have to use a non-homogeneous Markov chain, as operator expertise, learning, and working conditions are functions of time.
Our Markov Chain model quantitatively captures the positive effects of learning both in
terms of raised skill, leading to increased output, as well as reduced human error, leading to
decreased machine downtime. In general, machine downtime can be caused by MR or HR
factors. There are many reliability and failure risk analysis models that deal with the
machinery. However, there are few that incorporate the role of human operators on
uptime and overall performance. To obtain the probability of machine failure due to both
MR- and HR-factors, we use the PHM, due to the merits and applications discussed in
Section 3.3.
There can be numerous applications for calculating the expected value of an operator’s
production output, considering the effects of learning. Similar to Chapter 4, we consider
operator assignment as one possible application of the discussions in this chapter. We can
calculate the expected production output for each operator on each machine and use the
results as input factors into the objective function of an operator assignment programming
model. By optimizing this model, we maximize the system performance, thus maximizing
system revenue.
Connection to Previous Chapters
This Chapter uses elements from both Chapters 3 and 4. In order to calculate the
probability of machine failure at the various stages along the planning horizon, we use a
PHM which includes HR covariates. This follows from our work in Chapter 3. Over the length
of the planning horizon, we do not expect the HR factors such as skill to remain static; we
use learning curves to capture the effect. Lastly, our model provides us with the expected
number of operational time intervals the machine will experience. We then use regression
equations to forecast the level of output based on the number of operational intervals. The
learning curves and the regression equations follow from our work in Chapter 4.
Unlike Chapter 3, and similar to Chapter 4, the work presented in this chapter has a long-term focus for decision-making. The empirical work presented at the end of the chapter has a planning horizon of three months.
Main Contribution of this Chapter
1. We use a Markov chain approach to forecast production output, considering operator
learning.
2. The work we present can add value at the interface of operations management and human resource management. This method can provide a return-on-investment analysis comparing training cost with additional revenue. The additional revenue would be gained as a result of more output produced by the operators whose skill is improved through training.
5.1. Literature Review
When one aims to analyze the effects of human-related factors on the failure risk of human-machine systems, there are numerous studies, such as those using human reliability analysis techniques (Cacciabue, 2005; Chang and Wang, 2010) or failure modes and effects analysis (Pillay and Wang, 2003; Seyed-Hosseini et al., 2006). However, when this analysis is
to be used for equipment uptime or system performance analysis rather than just risk
management, the literature is sparse. Horberry et al. (2010) discuss human factors and their
effects on operations and maintenance in a mining context but do not attempt failure
prediction. Similarly, Kolarik et al. (2004) develop a model to monitor and predict an
operator’s performance using a fuzzy logic-based assessment. But the purpose of their work
is solely to provide a human reliability assessment, without providing any methods for risk
reduction. Blanks (2007) discusses the need for improving reliability prediction, paying
special attention to human error causes and prevention, but does not mention any
predictive techniques for human reliability. Carr and Christer (2003) and Dhillon and Liu
(2006) focus on the maintenance workforce performing repair work at times when
machines are not being used for production purposes. Reer (1994) discusses human
reliability in emergency situations, not regular production. There are works in the literature, such as those by Peng and Dong (2011) and Iakovou et al. (1999), that use a Markov chain approach for uptime prediction. However, none use human-related factors in their failure
risk analysis and the calculation of transition probabilities. Therefore, the focus of all
aforementioned works differs from ours in that we aim to predict the uptime of production
equipment, based on the analysis of the risk of failure stemming from the human operator.
In addition to the scarcity of previous works analyzing the performance of human-machine systems from a human perspective, there are even fewer that do so while considering human learning. Biskup (2008) performs a state-of-the-art review on the effects
of learning on production scheduling. Li and Cheng (1994) and Teyarachakul et al. (2011),
also study production scheduling and consider the effects of both learning and forgetting.
But neither study has a focus on failure risk analysis; nor do they focus on effects of learning
on decreased human error rate and improved system performance.
Malhotra et al. (1993) discuss the role of learning on system performance. But the
discussions are based on optimizing the cross-training of employees to strike a balance
between flexibility and throughput loss due to forgetting and training. Similar to the
aforementioned studies in the area of scheduling, this study does not have a detailed focus
on failure risk analysis or increased production output as a result of operator learning.
Similarly, there are works by Nembhard and Norman (2002), Leopairote (2003), and Vidic
(2008), that discuss the effects of work-sharing and job-rotation on operator learning and
forgetting. But these models aim to optimize operator learning and forgetting as a result of
work-sharing and job-rotation. Unlike our scope, the aforementioned works do not attempt
to predict equipment uptime using a failure risk analysis based on human-related factors.
Nor do they use learning curves for skill levels of individual operators. Adamides et al. (2004) consider human learning, among other factors, to lead to increased productivity of
maintenance activities. However, they do not distinguish among the various individuals and
consider the same rate of productivity improvement for all individuals based on a certain
duration working on the particular product.
5.2. Markov Chain Approach
We aim to calculate the expected production output of an operator on a machine, over a
planning horizon. To achieve this, we need to know the possible states of the system at the
end of the planning horizon, along with the state probabilities and their corresponding
production output levels.
To make the problem tractable, we take the simplifying yet realistic step of discretizing the planning horizon into individual time intervals; the intervals are then analyzed using a non-homogeneous Markov chain. For each of the resulting N intervals, the
probability of failure/survival is calculated. We start from an initial condition with a certain
machine working age and an initial level of operator expertise. At the first interval, there
can only be one of two outcomes: failure or survival. There is a transition probability
associated with each outcome and this probability is calculated using a PHM. Once again,
the fact that the model accommodates the inclusion of HR covariates, in addition to MR
factors, makes the PHM a suitable model for our analysis.
If the machine survives one interval, it reaches the beginning of the next one and again
faces the two outcomes of survival or failure. However, if the machine fails over an interval,
it must remain in repair for a certain period of time, D. We make two assumptions about D.
The first is a simplifying assumption, stating that D is fixed, regardless of the type of failure.
A realistic example is to set D equal to the MTTR. The second assumption is that the
machine is brought back to zero-age following the repair. Therefore, in the general
framework of our approach, multiple states are possible at each time interval. Each state
can lead to at least one path, for repair, and at most two paths, for failure/survival. There is
a transition probability associated with the paths leading from each state.
Hence, the state space of the Markov chain is represented by a three-dimensional vector (i, a, d), where i is the cumulative number of operational time units, a is the machine working age, and d is the remaining repair time. Given the range of these variables, as well as the fact that in this chapter we are analyzing a specific planning horizon, both the parameter space and the state space are discrete and finite. For an interval n, 1 ≤ n ≤ N, the values of the three variables are considered at the instant in time immediately before n. This is further explained in the next paragraph. The quantity n, 0 ≤ n ≤ N, is a discrete indexing parameter and can be thought of as the machine's global age, calculated from the start of the planning horizon. There is a set of initial conditions that apply at n = 0.
The first variable, i, expresses the cumulative number of time units that the machine has
been operational since the start at n = 0. This determines the appropriate PHM covariate
level, used to calculate the hazard rate. Only covariates whose values can be determined by
a degradation-type model can be considered in our model. In the case of operator
expertise, the values are forecasted using learning curve equations. It is this effect of learning, reducing the human error rate over time, that makes our process a non-homogeneous Markov process. The second variable, a, provides the machine working age,
again a necessary factor in the calculation of the hazard rate. At each failure, a is reset to
zero and remains at zero during repair time. The last variable, d, states the remaining repair
time. A positive d indicates the continuation of the repair procedure; the machine will not
be operational over the next time unit. According to the convention we have chosen, the
range of values for d is 0 ≤ d ≤ D − 1. If we experience a failure over the interval ending at n, by the time we reach the end of the interval at n, we have already had 1 time unit of repair completed. This is due to our assumption that the failure occurs at the beginning of the interval (n − 1, n]. Therefore, D − 1 time units remain. This approach slightly underestimates production
because the production output up to the instant of failure within the time interval is
ignored. An alternative approach is possible where we may consider the failure to occur at
the end of the time interval. This approach however, slightly overestimates the production
output. In this chapter, we choose the former approach and consider the effect to be
insignificant when the individual time interval chosen is short.
EFFECTS OF OPERATOR LEARNING ON PRODUCTION OUTPUT
127
The possible transitions for the individual variables are: i → i or i + 1; a → a + 1 or 0; and d → d − 1, D − 1, or 0. However, not all combinations of these individual transitions are possible at any given n. Figure 5.1 displays a state space diagram for an example where D = 3, for a planning horizon of N = 7. As evident, each state can be sufficiently described by relying strictly on the previous state(s) leading to it.
Figure 5.1: Typical state space for D = 3 and N = 7 (nodes are states (i, a, d); transitions are labeled Survival, Failure, or Repair)
The state probability at n + 1 is based on: 1) all the states at n that can lead to the state at n + 1; and 2) the paths from all these previous states to n + 1. At the initial moment, for simplicity, we assume an operational machine, with zero working age, and operators with an initial set of skills but no work experience. The more general case, where i and a are non-zero, is presented in Appendix F.
Let p_n(i, a, d) be the probability of stage n having i operational intervals thus far, on a machine that has a current working age of a, with d time units remaining in the repair interval. The initial condition at n = 0 is expressed as follows:
p_0(i, a, d) = 1 if i = a = d = 0, and 0 otherwise.
We can progress recursively to calculate the state probability at each future state. The
recursive function expresses the state and the transition probability. The transition
probability of failure, q, is calculated using the PHM discussed in the previous Section. The
factors pertinent to the PHM are the number of operational time units, i, (affecting
operator skill), as well as working age, a. Global age, n, does not play a direct role in the
calculation of the hazard rate. As a result, the probability of failure while transitioning from a state at n, with conditions i and a, to a state at n + 1, is expressed as q(i, a). The recursive formula representing the transition probabilities of the proposed Markov chain is as follows:
p_{n+1}(i, a, d) =
  p_n(i, 0, d + 1),                              if a = 0 and 0 ≤ d ≤ D − 2 (repair in progress);
  Σ_b q(i, b) · p_n(i, b, 0),                    if a = 0 and d = D − 1 (new failure);
  [1 − q(i − 1, a − 1)] · p_n(i − 1, a − 1, 0),  if d = 0 and a ≥ 1 (survival).
where q(i, a) is the probability of failure over one time interval, calculated as follows using the discussions in Section 3.4:

q(i, a) = 1 − exp( −∫_a^{a+1} h(t, z(i)) dt ),

in which h(·) is the PHM hazard rate and z(i) is the vector of covariate values after i operational time units.
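The one-interval failure probability q(i, a) can be approximated numerically from the PHM; the sketch below assumes a Weibull baseline hazard and a single covariate whose value decreases with learning. All parameter values are illustrative assumptions, not Alpha's fitted coefficients:

```python
import math

def hazard(t, z, beta=2.0, eta=500.0, gamma=(0.02,)):
    """PHM hazard rate: Weibull baseline times the exponential covariate term.
    beta, eta, and gamma are illustrative values, not fitted ones."""
    h0 = (beta / eta) * (t / eta) ** (beta - 1)
    return h0 * math.exp(sum(g * zk for g, zk in zip(gamma, z)))

def q(i, a, z_of_i, steps=100):
    """Probability of failure over working ages (a, a+1], with covariates
    held at their values after i operational units (midpoint-rule integration)."""
    dt = 1.0 / steps
    integral = sum(hazard(a + (s + 0.5) * dt, z_of_i(i)) * dt for s in range(steps))
    return 1.0 - math.exp(-integral)

# Hypothetical covariate (e.g. an error-proneness score) that shrinks as i grows.
z_of_i = lambda i: (max(0.0, 50.0 - 0.03 * i),)
print(q(0, 0, z_of_i), q(0, 400, z_of_i))
```

With an increasing baseline hazard, q grows with working age a and shrinks as learning lowers the covariate, which is exactly the behavior the non-homogeneous chain relies on.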
The first term of the recursive formula represents the 'repair scenario'. In this scenario, the working age, a, is reset to zero and remains at zero until the repair is completed and the machine is operational again. During the repair period, which lasts D time units, the value of d is reduced by 1 time unit as the machine progresses over subsequent stages.
The second term of the recursive formula represents the 'failure scenario'. This is where the machine starts a new repair interval. As such, at stage n + 1, the working age, a, is set to zero and the repair time remaining is set to D − 1. This means the machine was operational just before stage n + 1: it can be in any state (i, b, 0) with d = 0, with probability p_n(i, b, 0), and it fails over (n, n + 1] with probability q(i, b).
The last term of the recursive function represents the 'survival scenario'. The fact that the machine is operational at n + 1, with a ≥ 1 and d = 0, means that in the previous stage, n, the machine was operational or it had just finished a repair interval. Therefore, the state probability at n is p_n(i − 1, a − 1, 0), and the machine did not fail over (n, n + 1], with probability 1 − q(i − 1, a − 1).
We continue to use the recursive equations over the entire planning horizon until we
calculate the probabilities of all possible states at stage N, the end of the planning horizon.
We have obtained some properties of the recursive formula that can reduce the
calculations required over the entire horizon. These properties are presented in Appendix E.
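Under the assumptions above (fixed repair duration D, zero-age restoration, failure at the start of the interval), the forward recursion can be sketched as follows. The failure-probability function q here is a hypothetical stand-in for the PHM-based one, and D and N are small illustrative values:

```python
from collections import defaultdict

def state_probs(N, D, q):
    """Forward recursion over states (i, a, d): i operational units so far,
    working age a, remaining repair time d. Returns the distribution at stage N."""
    cur = {(0, 0, 0): 1.0}  # initial condition: operational machine, no experience
    for _ in range(N):
        nxt = defaultdict(float)
        for (i, a, d), p in cur.items():
            if d > 0:
                nxt[(i, 0, d - 1)] += p                  # repair continues
            else:
                qf = q(i, a)
                nxt[(i, 0, D - 1)] += p * qf             # failure: D-1 repair units left
                nxt[(i + 1, a + 1, 0)] += p * (1 - qf)   # survival: i and a advance
        cur = dict(nxt)
    return cur

# Illustrative q: failure chance grows with working age, shrinks with learning.
q = lambda i, a: min(0.9, max(0.0, 0.02 + 0.01 * a - 0.001 * i))
pN = state_probs(7, 3, q)
print(sum(pN.values()))  # the state probabilities at N sum to 1
```

Each state with d = 0 splits its mass between the failure and survival branches, and repair states pass their mass forward unchanged, so total probability is conserved at every stage.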
At the final stage, n = N, we are interested in the expected output value at each level of i. This is calculated by multiplying the state probability and the output level at the particular i value. The output level, given i operational time units, can be calculated by a forecasting method, such as regression. In doing so, operator characteristics can be considered as independent variables in the regression equations. The machine's production output level, o_i, represents the dependent variable, given expertise gained over i previous operational
time units. The quantity o_i represents the production output for a single time unit only. If there is no learning involved, o_i can be represented by a constant value, c. We introduce O_i to represent the total production output over i operational time units, O_i = Σ_{j=1..i} o_j. Then, in the case of a constant value o_i = c, the total expected output over the horizon, E[O], would be the product of c and the expected number of total operational time units over the planning horizon, E[I]; that is, E[O] = c · E[I].
But we do consider operator learning in our analysis. This means we may have cases where o_i ≠ o_{i'} for i ≠ i'. To calculate each o_i, we use the learning curves to forecast the expertise level for the appropriate number of operational time units, i, that has helped the operator gain knowledge.
At N, we are only interested in i. The probabilities of the possible states for each i are summed: P_N(i) = Σ_{a,d} p_N(i, a, d). Then we can calculate the expected number of operational time units over the planning horizon: E[I] = Σ_i i · P_N(i). This probability distribution, P_N(i), along with the output level O_i, can be used to calculate the expected total output value for a particular operator on a certain machine, over the planning horizon: E[O] = Σ_i O_i · P_N(i). It should be noted that in addition to calculating the expected value, a decision maker may be interested in other characteristics, such as the variance of each operator's production over the horizon, Var[O] = Σ_i O_i² · P_N(i) − (E[O])². A smaller variance is certainly desirable for planning purposes. All else being equal, an operator with a lower production variance is more desirable due to the resulting stability.
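Given the marginal distribution over i at stage N and the cumulative outputs, the expectation and variance described above are straightforward sums; a sketch with made-up illustrative numbers:

```python
def summarize(P, O):
    """Expected operational units E[I], expected output E[O], and Var[O]
    from a marginal distribution P[i] and cumulative outputs O[i]."""
    EI = sum(i * p for i, p in P.items())
    EO = sum(O[i] * p for i, p in P.items())
    VarO = sum(O[i] ** 2 * p for i, p in P.items()) - EO ** 2
    return EI, EO, VarO

# Hypothetical distribution over operational units, with a constant
# per-unit output of 5 parts (no learning), so O_i = 5 * i.
P = {0: 0.1, 1: 0.3, 2: 0.6}
O = {i: 5 * i for i in P}
EI, EO, VarO = summarize(P, O)
print(EI, EO, VarO)  # with constant output, E[O] = c * E[I]
```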
The calculation of E[O] may be done in an alternate way. The aforementioned calculation method uses O_i, which is a cumulative term. At times, it may be easier, or more intuitive, to use the individual o_i. When we think of the discretized planning horizon, we see that E[O] takes on the following form:

E[O] = Σ_i O_i · P_N(i) = Σ_i ( Σ_{j=1..i} o_j ) · P_N(i).

This can be expressed as follows:

E[O] = Σ_j o_j · S_N(j), where S_N(j) = Σ_{i ≥ j} P_N(i)

and S_N(j) represents the probability of running at least j intervals. For example, the term o_2 · S_N(2) represents the production of parts in the second time interval, multiplied by the probability of running at least two time intervals. For the case with non-zero initial conditions, please refer to Appendix F.
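The equivalence of the cumulative and per-interval formulations can be verified numerically; the per-interval outputs and the distribution below are arbitrary illustrative values:

```python
# Per-interval outputs o_j (differing across intervals, as under learning)
# and a marginal distribution P_N(i) over total operational intervals.
o = {1: 4.0, 2: 3.0, 3: 2.0}
P = {0: 0.2, 1: 0.3, 2: 0.4, 3: 0.1}

# Cumulative form: E[O] = sum_i O_i * P(i), with O_i = sum_{j<=i} o_j.
O = {i: sum(o[j] for j in range(1, i + 1)) for i in P}
EO_cum = sum(O[i] * P[i] for i in P)

# Per-interval form: E[O] = sum_j o_j * S(j), with S(j) = P(run at least j).
S = {j: sum(P[i] for i in P if i >= j) for j in o}
EO_int = sum(o[j] * S[j] for j in o)

print(EO_cum, EO_int)  # the two formulations agree
```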
5.3. Empirical Study
Once again, we apply the discussions in this chapter to the case study of Alpha. There can be many applications for the model developed in this chapter, such as production output maximization and maintenance cost minimization. Similar to the discussions in Chapter 4, we consider the application of operator assignment optimization, with the aim of production output maximization. The implementation of the Markov chain model, and the resulting operator assignment, is a more detailed, and thus more accurate, method to follow for operator assignment.
In Section 3.6.6 in Chapter 3, using the extended data set of Alpha, we obtained the
following PHM:
.
In addition to the PHM obtained earlier, we obtained regression equations in Chapter 4 to forecast the production output in terms of operator skill level. We express Eq. (1) in Chapter 4 as follows, in order to incorporate the role of i, one of the dimensions of our Markov chain states:

y_{k,i} = β_{0,k} + Σ_u β_{u,k} x_{u,i} + Σ_{u,v} β_{uv,k} x_{u,i} x_{v,i},

where

y_{k,i}: output per unit time for machine k, given i operational time units
k = 1, …, K: K is the number of machines
β_{u,k}: coefficients of the main effects, as applicable to machine k
β_{uv,k}: coefficients of the interaction terms, as applicable to machine k
u and v: indices iterating through the various operator characteristics considered, u, v ∈ {1, …, l}, where l is the total number of operator characteristics considered
x_{u,i}: the value of operator characteristic u after i operational periods
The actual regression equations remain unchanged, as expressed earlier in Table 4.1 and repeated here:

Machine | β0 | β1 | β2 | β3 | β4 | β5
Driven | 18.664 | 0 | 0.051 | 0.214 | 0 | 0
Drive | 29.935 | 0 | 0 | 0.086 | 0.001 | 0
Ring | 23.721 | 0.090 | 0 | 0 | 0 | 0.001

Table 5.1: Regression Equation Coefficients, significant at p < 0.01
Lastly, we use the equations for the operator learning curves, as described earlier in Table 4.2. These equations are displayed in Table 5.2; however, there is an additional column here for the "Experience" component of operator skill. In Table 4.2, Experience was expressed as a constant, based on operator records of time in the department and previous work in gear manufacturing. In Table 5.2, we express Experience using a linear function, based on the operator's experience level at the beginning and the end of the third quarter.
Operator Experience Level (1) Learning Curve and R2 obtained (2)
Social Interaction Analytical Skill
1 100 , 100
2 35.73+0.0298m 69.79
3 21.45+0.0298m 41.11
4 100.00 ,
5 35.73+0.0298m ,
6 35.73+0.0298m ,
7 100.00 ,
8 50.02+0.0298m ,
9 21.45+0.0298m ,
Table 5.2: Learning curves of operators’ skill components
Note (1): at the start and the end of the planning horizon, the experience level of the operator is known with certainty, based on his hiring date and his transfer date into Alpha. A linear equation is used to forecast the values in between these end points, as required by the Markov chain model.
Note (2): each equation is fitted based on 6 points. In the cases where the learning curve is represented by a constant, the operator's scores did not differ across the six points, or the operator had reached the maximum value of 100 and was capped.
The planning horizon considered is the fourth quarter and the unit of time is hours. Therefore, N = 1,440 for each operator, based on 24-hour days, 5-day weeks, and 12-week quarters. The model is developed using an MTTR of three hours: D = 3. Applying the recursive function developed in Section 5.2, and using the PHM, regression equations, and learning curves developed for Alpha, we calculate the expected production output of each operator on each Kappa machine (Table 5.3).
Operator # Expected Production Output
Driven machine Drive machine Ring machine
1 20,926 21,821 19,030
2 17,223 19,636 16,919
3 14,888 17,616 15,109
4 20,768 21,752 18,973
5 20,125 20,711 17,747
6 17,929 20,478 17,623
7 20,211 20,890 18,146
8 18,621 20,256 17,622
9 20,266 20,782 17,957
Table 5.3: Expected production output for each operator on each machine
As an application of our model, we perform an operator assignment optimization. We
take a mathematical programming approach and define the objective function as the total
revenue over the period. Revenue is maximized over the planning horizon by optimally
assigning operators to machines. The binary decision variable x_{jk} equals one when operator j is assigned to machine k (making product k) and is zero otherwise. This model is a simple OA problem, with each operator being assigned to one machine, and each machine having the correct number of operators, three, to run three shifts per day.
To date, the department has not used any decision criteria for its OA. Positions have
traditionally been filled on a random basis where operators are transferred to the
department and after the end of their training period, they are assigned to any machine not
fully staffed for all three shifts. Based on this random assignment scenario, we use the
expected operator output of our Markov chain approach, expressed in Table 5.3, to
calculate Alpha’s quarterly revenue. As an example, operators 1, 2, and 3 are assigned to
the Driven line. Therefore, in calculating the total revenue for the “Random” assignment,
the values used from Table 5.3 for operators 1, 2, and 3 are 20926, 17223, and 14888. The
values calculated for these three operators for the Drive and the Ring machines are ignored.
We can also use the results of our Markov chain approach in an IP model for an optimal
OA that maximizes Alpha’s revenue. In addition to comparing our optimal IP approach to
the “random” model, we can compare it to a simple skill-ranking model where the DM
assigns the operators to the machines based on their current skill level, without accounting
for learning curves. Under this simple skill-ranking assignment policy, if operator A's skill
scores are currently higher than operator B's, operator A is assigned to machine 1, whose
product has a higher sale price than machine 2's. As an example, if operators 1 and 2 have
the skill set {71,66,93} and {71,61,90}, respectively, and given the sale price of $110 and $90
for the Driven and the Ring gear, respectively, operator 1 would get assigned to the Driven
machine and operator 2 to Ring. The learning curves of the two operators are not
considered. Consequently, this may be a sub-optimal decision if operator 2 has a steep
positively-sloped learning curve compared to operator 1’s negatively sloped learning curve
for Social and a flat curve for Analytical. In this case, it may be better to assign operator 2 to
Driven because he would produce more over the entire length of the quarter.
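The pitfall of ranking on current skill alone can be seen with a minimal sketch: two hypothetical operators with linear learning curves. The skill values, slopes, and output-per-skill-point rate below are illustrative assumptions, not figures from the case study.

```python
# Hypothetical linear learning curves: weekly output is proportional to a skill
# score that drifts over a 13-week quarter (rate and slopes are assumptions).
def quarterly_output(s0, slope, weeks=13, rate=2.0):
    return sum(rate * (s0 + slope * w) for w in range(weeks))

a = quarterly_output(s0=71, slope=-0.5)   # higher current skill, declining curve
b = quarterly_output(s0=66, slope=1.5)    # lower current skill, steep learning
# Ranking on current skill picks operator A, yet B out-produces A over the quarter.
```

Operator A wins a current-skill ranking (71 vs. 66) but produces 1,768 units to B's 1,950 over the quarter, which is exactly the sub-optimality described above.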
We use the expected production output from our Markov chain model (Table 5.3) to
calculate the quarterly revenue. In obtaining the result of our approach, we solve the model
using the LINGO software. The result of our model is compared to the other assignment
policies (Table 5.4) and the significant additional revenues are evident.
Assignment policy        Quarterly revenue ($)    Difference from optimal OA ($)    (%)
Random                   16,333,880               565,370                           3.5
Simple skill ranking     16,197,900               701,352                           4.3
Optimal assignment       16,899,250               -                                 -
Worst case               16,090,980               808,270                           5.0
Table 5.4: Comparing Alpha’s revenue under various OA approaches
We re-run our Markov chain model, this time ignoring the effect of learning. Skill scores
are treated as a flat value across the quarter. The operator assignments are different than
when learning is considered. System revenue is calculated to be $16,532,620; this is
$366,630, or 2.2%, less revenue over the quarter. This is a relatively large difference in
revenue, and it shows that model results improve when more information, in this case the
learning effects, is included in the model.
A further use of our model is to determine the effect of providing additional training
to an operator before allowing him to work independently on a machine. We
perform this sensitivity analysis by repeating the expected uptime calculation for the lowest
skilled operator, starting the planning horizon with an analytical skill score that is 10%
higher. This results in additional quarterly revenue of up to $54,158, or as low as $8,400,
depending on which machine the operator is assigned to. This information can be used as a
cost-benefit analysis tool on providing machine-specific training to the operators.
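The reported revenue gains bracket a simple cost-benefit check. The training cost below is an assumed figure, used only to show the form of the comparison.

```python
# Quarterly revenue gains from a 10% higher starting analytical score, as
# reported above; the training cost is an assumed illustrative figure.
low_gain, high_gain = 8400.0, 54158.0
training_cost = 15000.0

# Whether the training pays for itself within one quarter depends on which
# machine the operator would be assigned to.
pays_off_best = high_gain > training_cost
pays_off_worst = low_gain > training_cost
```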
5.3.1. Model Validation
Among the many applications of our model in Chapter 5, the particular one we have
discussed in this case study is optimal operator assignment. This is also the goal of the
model discussed in Chapter 4, and since both models are applied to the same data set, it
would be interesting to compare the results of the two approaches. Table 5.5 presents the
results of the two approaches for various operator assignments. The actual operator
assignments for the Random, Simple, and Optimal policies are the same for the two
approaches. The assignments are different for the worst case scenario. But regardless of
the actual operator assignments, applying the two approaches to the same data set
provides very similar revenue forecasts and this is promising. This comparison of one
approach to the other serves to validate each approach.
Assignment policy        Chapter 5 quarterly revenue ($)    Chapter 4 quarterly revenue ($)    Difference
Random                   16,333,880                         16,232,450                         0.6%
Simple skill ranking     16,197,900                         16,743,970                         -3.4%
Optimal assignment       16,899,250                         17,027,400                         -0.8%
Worst case               16,090,980                         16,077,900                         0.1%
Table 5.5: Comparing obtained revenue under approaches of Chapters 4 and 5
Upon going through the empirical work presented in Chapters 4 and 5, one may wonder
about the usefulness of Chapter 4 when optimal operator assignment can be achieved with
Chapter 5’s Markov chain model as well. The approach discussed in Chapter 4 uses
deterministic models and, as such, is less complex. In comparison, the Markov chain model
in Chapter 5 is quite complex, calculating the probability of failure at every
interval along the planning horizon. A standard computer takes about 10 hours to produce
the result of the case study presented in Chapter 5. In the end, both approaches provide
the DM with the optimal operator assignment; but the usage of one approach over the
other is a trade-off between accuracy and simplicity.
For the purposes of validating our model in this chapter, we can also take on a “data-
split” approach. Previously, we had looked at the entire data set covering January 1 to
October 9. The PHM, the learning curve equations, and the regression equations were
obtained from the data covering this period. For the data split exercise, we consider the
period between January 1 and August 15. Using this shorter period, we obtain new PHM
and regression equations and use them to re-run the Markov chain model. We can then
compare the results with the actual production results for the period between August 18 and
October 9, which was not used in the model development.
Using the data for the period January 1 to August 15, the following new PHM and
regression equations are obtained. In order to keep the model simple, interaction terms are
not considered.
[PHM equation: the hazard rate expressed in terms of seven covariates, including Social and Analytical.]
[Regression equations: one per machine, forecasting output in terms of experience, social, and analytical skill.]
As can be expected, the models based on seven months of data are slightly different from the
equations obtained using the entire nine months of the data set (with no interaction terms
considered), which are the following:
[Nine-month PHM and regression equations, with the same covariates as above.]
We do not randomize the data and consider all but the last eight weeks of the data,
equivalent to about 75% of the total data set. We still want to consider the effect of
learning and, as a result, we cannot randomize the data by date. Furthermore, since the skill
assessments occur in January, April and July, the learning curve equations do not need to be
re-established.
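The chronological-split idea can be sketched in a few lines: fit on the first 75% of a time-ordered series (no shuffling, so the learning trend is preserved) and score on the held-out tail. The synthetic data and one-variable least-squares fit below are illustrative stand-ins for the PHM and regression refits.

```python
# Chronological 75/25 split, with a simple least-squares fit standing in for the
# model refit. Data are synthetic: weekly output rising with experience plus a
# small periodic wobble (all numbers are illustrative).
data = [(w, 9000 + 25 * w + (60 if w % 3 == 0 else -40)) for w in range(36)]
cut = int(0.75 * len(data))                 # fit on the first 27 weeks only
train, test_set = data[:cut], data[cut:]

n = len(train)
sx = sum(w for w, _ in train); sy = sum(y for _, y in train)
sxx = sum(w * w for w, _ in train); sxy = sum(w * y for w, y in train)
slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
intercept = (sy - slope * sx) / n

# Mean absolute percentage error on the unused final quarter of the data
mape = sum(abs(y - (intercept + slope * w)) / y for w, y in test_set) / len(test_set)
```

Randomizing by date before splitting would leak the later, higher-skill observations into the fit, which is why the split here (as in the text) is strictly chronological.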
Looking at the actual data records, we determine the number of hours worked by each
operator and use the equivalent number of hours obtained from the Markov chain model.
Based on these operator hours, the production output is presented in Table 5.6:
Operator    Driven              Drive               Ring
            Model     Actual    Model     Actual    Model     Actual
1           11,479    11,848    -         -         -         -
2           9,288     10,094    -         -         -         -
3           8,623     9,056     -         -         -         -
4           -         -         11,013    11,851    -         -
5           -         -         11,025    12,440    -         -
6           -         -         10,302    9,899     -         -
7           -         -         -         -         8,603     8,498
8           -         -         -         -         9,036     8,911
9           -         -         -         -         9,037     8,925
Table 5.6: Comparing model results with actual production volumes
During the regular course of production, the operators were only assigned to one
machine and did not work on the others. As an example, an operator working on the Ring
machine worked on the Ring machine for the entire 9-month period. Therefore, when we
compare the Markov chain results with the actual ones, we are left with many blank cells.
But the nine comparison results we have obtained are quite promising. For example, in
comparing Operator 1’s forecasted versus actual production output, the difference is only
3.2%. When we compare the two sets of model versus actual values, we obtain a Pearson R² of
0.917, significant at the 0.01 level (two-tailed).
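The strength of agreement can be recomputed directly from the nine model/actual pairs reported in Table 5.6:

```python
from math import sqrt

# Model vs. actual production volumes for the nine operators (Table 5.6).
model  = [11479, 9288, 8623, 11013, 11025, 10302, 8603, 9036, 9037]
actual = [11848, 10094, 9056, 11851, 12440, 9899, 8498, 8911, 8925]

n = len(model)
mm = sum(model) / n
ma = sum(actual) / n
cov = sum((x - mm) * (y - ma) for x, y in zip(model, actual))
r = cov / sqrt(sum((x - mm) ** 2 for x in model)
               * sum((y - ma) ** 2 for y in actual))
```

The resulting correlation is strongly positive, in line with the significant Pearson statistic quoted above.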
5.4. Concluding Remarks and Future Work
We have developed a Markov chain approach to forecast the production output of a
human-machine system, considering human-related factors as well as the learning of the
operators. A planning horizon is considered and discretized; each time interval can have
multiple states, for which a state space is defined. Through the variables defined to
represent the state space, we can uniquely identify each state. The probability of each state
can be calculated from the states immediately before it, regardless of what has occurred
previously. At the end of each time interval, the machine’s status may proceed in one of
two ways. If it is in repair following a failure, it will remain in repair for the duration of the
repair period. If the machine is not in repair, two outcomes are possible: failure and
survival; there is a transition probability associated with each.
We calculate the probability of failure using a proportional hazards model that calculates
the hazard rate based on the machine working age as well as operator-related covariates.
With time, an operator gains experience on the machine and his expertise is likely to
improve, leading to the possible reduction of human error on machine operation. Using the
learning curves for each operator, we calculate the appropriate hazard rate for each level of
cumulative operational periods.
Using a recursive formula and a Markov chain approach, we proceed from one time
interval to the next throughout the entire planning horizon. At the end of the horizon, all
the possible states, along with their probabilities, are used to calculate the expected uptime
over the entire horizon. This quantity, along with the production output for each state, is
used to calculate the expected production output over the entire planning horizon. The
production output for each state is calculated through regression equations, forecasting the
production output in terms of operator skills. Once again, the learning curves are
considered and the appropriate operator skill values are used.
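A stripped-down version of this recursion can be sketched as follows. The baseline hazard, the skill coefficient, the linear learning curve, and the assumption that a repair returns the machine to as-good-as-new are all illustrative choices, not the case-study models.

```python
import math

HORIZON = 90       # number of time intervals in the planning horizon
REPAIR = 3         # fixed repair duration, in intervals

def interval_failure_prob(age, skill):
    """P(failure in the next interval | machine survived to this working age)."""
    h0 = 0.02 * (1.0 + age / 50.0)          # illustrative increasing baseline hazard
    h = h0 * math.exp(-0.01 * skill)        # PHM-style covariate effect of skill
    return 1.0 - math.exp(-h)

# State: ("up", working_age) or ("down", intervals_of_repair_remaining)
probs = {("up", 0): 1.0}
expected_uptime = 0.0
for t in range(HORIZON):
    skill = 60.0 + 0.1 * t                  # illustrative linear learning curve
    nxt = {}
    for (status, x), p in probs.items():
        if status == "down":                # repair continues, then machine restarts as new
            s = ("down", x - 1) if x > 1 else ("up", 0)
            nxt[s] = nxt.get(s, 0.0) + p
        else:                               # machine is up: count uptime, then fail or survive
            expected_uptime += p
            q = interval_failure_prob(x, skill)
            nxt[("down", REPAIR)] = nxt.get(("down", REPAIR), 0.0) + p * q
            nxt[("up", x + 1)] = nxt.get(("up", x + 1), 0.0) + p * (1.0 - q)
    probs = nxt
```

Weighting the per-state uptime probabilities by a per-state output rate, instead of simply accumulating them, turns the same recursion into the expected-production forecast.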
Our work can have several applications. To demonstrate our model and one of its
possible applications, we have discussed the case study of operator assignment
optimization in a manufacturing organization. We calculate the expected output of each
operator on every machine and use these quantities as input in the objective function of a
linear programming model. We optimize this assignment problem to maximize system
revenue. This maximized revenue is compared to revenue obtained based on the current
random assignment practiced in the company as well as an assignment solely based on the
current level of skill, disregarding operator learning.
Other than the general future work to consider additional HR factors, a possible future
work stemming from the work in this chapter is to consider manufacturing environments
where there are different machine types. This can lead to an additional complexity of
operators having different learning curves on different machines. As the operators work on
different machines and learn new tasks, there may exist a “forgetting” effect for the tasks
they have learned on the machines they have worked on previously. Operator learning and
forgetting is a concept that has been studied in the literature and it will be interesting for us
to consider it within our context.
Another future work is to extend the assumption of a fixed duration for all repairs and
have random repair durations as well. There can be a failure distribution to draw from. In
doing so, we introduce greater complexity in the model but make the model more realistic.
Alternatively, we can stay with the fixed duration for a repair, but consider several repair
scenarios. We can select the top five most common failure modes, have an MTTR for each,
add one more dimension to the Markov chain state for the type of failure, and implement
the same type of analysis described in this chapter. Similar to the other aforementioned
future works, the consideration of several failure modes will yield a more realistic analysis
scenario, resulting in a more accurate estimate of production output.
Similar to Chapter 4, another important future work can be determination of the length
of the planning horizon to consider. Once again, in this chapter, we leave the DM to
determine the length of the planning horizon. In the empirical study, the duration is taken
to be three months, aligned with the quarterly operator assessment cycles. One can work
on providing a systematic tool for choosing an optimal length for the planning horizon
based on machine performance or operator learning.
An additional interesting future work would be to loosen the assumption that operator
learning only takes place while the machine is running. It would be interesting to consider
an environment where operator-driven reliability is practiced and that the operators are
involved in maintenance work. In such a scenario, the operators may gain expertise in
machine operation during the up periods and learn troubleshooting skills during the down
periods. This consideration results in operators gaining various kinds of expertise at all
times when assigned to a machine. The outcome of this scenario can be compared to the
scenario we have considered in this chapter where learning only occurs while the machine
is up.
6. CONCLUSION
In this concluding chapter, the central ideas and contributions of the dissertation are
brought together into a single body. We then provide the reader with a recap of the
discussions.
6.1. Central Ideas and Contributions
In the introductory chapter, we presented the following figure as the general outline of
the discussions in this dissertation (Figure 1.1).
Our aim has been the analysis and improvement of the performance of human-machine
systems. The central idea of our work is to: 1) develop mathematical models to determine
what, if any, effect exists between various human-related factors and system performance;
and 2) use the models as a systematic tool to make the optimal decisions, given the
decision criteria, such as revenue or system availability.
Figure 1.1: General framework of topics in the dissertation
[Figure: quantified human-related factors feed three analyses of the performance of human-machine systems: short-term failure risk analysis and intervention, optimal operator assignment, and long-term production forecasting considering the risk of failure and operator learning.]
Our first contribution is to present a novel method to quantify the effects of human-
related factors on the risk of failure in manufacturing industries. There is a gap in the
literature: performance measurement models that incorporate human-related risk
assessments are lacking, and we aim to fill this gap. We choose to work with the proportional
hazards model, a common and versatile tool in condition-based maintenance, and include
human-related factors as covariates. Using the PHM in this manner provides us with a
predictive technique for human reliability.
When failures can be caused by operators, the decision-maker must intervene to
mitigate operator-related risk. Numerous intervention methods are possible; our
next contribution is to develop a revenue model that provides the decision-maker with a
systematic tool to perform a cost-benefit analysis, balancing the advantage of risk
reduction against the direct cost of the intervention method. As a result, the DM can
choose the revenue-maximizing intervention method for reducing the probability of failure
stemming from the operators.
The cost-benefit analysis is based on a revenue model that uses the expected uptime
and the probability of machine failure, given the human-related factors. The revenue model
can be used to calculate the failure risk threshold, above which positive revenue is not
expected. Therefore, this risk threshold also serves as a profitability boundary. We use this
boundary to calculate backwards and determine the minimum levels necessary for the
human-related factors. This is another important contribution of our work. An example of
its managerial impact is the possibility of its usage in the certification process for novice
operators before they are released from the training facility and assigned a machine to
operate independently.
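As a back-of-envelope illustration of the profitability boundary, assume a simple expected-profit model and a single-covariate PHM-style failure probability. All numbers and functional forms below are assumptions for illustration, not the thesis's models.

```python
import math

# Expected profit for one shift: profit(p) = R*(1-p) - C*p, where p is the
# failure probability (R and C are assumed illustrative figures).
R = 5000.0                     # revenue from a failure-free shift
C = 20000.0                    # total cost incurred by a failure
p_star = R / (R + C)           # risk threshold where expected profit hits zero

# Assumed PHM-style failure probability p(s) = 1 - exp(-H0 * exp(-b*s)) in terms
# of a single skill score s; inverting at p_star gives the minimum certifiable skill.
H0, b = 0.8, 0.05
s_min = -math.log(-math.log(1.0 - p_star) / H0) / b
```

Setting expected profit to zero gives the risk threshold, and inverting the failure-probability curve at that threshold yields the minimum skill score at which a novice operator could be certified to work independently.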
We then expand our focus on system performance by analyzing machine uptime in
addition to the efforts thus far on downtime reduction. There exists a relationship between
operator-related characteristics and machine performance. As an example, a more skilled
operator is likely to have a higher output on the machine. We acknowledge this relationship
to be dynamic as a result of operator learning. We present a method to forecast the
production output, considering this relationship. This Markov chain-based method
incorporates the previously described contributions, and builds on them to achieve the aim
of forecasting the output of a stochastic system where uptime and downtime are both
dependent on operator factors and learning.
This contribution can have significant managerial impact, as it enables a
return-on-investment analysis of training cost versus additional revenue. The
additional revenue would be gained as a result of more output produced by the operators
whose skill is improved due to training. This is an important contribution and can add value
in the interface of operations management and human resource management.
The best way a company can use the research presented in this dissertation would be to
apply the framework of Chapter 4 first to optimally assign the operators. Once the
operators are optimally assigned to the machines, the company would apply the framework
of Chapter 3 to deal with the HR risk present in the system in the short term. The DM would
then apply the model developed in Chapter 5 to forecast each operator’s output for
planning purposes on an individual basis, such as a cost-benefit analysis of being enrolled in
a training program, or on a system-level, such as the ordering of production material. In
whichever order the framework is applied, the work presented in this dissertation allows a
decision maker to include the role of the human participants, often one of the most
important resources of an organization, in decision making to improve system performance.
6.2. Concluding Remarks and Recap
When we are interested in improving the uptime of a system, whether this entails
increasing the output directly or decreasing the downtime, all contributing aspects of the
system should be analyzed. Most systems function as a combination of human participants
and hardware. As such, it is necessary to include the role of the human participants in any
uptime improvement analysis.
The work presented in this thesis has focused on the risk of system failure, stemming
from the human participants. The human participants can have various roles within the
system. Such participants may be managers devising strategy, schedulers in charge of
production plans or material purchasing, or maintenance tradespeople. The only type of
human participant considered in this
dissertation is those with a direct role to play on system failure. In spite of the applicability
of the ideas in this dissertation to different industries, there has been a stronger emphasis
on the manufacturing industry. This is especially evident in the case study discussed
throughout. As such, the example that can be provided for the type of human participant
considered in this dissertation is a machine operator whose errors can result directly in a
machine failure.
Despite the multidisciplinary nature of the work presented in this dissertation, the
dominant field is still operations research. Therefore, statistical and mathematical models
for performance optimization have a major role to play. To incorporate human-related
factors into these models, we must first quantify them. The quantification
process may be a challenge and various methods to quantify factors that have traditionally
been thought of as subjective, such as skill, have been discussed in Chapter 2. A few of the
well-known human reliability techniques, such as THERP and CREAM, are discussed as
alternatives before we discuss our own methodology, which is based on the framework of the
Critical Incident Technique and uses system experts to assess the skills of the operators.
Evaluating performance based on expert judgement is prevalent in the prior literature;
where experts are knowledgeable and available, the results are reliable.
In Chapter 3, we perform a failure risk analysis for the short term, such as a single shift, and
discuss intervention methods a decision-maker can take in order to reduce operator-
initiated risk. The tool used as a model to calculate the risk of failure is the proportional
hazards model. This tool is used to build a model that aids the decision maker in selecting
the intervention method that is most beneficial to the system. Each intervention method
can reduce the risk by some degree and has a direct cost associated with it. The model we
develop presents the decision maker with a way to perform a cost-benefit analysis. Another
usage of the model is for the calculation of the minimum level of various factors in order to
ensure system profitability. In the context of our discussions, the decision maker may use
the model to determine a certain level of skill that an operator should be “certified” at
before he/she can start to work independently in the production environment.
Chapter 4 presents an approach for the optimal assignment of operators to machines. In
doing so, there are two main factors we consider. The first is the degree of sensitivity of
each machine to human-related factors. The second is operator learning. The approach is
deterministic and the probability of machine failure is not considered; operators are
assigned based on their current and forecasted characteristics. For each machine,
production output should be forecasted in terms of the factors affecting the operator
working on the machine. Of the various methods that may be available, we choose to use
regression analysis. To capture the effect of learning, we use historical data to build
learning curve equations for the operators. We can then incorporate the regression
equations and the learning curves into a unified model that can forecast the output of each
operator on each machine. The values from these outputs are used in a linear programming
model to solve the assignment problem. The framework we develop is applied to a case
study to show the savings that can be realized if our model is implemented.
The work presented in Chapter 5 is used to forecast the production output over a planning
horizon. The planning horizon considered is long term and we divide it into small time
intervals. The risk of machine failure is analyzed over each interval. Similar to
Chapter 3, the proportional hazards model is used for this failure analysis. The main tool
used to develop the framework of the chapter’s model is a Markov chain. The model
provides the decision maker with a prediction of the production output of each operator on
each machine. The decision maker can use this model directly for planning purposes or
he/she can perform sensitivity analysis to determine what benefits may be gained by
making changes to the current state. An example of a direct usage is optimal operator
assignment. An example of sensitivity analysis is forecasting a production output increase as
a result of a training program’s skill improvement.
REFERENCES
1. Adamides, E. D., Stamboulis, Y. A., Valeris, A. G. (2004). Model-based Assessment of
Military Aircraft Engine Maintenance Systems. Journal of the Operational Research
Society, 55: 957-967.
2. Argote, L., Insko, C. A., Yovetich, N., Romero, A. A. (1995). Group Learning Curves:
the Effects of Turnover and Task Complexity on Group Performance. Journal of
Applied Social Psychology, 25(6): 512-529.
3. Allen, T. J. (1977). Managing the Flow of Technology. MIT Press: Boston, MA, USA.
4. Ash, R.A., Levine, E.L., (1985). Job Applicant Training and Work Experience
Evaluation: An Empirical Comparison of Four Methods. Journal of Applied
Psychology, 70: 572-576.
5. Baines, T. S., Asch, R., Hadfield, L., Mason, J. P., Fletcher, S., and Kay, J. M. (2005).
Towards a Theoretical Framework for Human Performance Modeling Within
Manufacturing Systems Design. Simulation Modeling Practice and Theory, 13: 486–
504.
6. Barroso, M., and Wilson, J. (1999). HEDOMS – Human Error and Disturbance
Occurrence in Manufacturing Systems: Towards Development of an Analytical
Framework. Human Factors and Ergonomics in Manufacturing, 9 (1): 87–104.
7. Bendig, A.W., (1952a). A Statistical Report on a Revision of the Miami Instructor
Rating Sheet. Journal of Educational Psychology, 43: 423-429.
8. Bendig, A.W., (1952b). The Use of Student Rating Scales in the Evaluation of
Instructions in Introductory Psychology. Journal of Educational Psychology, 43: 167-
175.
9. Bendig, A.W., (1953). Reliability of Self-Ratings as a Function of the Amount of
Verbal Anchoring and of the Number of Categories on The Scale. Journal of Applied
Psychology, 37: 38-41.
10. Bendoly, E., Prietula, M., (2008). In “the zone”: The role of evolving skill and
workload on motivation and realized performance in operational tasks. Journal of
Operations & Production Management, 28 (11-12), 1130–1152.
11. Biskup, D., (2008). A State-of-the-art Review on Scheduling with Learning Effects.
European Journal of Operational Research, 188: 315-329.
12. Blanks, H., (2007). Quality and Reliability into the Next Century. Quality and Reliability
Engineering International, 10(3): 179-184.
13. Blau, F. D., Kahn, L. M. (1996). International Differences in Male Wage Inequality:
Institutions versus Market Forces. Journal of Political Economy, 104(4): 791-837.
14. Bluhm, K. (2001). Exporting or Abandoning the `German Model'?: Labour Policies of
German Manufacturing Firms in Central Europe. European Journal of Industrial
Relations, 7(2): 153-173.
15. Blumenfeld, D. E., and Inman, R. R. (2009). Impact of Absenteeism on Assembly Line
Quality and Throughput. Production and Operations Management, 18 (3): 333-343.
16. Blumenfeld, P. C., Marx, R. W., Soloway, E., Krajcik, J., (1996). Learning with Peers:
From Small Group Cooperation to Collaborative Communities. Educational
Researcher, 25(8): 37-40.
17. Borman, W. C., Dunette, M. D., (1974). Behavior-based Versus Trait-Oriented
Performance Ratings: An Empirical Study. Journal of Applied Psychology, 60: 561-
565.
18. Bowerman, B. L., O’Connel, R. T., (1990). Linear Statistical Models: An Applied
Approach (2nd edition). Duxbury: Belmont, CA, USA.
19. Brown, A. L., Palincsar, A. S., (1989). Guided, cooperative learning and individual
knowledge acquisition. In L. B. Resnick (Ed.), Knowing, learning, and instruction:
Essays in honor of Robert Glaser. Erlbaum: Hillsdale, NJ, USA.
20. Bryson, A., Forth, J., (2007). Are There Day of the Week Productivity Effects? Centre
for Economic Performance, Manpower Human Resources Lab: Document number
MHRLdp004, London School of Economics.
21. Bubb, H., (2005). Human Reliability: A Key to Improve Quality in Manufacturing.
Human Factors and Ergonomics in Manufacturing, 15: 353-363.
22. Burkolter, D., Kluge, A., Sauer, J., Ritzmann, S., (2009). The Predictive Qualities of
Operator Characteristics for Process Control Performance: the Influence of
Personality and Cognitive Variables. Ergonomics, 52 (3): 302-311.
23. Burnham, K. P., Anderson, D. R., (2004). Multimodel Inference: Understanding AIC
and BIC in Model Selection. Sociological Methods and Research, 33: 261-304.
24. Butterfield, L. D., Borgen, W. A., Amundson, N. E., Maglio, A. T. (2005). Fifty Years of
the Critical Incident Technique: 1954-2004 and Beyond. Qualitative Research, 5: 475-
496.
25. Cacciabue, P. C. (2005). Human Error Risk Management Methodology for Safety
Audit of a Large Railway Organisation. Applied Ergonomics, 36(6): 709-718.
26. Cacciabue, P.C., (2000). Human Factors Impact on Risks Analysis of Complex
Systems. Journal of Hazardous Materials, 71: 101-116.
27. Carr, M. J., Christer, A. H. (2003). Incorporating the Potential for Human Error in
Maintenance Models. Journal of the Operational Research Society, 54(12): 1249-
1253.
28. Castanier, B., Berenguer, C., Grall, A., (2003). A Sequential Condition-based
Repair/Replacement Policy with Non-periodic Inspections for a System Subject to
Continuous Wear. Applied Stochastic Models in Business and Industry. 19 (4), 327-
347.
29. Centrone, D., Kiassat, C., Garetti, M., Banjevic, D., Jardine, A. K. S. (2010).
Proportional Hazards Model: A Valuable Methodology for Sustainable
Manufacturing. Proceedings of Maintenance for Sustainable Manufacturing (M4SM)
conference, Verona, Italy. 51-56.
30. Chang, Y. H., Wang, Y. C., (2010). Significant Human Risk Factors in Aviation
Maintenance Technicians. Safety Science, 48: 54-62.
31. Cohen, J. (1990). Things I have learned (so far). American Psychologist, 45 (12), 1304-
1312.
32. Cohen, J. (1994). The earth is round (p < 0.05). American Psychologist, 49 (12), 997-
1003.
33. Conway, J. M., Huffcutt, A. I. (1997). Psychometric Properties of Multisource
Performance Ratings: A Meta-Analysis of Subordinate, Supervisor, Peer, and Self-
Ratings. Human Performance, 10(4): 331-360.
34. Crowder, M., (2012). Multivariate Survival Analysis and Competing Risks. Taylor &
Francis Group, Boca Raton, FL, USA.
35. Davis, P. J. (2006). Critical Incident Technique: A Learning Intervention for
Organizational Problem Solving. Development and Learning in Organizations, 20(2):
13-16.
36. Davis, D. A., Mazmanian, P. E., Fordis, M., Van Harrison, R., Thorpe, K. E., Perrier, L.,
(2006). Accuracy of Physician Self-assessment Compared With Observed Measures
of Competence. Journal of American Medical Association, 296(9): 1094-1102.
37. Dhillon, B., Liu, Y. (2006). Human Error in Maintenance: A Review. Journal of Quality
in Maintenance Engineering, 12(1): 21-36.
38. Dutton, J. M., Thomson, A., (1984). Treating Progress Functions as a Managerial
Opportunity. The Academy of Management Review, 9(2): 235-247.
39. Elmaraghi, W. H., Nada, O. A., Elmaraghi, H. A., (2008). Quality Prediction for
Reconfigurable Manufacturing Systems via Human Error Modelling, 12(5): 584-598.
40. Embrey, D.E. (1986). SHERPA: A Systematic Human Error Reduction and Prediction
Approach. International Topical Meeting on Advances in Human Factors in Nuclear
Power Plant Systems, Knoxville, Tennessee.
41. Emrouznejad, A., Zerafat Angiz, L. M., Ho, W., (2012). An Alternative Formulation for
the Fuzzy Assignment Problem. Journal of the Operational Research Society. 63(1):
59-63.
42. Feiring, B.R., (1993). A Model Generation Approach to the Personnel Assignment
Problem. Journal of Operational Research Society, 44(5): 503-512.
43. Fiedler, F.E., (1970). Leadership Experience and Leader Performance: Another
Hypothesis Shot To Hell. Organizational Behavior and Human Performance, 5: 1-14.
44. Field, A., (2005). Discovering Statistics Using SPSS (2nd edition). Sage Publications,
London, UK.
45. Flanagan, J., (1954). The Critical Incident Technique. Psychology Bulletin, 51: 327-
358.
46. Fritzsche, R., (2012). Cost Adjustment for Single Item Pooling Models Using a
Dynamic Failure Rate: A Calculation for the Aircraft Industry. Transportation
Research Part E, 48: 1065-1079.
47. Gasmi, S., Love, C. E., Kahle, W., (2003). A General Repair, Proportional-Hazards
Framework to Model Complex Repairable Systems. IEEE Transactions on Reliability.
52(1), 26-32.
48. Glick, W.H., Jenkins, G.D., Jr., Gupta, N., (1986). Method Versus Substance: How
Strong Are Underlying Relationships between Job Characteristics and Attitudinal
Outcomes? Academy of Management Journal, 29: 441-464.
49. Haas, C.T., Morton, D.,P., Tucker, R.L., Gomar, J.E., Terrien, R.K., (2000). Assignment
and Allocation Optimization of Partially Multiskilled Workforce. Center for
Construction in Industry Studies, Report 13.
50. Hancock, P. A., (1986). Stress and adaptability. G.R.J. Hockey, A.W.K. Gaillard, M.G.H.
Coles, eds., Energetics and Human Information Processing. Martinus Nijjhoff,
Dordrecht, The Netherlands, 243–251.
51. Hannaman, G. W., Spurgin, A. J., (1984). Systematic Human Action Reliability
Procedure (SHARP). EPRI NP-3583, Project 2170-3, Interim Report, NUS
Corporation, San Diego, CA, US.
52. Harvey, R.J., Billings, R.S., Nilan, K.J., (1985). Confirmatory Factor Analysis of the Job
Diagnostic Survey; Good News and Bad News. Journal of Applied Psychology, 70:
461-468.
53. Hollnagel, E., (1998). Cognitive Reliability Error Analysis Method (CREAM). Elsevier
Science: New York, USA.
54. Horberry, T. J., Burgess-Limerick, R., Steiner, L., (2010). Human Factors for the
Design, Operation, and Maintenance of Mining Equipment. Taylor and Francis
Group: Boca Raton, FL, USA.
55. Hsie, M., Hsiao, W., Cheng, T., and Chen, H., (2009). A Model Used in Creating a
Work-Rest Schedule for Laborers. Automation in Construction, 18(6): 762–769.
56. Hunter, J.E., Hunter, R.E., (1984). Validity and Utility of Alternative Predictors of Job
Performance. Psychological Bulletin, 96: 72-98.
57. Iakovou, E., Ip, C. M., Koulamas, C., (1999). Throughput-dependent Periodic
Maintenance Policies for General Production Units. Annals of Operations Research,
91: 41-47.
58. Jardine, A. K. S., Ralston, P., Reid, N., Stafford, J., (1989). Proportional Hazards
Analysis of Diesel Engine Failure Data. Quality and Reliability Engineering
International, 5(3): 207-216.
59. Jardine, A. K. S., Banjevic, D., (2005). Interpretation of Inspection Data Emanating
from Equipment Condition Monitoring Tools: Method and Software, in
Mathematical and Statistical Methods in Reliability, Armijo, Y. M, (Ed) World
Scientific Publishing Company: Singapore.
60. Jardine, A. K. S., Buzacott, J. A., (1985). Equipment Reliability and Maintenance.
European Journal of Operational Research, 19 (3): 285-296.
61. Jardine, A. K. S., Banjevic, D., Makis, V., (1997). Optimal Replacement Policy and
Structure of Software for Condition-Based Maintenance. Journal of Quality in
Maintenance Engineering, 3: 109-119.
62. Jenkins, G.D., Taber, T.A., (1977). A Monte Carlo Study of Factors Affecting Three
Indices of Composite Scale Reliability. Journal of Applied Psychology, 62: 392-398.
63. Kao, C., Lee, H.T., (1996). An Integration Model for Manpower Forecasting. Journal
of Forecasting, 15: 543-548.
64. Karaulova, T., Pribytkova, M., (2009). Reliability Prediction for Man-Machine
Production Lines. DAAAM International Scientific Book. DAAAM International
Vienna, 487-500.
65. Kariuki, S. G., Loewe, K., (2007). Integrating Human Factors into Process Hazard
Analysis. Reliability Engineering and System Safety, 92: 1764–1773.
66. Kiassat, C., Safaei, N., (2009). Integrating Human Reliability Analysis into a
Comprehensive Maintenance Optimization Strategy. Proceedings of the World
Congress on Engineering Asset Management, Athens, Greece. 561-566.
67. Kim, J. W., Jung, W., (2003). A Taxonomy of Performance Influencing Factors for
Human Reliability Analysis of Emergency Tasks. Journal of Loss Prevention in the
Process Industries, 16, 479-495.
68. Kim, M. C., Seong, P. H., Hollnagel, E., (2004). A Probabilistic Approach for
Determining the Control Mode in CREAM. Reliability Engineering and System Safety,
91, 191-199.
69. Knauth, P. (1996). Designing Better Shift Systems. Applied Ergonomics, 27(1): 39–44.
70. Kolarik, W. J., Woldstad, J. C., Lu, S., (2004). Human Performance Reliability: On-line
Assessment Using Fuzzy Logic. IIE Transactions, 36(5): 457-467.
71. Kostreva, M., McNelis, E., Clemens, E. (2002). Using a Circadian Rhythms Model to
Evaluate Shift Schedules. Ergonomics, 45(11), 739–763.
72. Lam, T. C. M., Klockars, A. J. (1982). Anchor Point Effects on the Equivalence of
Questionnaire Items. Journal of Educational Measurement, 19(4): 317-322.
73. Lamond, N., Dorian, J., Burgess, H. J., Holmes, A. L., Roach, G. D., McCulloch, K.,
Fletcher, A., Dawson, D., (2004) Adaptation of Performance During a Week of
Simulated Night Work. Ergonomics, 47(2): 154-165.
74. Landy, F.J., Trumbo, D.A., (1975). Psychology of Work Behavior. Dorsey Press:
Homewood, IL, USA.
75. Leopairote, K. (2003). Policies for Multi-Skilled Worker Selection, Assignment, and
Scheduling. Doctoral Dissertation, University of Wisconsin-Madison.
76. Levine, E.L., Ash, R.A., Bennett, N., (1980). Evaluation and Use of Four Job Analysis
Methods for Personnel Selection. Journal of Applied Psychology, 65: 524-535.
77. Levine, E. L., Ash, R. A., Hall, H., Sistrunk, F., (1983). Evaluation of Job Analysis
Methods by Experienced Job Analysts. Academy of Management Journal, 26(2): 339-
348.
78. Li, C-L., Cheng, T. C. E., (1994). An Economic Production Quantity Model with
Learning and Forgetting Considerations. Production and Operations Management
3(2): 118-132.
79. Li, N., Li, X. L., (2000). Modeling Staff Flexibility: A Case of China. European Journal of
Operational Research, 124(2): 255-266.
80. Lissitz, R.W., Green, S.B., (1975). Effect of the Number of Scale Points on Reliability:
A Monte Carlo Approach. Journal of Applied Psychology, 60: 10-13.
81. Lugtigheid, D., Banjevic, D., Jardine, A. K. S. (2004). Modelling Repairable System
Reliability with Explanatory Variables and Repair and Maintenance Actions, Journal
of Management Mathematics, 15(2): 89-110.
82. Malhotra, M. K., Fry, T. D., Kher, H. V., Donohue, J. M., (1993). The Impact of
Learning and Labor Attrition on Worker Flexibility in Dual Resource Constrained Job
Shops. Decision Sciences, 24(3): 641-663.
83. Nembhard, D. A., Norman, B. A., (2002). Worker Efficiency and Responsiveness in
Cross-Trained Teams. Technical Report 02-02, Department of Industrial Engineering,
University of Pittsburgh.
84. Onkham, W., Karwowski, W., Ahram, T. Z., (2012). Economics of Human
Performance and Systems Total Ownership Cost. Work, 41: 2781-2788.
85. Ornstein, M., (1998). Questionnaire Design. Current Sociology, 46(4): 7-47.
86. Parker, S. K., Wall, T. D., Cordery, J. L., (2001). Future Work Design Research and
Practice: Towards an Elaborated Model of Work Design. Journal of Occupational and
Organizational Psychology, 74: 413–440.
87. Peng, Y., Dong, M., (2011). A Prognosis Method Using Age-Dependent Hidden Semi-
Markov Model for Equipment Health Prediction. Mechanical Systems and Signal
Processing, 25(1): 237-252.
88. Philipose, S., (1993). R&D Manpower Forecasting For Chemical Industries in India.
IEEE Transactions on Engineering Management, 40: 187-191.
89. Pillay, A., Wang, J., (2003). Modified Failure Mode and Effects Analysis Using
Approximate Reasoning. Reliability Engineering and System Safety, 79: 69-85.
90. Podsakoff, P.M., Organ, D.W., (1986). Self-reports in Organizational Research:
Problems and Prospects. Journal of Management, 12(4): 531-544.
91. Quinones, M.A., Ford, J.K., Teachout, M.S., (1995). The Relationship between Work
Experience and Job Performance: a Conceptual and Meta-Analytic Review. Personnel
Psychology, 48: 887-910.
92. Rasmussen, J., (1982). Human Errors: a Taxonomy for Describing Human
Malfunction In Industrial Installations, Journal of Occupational Accidents, 4: 311-333.
93. Rausand, M., Hoyland, A., (2004). System Reliability Theory, Models, Statistical
Methods, and Applications (2nd Edition). Wiley: Hoboken, New Jersey, USA.
94. Reason, J.T., (1987). Generic Error Modelling System: A Cognitive Framework for
Locating Common Human Error Forms. In: J. Rasmussen et al. (Eds.), New
Technology and Human Error. Wiley: Chichester, UK.
95. Reer, B., (1994). A Probabilistic Method for Analyzing the Reliability Effect of Time
and Organizational Factors. European Journal of Operational Research, 75(3): 521-
539.
96. Seyed-Hosseini, S. M., Safaei, N., Asgharpour, M. J., (2006). Reprioritization of
Failures in a System Failure Mode and Effects Analysis by Decision Making Trial and
Evaluation Laboratory Technique. Reliability Engineering and System Safety, 91: 872-
881.
97. Singh, J., Fleming, L., (2010). Lone Inventors as Sources of Technological
Breakthroughs: Myth or Reality? Management Science, 56(1): 41-56.
98. Sisson, G. R., (2001). Hands-on Training: A Simple and Effective Method for On-the-
job Training. Berrett-Koehler Publishers: San Francisco, CA, USA.
99. Soller, A. L. (2001). Supporting Social Interaction in an Intelligent Collaborative
Learning System. International Journal of Artificial Intelligence in Education, 12 (1):
40-62.
100. Stanton, N. A., Salmon, P. M., Walker, G. H., Baber, C., Jenkins, D. P., (2005). Human
Factors Methods: A Practical Guide for Engineering and Design. Ashgate: Aldershot,
UK.
101. Stewart, D. M., Grout, J. R. (2001) The Human Side of Mistake-proofing. Production
and Operations Management, 10(4): 440-459.
102. Swain, A., Guttmann, H., (1983). Handbook of Human Reliability Analysis with
Emphasis on Nuclear Power Plant Applications. NUREG/CR-1278, US Nuclear
Regulatory Commission.
103. Swanson, R. A., Sawzin, S. A., (1975). Industrial Training Research Project. Bowling
Green State University. Bowling Green, OH.
104. Teyarachakul, S., Chand, S., Ward, J., (2011). Effect of Learning and Forgetting on
Batch Sizes. Production and Operations Management, 20(1): 116-128.
105. Venezia, I., (1985). On the Statistical Origins of the Learning Curve. European Journal
of Operational Research, 19: 191-200.
106. Vidic, N., (2008). Developing Methods to Solve the Workforce Assignment Problem
Considering Worker Heterogeneity and Learning and Forgetting. Doctoral
Dissertation, University of Pittsburgh.
107. Vlok, P. J., Coetzee J. L., Banjevic, D., Jardine, A. K. S., Makis, V., (2002). Optimal
Component Replacement Decisions Using Vibration Monitoring and the PHM.
Journal of the Operational Research Society, 53: 193-202.
108. Vrignat, P., Avila, M., Duculty, F., Kratz, F., (2012). Maintenance Policy: Degradation
Laws versus Hidden Markov Model Availability Indicator. Proceedings of the
Institution of Mechanical Engineers, Part O: Journal of Risk and Reliability, 226(2):
137-155.
109. Wagner, E.E., Hoover, T.O., (1974). The Influence of Technical Knowledge on Position
Error in Rankings. Journal of Applied Psychology, 59: 406-407.
110. Wang, J., (2005). A Review of Operations Research Applications in Workforce
Planning and Potential Modeling of Military Training. Australian Government
Department of Defense, Systems Sciences Laboratory, DSTO-TR-1688.
111. Webb, G. K., (1994). Integrated Circuit (IC) Pricing. High Technology Management
Research, 5: 247-260.
112. Wickens, C. D., Lee, J., Liu, Y. D., Gordon-Becker, S., (2004). An Introduction to
Human Factors Engineering (2nd edition). Prentice Hall: Upper Saddle River, NJ, USA.
113. Williams, R. (2004). An Introduction to the UK Time Use Survey from A Labour
Market Perspective. Labour Market Trends, February: 63-70.
114. Woodman, R. W., J. E Sawyer, Griffin, R. W., (1993). Toward a Theory of
Organizational Creativity. Academy of Management Review, 18(2): 293-321.
115. Yang, T., Lee, R-S., Hsieh, C., (2003). Solving a Process Engineer’s Manpower-
planning Problem Using Analytic Hierarchy Process. Production Planning & Control,
14(3): 266-272.
116. Yelle, L. E., (1979). The Learning Curve: Historical Review and Comprehensive Survey.
Decision Sciences, 10(2): 302-328.
117. Zeng, X., Wong, W-K., Leung, S.Y-S., (2011). An Operator Allocation Optimization
Model for Balancing Control of the Hybrid Assembly Lines Using Pareto Utility
Discrete Differential Evolution Algorithm. Computers and Operations Research. 39:
1145-1159.
118. Zimolong, B., Trimpop, R., (1994). Managing Human Reliability in Advanced
Manufacturing Systems, in Design of Work and Development of Personnel in
Advanced Manufacturing, Salvendy, G., and Karwowski, W., (eds.) John Wiley &
Sons: New York, NY, USA.
119. Zuashkiani, A., Banjevic, D., Jardine, A. K. S., (2009). Estimating Parameters of
Proportional Hazards Model Based on Expert Knowledge and Statistical Data. Journal
of the Operational Research Society, 60(12): 1621-1636.
APPENDIX A1: Observational Study
An expert observes an operator during the performance of routine duties on a Kappa
machine. The expert fills out this questionnaire to assess the operator's operation and
troubleshooting of the machine.
Subject observed:
Product line assignment of subject:
Date:
Shift:
Category 1: Tool Change
Procedure followed:
1. Incorrect 2. Needs help and asks for it 3. Acceptable 4. Almost perfect 5. Flawless
Time taken to perform work:
1. Very slow (over 2 hours) 2. Slow 3. Average 4. Quick 5. Very quick (less than 45 minutes)
Machine set-up after tool change:
1. Many scrap pieces made (over 10) 2. More than average number of scrap pieces made 3. Less than average number of scrap pieces made 4. Hardly any scrap pieces made (less than 3)
Severity of worst error committed:
1. Will cause a major issue or delay 2. Will cause a minor issue or delay 3. May affect machine or cause a delay 4. Will likely go unnoticed
Category 2: Product Measurement
Perform proper measurement procedure:
1. Incorrect 2. Needs help and asks for it 3. Acceptable 4. Almost perfect 5. Flawless
React by entering proper machine axis correction:
1. Incorrect 2. Needs help and asks for it 3. Acceptable 4. Almost perfect 5. Flawless
Recognize the need for out of process measurement:
1. Did not notice 2. Noticed late 3. Noticed right away
Severity of worst error committed:
0. Will cause a major issue or delay 1. Will cause a minor issue or delay 2. May affect machine or cause a delay 3. Will likely go unnoticed
Category 3: Troubleshooting
Dealing with machine software issues:
1. Could not resolve in a reasonable period, but did not ask for help 2. Could not resolve in a reasonable period and asked for help 3. Resolved with difficulty 4. Resolved right away
Dealing with machine hardware issues:
1. Could not resolve in a reasonable period, but did not ask for help 2. Could not resolve in a reasonable period and asked for help 3. Resolved with difficulty 4. Resolved right away
Dealing with product visual inspection issues:
1. Could not resolve in a reasonable period, but did not ask for help 2. Could not resolve in a reasonable period and asked for help 3. Resolved with difficulty 4. Resolved right away
Dealing with product measurement values jumping unexpectedly:
1. Could not resolve in a reasonable period, but did not ask for help 2. Could not resolve in a reasonable period and asked for help 3. Resolved with difficulty 4. Resolved right away
APPENDIX A2: System Experts Assessing Operators
An expert fills out this questionnaire for the main purpose of assessing the operation and
troubleshooting of an operator working on the Kappa machines. There are also some
questions on the expert’s assessment of the operator’s social interaction skills.
Operator name:
Product line assignment of operator:
Rate the operator on the following:
Section 1: Analytical Skills
1. Expected production per shift a) Very poor (<150 gears per shift) b) Poor c) Acceptable d) Good e) Very good (>300 gears per shift)
2. Expected downtime due to tool change a) Very poor (> 2 hours) b) Poor c) Acceptable d) Good e) Very good (< 45 minutes)
3. Expected scrap gears made during set-up a) Very poor (>10 gears) b) Poor c) Acceptable d) Good e) Very good (<3 gears)
4. Trouble shooting abilities a) Very poor (cannot deal with any non-routine situations) b) Poor c) Acceptable d) Good e) Very good (can trouble shoot all operator-related issues)
5. Confidence in catching quality problems
a) Very poor (many instances of catching problems after the fact) b) Poor c) Acceptable d) Good e) Very good (hardly any problems have ever been found after the fact)
6. Confidence in detecting anomalies with machine components
a) Very poor (hardly ever) b) Poor c) Acceptable d) Good e) Very good (almost always)
Section 2: Social Skills
7. There is a machine problem the operator cannot resolve. How likely is he to ask a colleague for
the solution rather than restarting the machine to eliminate the problem?
Scale: 0 (Not likely at all) to 4 (Almost certainly)
8. There is a machine problem the operator cannot resolve. How likely is he to ask a supervisor or an
engineer for the solution rather than restarting the machine to eliminate the problem?
Scale: 0 (Not likely at all) to 4 (Almost certainly)
9. How eager will the operator be in shadowing another operator during a complete tool change
cycle, knowing he may learn a new technique?
Scale: 0 (Not comfortable at all) to 4 (Completely eager)
10. How likely is the operator to share a newly learned technique with colleagues, if not directly
asked by them?
Scale: 0 (Not likely at all) to 4 (Almost certainly)
APPENDIX A3: Self-assessment Questionnaire
Operators fill out this questionnaire for the main purpose of assessing their technical
knowledge on the operation and troubleshooting of the Kappa machines. There are also
some questions on their general experience level and social interaction skills.
Operator name:
Product line assignment of operator:
Section 1: Gear manufacturing experience level
1. When you first joined the department, what was your level of experience with gear manufacturing? a) No machining experience. Assembly line experience only b) Machining experience, but not gears. Engines and engine components only c) Some gear machining experience, less than one year d) Some gear machining experience, more than one year
Section 2: Interaction
1. There is a machine problem you cannot resolve. How likely are you to ask a colleague for the solution rather than restarting the machine to eliminate the problem?
Scale: 0 (Not likely at all) to 4 (Almost certainly)
2. There is a machine problem you cannot resolve. How likely are you to ask a supervisor/engineer for the solution rather than restarting the machine to eliminate the problem?
Scale: 0 (Not likely at all) to 4 (Almost certainly)
3. How eager will you be in shadowing another operator during a complete tool change cycle, knowing you may learn a new technique?
Scale: 0 (Not comfortable at all) to 4 (Completely eager)
4. How likely are you to share a newly learned technique with colleagues, if not directly asked by them?
Scale: 0 (Not likely at all) to 4 (Almost certainly)
Section 3: Troubleshooting
1. Looking at the product measurement chart, you notice the minimum chamfer lines are too steep. What do you do? a) Re-dress the cutting tool b) Enter a stock-division correction c) Change the tool d) Enter a Lead angle correction
2. During your visual inspection of the last piece made, you notice a corner of the gear tooth face is not cleaned up. What do you do? a) Check for run-out as a first step to ensure good incoming parts. b) Enter a Lead angle correction c) Enter an Involute angle correction d) Enter a stock-division correction
3. The top few millimeters of the gear tooth has a different finish to it than the rest of the surface. What do you check? a) Lead angle in the machine is excessive b) Involute angle in the machine is excessive c) Z-axis adjustment is excessive d) Part has a Stock-division problem from a previous process
4. During the cutting cycle, the machine makes a loud humming sound. What is the likely cause? a) The Z-axis most likely needs to be replaced b) The W axis most likely needs to be replaced c) The parts may be on the large side of the tolerance specification d) The parts may be on the small side of the tolerance specification
Section 4: Tool Change
1. How do you get the count for cutting cycles of the tool? a) Look at the last count on the tool sheet and estimate the production run since b) Go to the “Run in warm-up” menu c) Go to the “Cutting cycle” menu d) Go to the “Axis adjustment” menu
2. What is the first step after physically installing the new Dressing wheel? a) Input the dresser specific information b) Dress the cutting wheel once c) Put away the old Dressing wheel d) Initiate a warm-up cycle
3. What is the maximum number of cycles on a cutting tool before it gets dedicated to either model of car or truck? a) 50 b) 150 c) 250 d) 500
4. During visual inspection of the cutting tool, what do you look for? a) Diamond coating off the tool surface on all teeth b) Diamond coating off the tool surface on at least 3 teeth in a row c) Diamond coating off the tool surface on at least 5 teeth in different parts of the tool d) Broken or chipped teeth on the tool
Section 5: Product Measurement
1. What test do you select on the measurement machine to measure the pulse signature left by the tool vibrations? a) Undulations b) CX Bias c) KX Bias d) Lead-Involute
2. The Crown value is approaching the upper limit of tolerance. What do you do? a) Change the tool b) Put in a direct correction c) Change the Lead angle to affect the Crown d) Change the Involute angle to affect the Crown
3. The last part checked has an asymmetric undercut. Where do you put in a correction? a) Involute angle b) Lead angle c) Stock division d) Crown
4. The Bias reading on the last part checked was in the last 10% of tolerance. What is your next step? a) Change the tool b) Enter an Involute angle correction c) Enter a Lead angle correction d) Enter an additional cutting time
APPENDIX B: Discussion on Logistic Regression as a Validation Tool
We provide further details on the use of another tool, logistic regression, to validate the
PHM's prediction of machine failure risk.
The risk analysis model discussed in Section 3.6.3 was based on the PHM. The validity
of this work can be confirmed by using another tool, logistic regression, to achieve the
same goal of predicting the risk of machine failure. Logistic regression is a form of
multiple regression in which the dependent variable is a categorical dichotomy (Field, 2005). In the
context of the analysis discussed thus far, we can think of this dichotomy as machine failure
or not. The independent variables for the logistic regression can be continuous or
categorical. In our context, the example of a continuous variable is the analytical skill score;
the example of a categorical variable is the binary variable that represents working on a
certain shift. Unlike multiple regression, where the value of the dependent variable Y is
predicted from one or several independent variables, Xi, logistic regression predicts the
probability of Y occurring given the independent variable(s) Xi. The general form of logistic
regression is the following:
P(Y) = 1 / (1 + exp[-(b0 + b1X1 + b2X2 + ... + bnXn)])
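As a sketch of how such a fitted equation is evaluated, the general form can be coded directly. The intercept below is the one reported in Table B1; the two coefficient/covariate pairs are hypothetical placeholders, not the full fitted model:

```python
import math

def logistic_probability(b0, coeffs, xs):
    """P(Y = 1) = 1 / (1 + exp(-(b0 + b1*x1 + ... + bn*xn)))."""
    linear = b0 + sum(b * x for b, x in zip(coeffs, xs))
    return 1.0 / (1.0 + math.exp(-linear))

# Intercept from Table B1; coefficient/covariate pairs are illustrative only
# (e.g., an analytical-skill score of 60 and a social-interaction term of 3).
p = logistic_probability(-2.756, [0.065, -0.113], [60.0, 3.0])
```

A zero linear predictor gives P(Y) = 0.5, which is a convenient sanity check on any implementation.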
The same data set analyzed by PHM can be analyzed using logistic regression. The
dependent variable, Y, is a binary {0,1} variable and takes on the value of 1 when there is a
failure in the shift and is zero otherwise. We use the same six independent variables as in
the PHM, with the details provided in Table 3.4. Our aim is to compare the probability of
failure calculated by two tools, PHM and logistic regression, on the same data set. Ideally,
both tools should prompt the DM to take the same action, and this can act as a validation of
either tool. For this tool comparison to be meaningful, we have to use the same
independent variables. We also have to ensure the units of measure for the two methods
are the same. For example, PHM may have 1 hour as its unit of calculation for the hazard
rate, compared to the logistic regression calculating the probability of the event (failure)
over one shift (consisting of eight hours). In such an example, the probabilities obtained
from the PHM are multiplied by 8 for a fair comparison.
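This unit alignment can be illustrated numerically; the hazard value below is hypothetical. The multiply-by-8 rule is the simple linear scaling described above, shown next to the exact constant-hazard conversion for comparison:

```python
import math

HOURS_PER_SHIFT = 8

def shift_failure_probability(hourly_hazard):
    """Exact probability of at least one failure over one shift,
    assuming the hazard rate is constant within the shift."""
    return 1.0 - math.exp(-hourly_hazard * HOURS_PER_SHIFT)

h = 0.004                             # hypothetical hourly hazard rate
approx = HOURS_PER_SHIFT * h          # the simple scaling used in the text
exact = shift_failure_probability(h)  # 1 - exp(-8h); close to 8h when h is small
```

For small hazard rates the two agree closely, which is why the simple scaling suffices for the comparison here.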
The dataset for the Ring gear machine is analyzed by logistic regression using the SPSS
software. When we use the same independent variables as in the PHM obtained in Eq. (10)
in Chapter 3, we obtain the logistic regression model whose parameter estimates from
SPSS are shown in Table B1:
Variable Represented Parameter Sig.
X1 Night Shift 8.811 0.032
X2 Afternoon Shift-Social 0.007 0.139
X3 NS-Analytical (A) 0.065 0.068
X4 NS-Social (S) -0.113 0.039
X5 NS-Experience (E) -0.080 0.041
X6 Experience-Analytical 0.000 0.330
Constant - -2.756 0.000
Table B1: Summary of Estimated Parameters
Nagelkerke’s R2 is a method used in logistic regression to assess the correlation
between the predicted and observed values of the outcome (Field, 2005); it is found to be
0.026. Such a low model fit, as measured by the R2 value, is to be expected, for a few
reasons. The ratio of shifts containing a failure event to shifts without one is quite low
(about 5%). In addition, we normally expect human-related factors to affect only a small
percentage (about 10%) of failures. Finally, variable X2 is not significant in the logistic
regression model (p-value > 0.1), but it is included because it was included in the PHM.
The hazard rates have been placed in three categories based on the course of action a
DM may take. Risk factors calculated from two situations may be drastically different but
lead to the same decision. For example, consider a case where we analyze two scenarios
yielding hazard rates of 0.0002 and 0.004; both would lead to the DM ignoring the risk,
even though one is 20 times larger in magnitude. Therefore, if we think of three categories
of “do nothing”, “monitor”, and “intervene”, the corresponding categories have been
arbitrarily selected to be “< 5%”, “5% ≤ risk < 10%”, and “≥ 10%”, respectively. The category
ranges are the same for both models. We are interested in checking whether a low hazard
prediction by the PHM is confirmed by a low probability of failure from the logistic
regression. The comparison is promising, as a Kendall correlation coefficient of 0.615 is found to be
significant (p-value < 0.01). The positive sign of this coefficient, as well as its significance,
indicate that the two tools are making the same general predictions.
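The categorization and rank correlation can be sketched as follows. The prediction lists are hypothetical; note also that SPSS reports the tie-corrected tau-b, whereas this minimal implementation is the simpler tau-a:

```python
def risk_category(p):
    """Map a predicted failure probability to the decision categories
    used in the text: do nothing (<5%), monitor (5-10%), intervene (>=10%)."""
    if p < 0.05:
        return 0  # do nothing
    if p < 0.10:
        return 1  # monitor
    return 2      # intervene

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical per-shift failure predictions from the two models.
phm = [0.01, 0.06, 0.12, 0.03, 0.09]
logit = [0.02, 0.07, 0.15, 0.04, 0.11]
tau = kendall_tau_a([risk_category(p) for p in phm],
                    [risk_category(p) for p in logit])
```

A positive tau on the categorized predictions indicates the two models would prompt the decision-maker toward the same actions, which is the sense in which one validates the other.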
In Section 3.6.6.3, we obtained another PHM with the expanded data set, considering all
three machines, and an additional factor of day-of-the-week. Once again, we compare the
failure prediction of this PHM with logistic regression as a secondary tool. We use the same
7 variables and apply the logistic regression model to the same data set. The results of the
two models are compared and we obtain a Pearson correlation coefficient of 0.724,
significant (two-tailed) at p < 0.001. Details of the logistic regression model appear in Table B2:
Variable Represented Parameter Sig.
Social -0.015 0.051
Analytical -0.010 0.015
X1 Afternoon Shift or not 0.589 0.028
X2 Night Shift or not 0.791 0.003
Y1 Driven machine or not -0.462 0.053
Y2 Drive machine or not -1.031 0.000
V1 1st day of the week or not 0.573 0.009
Constant - -1.906 0.001
Table B2: Summary of Estimated Parameters
The correlation coefficient implies that about half the variance (r^2 = 0.724^2 ≈ 0.52) of one
model's predictions can be explained by the other. Field (2005) states that Cohen (1988, 1992) has made “some widely
accepted suggestions about what constitutes a large or small effect” and proceeds to state
r = 0.5 to be a large effect. Therefore, in our case, where r = 0.724, we have a large effect,
indicating a strong correlation between the two approaches. This in turn serves us in our
original purpose of using one approach to validate the other.
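The shared-variance figure quoted above follows directly from squaring the correlation coefficient:

```python
r = 0.724                 # Pearson correlation between the two models' predictions
shared_variance = r ** 2  # coefficient of determination, roughly 0.52
```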
APPENDIX C: Discussion on Obtaining the PHM
We provide further details on the model-building process discussed in Section 3.6.7.2.
We divide all variables into four groups: (1) Experience and all pairwise interactions
involving it, (2) Social and all pairwise interactions involving it, (3) Analytical and all
pairwise interactions involving it, and (4) all binary variables, representing shifts, parts of
the week, and machines, along with the meaningful interaction terms.
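The variable-elimination procedure used throughout this appendix (refit the model, drop the least significant variable whose p-value exceeds 0.1, and repeat) can be sketched generically. The `fit` interface and the fixed p-values below are placeholders: a real refit changes the remaining p-values, and the appendix explores several elimination orders rather than this single greedy path:

```python
def backward_eliminate(variables, fit, alpha=0.1):
    """Repeatedly refit and drop the variable with the largest p-value
    above alpha, until every remaining p-value passes the threshold.

    `fit` is assumed to take a list of variable names and return a
    {name: p_value} mapping for the fitted model (placeholder interface).
    """
    current = list(variables)
    while current:
        p_values = fit(current)
        worst = max(current, key=lambda v: p_values[v])
        if p_values[worst] <= alpha:
            break
        current.remove(worst)
    return current

# Toy stand-in for a PHM fit: fixed p-values loosely echoing Table C1,
# ignoring the fact that a real refit would shift them.
fake_p = {"Experience": 0.85, "ExpAna": 0.67, "ExpV2": 0.12,
          "ExpSoc": 0.0, "ExpX1": 0.0, "ExpX2": 0.0, "ExpV1": 0.0002}
kept = backward_eliminate(list(fake_p), lambda vs: {v: fake_p[v] for v in vs})
```

On these illustrative p-values the loop retains ExpSoc, ExpX1, ExpX2, and ExpV1, mirroring the final Experience-group model reached in Table C4.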
Group 1, the Experience group:
We start out with all 7 variables in the model. The result is shown in Table C1:
Parameter Estimate p-Value
Scale 9.947 -
Shape 1.045 0.5277
Experience -0.0039 0.8519
ExpSoc -0.0012 0
ExpAna 0.00006 0.6735
ExpX1 0.0333 0
ExpX2 0.0340 0
ExpV1 0.0143 0.0002
ExpV2 -0.0092 0.1230
Table C1: PHM parameter estimation, using all variables related to operator experience level
There are three variables with p-values higher than 0.1. We shall consider the paths leading
from eliminating each one. We start with eliminating Experience, the variable with the
highest p-value. The result is shown in Table C2:
Parameter Estimate p-Value
Scale 10.72 -
Shape 1.045 0.5293
ExpSoc -0.0012 0
ExpAna 0.00004 0.703
ExpX1 0.0331 0
ExpX2 0.0336 0
ExpV1 0.0143 0.0002
ExpV2 -0.0092 0.1212
Table C2: PHM parameter estimation continued, variable “Experience” is eliminated
We have two variables with p-values larger than 0.1. We shall consider both paths. We first
eliminate the Experience-Analytical interaction variable. The following is the result (Table
C3):
Parameter Estimate p-Value
Scale 12.04 -
Shape 1.044 0.5417
ExpSoc -0.0012 0
ExpX1 0.0335 0
ExpX2 0.0340 0
ExpV1 0.0144 0.0001
ExpV2 -0.0093 0.1187
Table C3: PHM parameter estimation continued, variable “ExpAna” is eliminated
We are left with only one p-value larger than 0.1 and eliminating it results in Table C4. This
model now consists entirely of variables with p-values significant at the 5% level.
Parameter Estimate p-Value
Scale 12.02 -
Shape 1.034 0.6371
ExpSoc -0.0012 0
ExpX1 0.0327 0
ExpX2 0.0333 0
ExpV1 0.0154 0
Table C4: PHM parameter estimation, final model for the “Experience” group
We now go back to the model with all 7 variables (Table C1) and, this time, eliminate the
interaction term Experience-Analytical. This variable does not have the highest p-value, but
it is still larger than 0.1 and eliminating it first may result in a different end-model. The
result is shown in Table C5:
Parameter Estimate p-Value
Scale 12.15 -
Shape 1.044 0.5411
ExpSoc -0.0012 0
ExpX1 0.0334 0
ExpX2 0.0340 0
ExpV1 0.0144 0.0001
ExpV2 -0.0093 0.1181
Experience 0.0009 0.9598
Table C5: PHM parameter estimation continued from Table C1, variable “ExpAna” is eliminated
Next, we can eliminate Experience, but that would result in Table C3, which we have
already obtained. Therefore, we eliminate the interaction term of Experience and V2, the
variable representing the last day of the week. This results in Table C6:
Parameter Estimate p-Value
Scale 12.07 -
Shape 1.034 0.6372
ExpSoc -0.0012 0
ExpX1 0.0327 0
ExpX2 0.0333 0
ExpV1 0.0154 0
Experience 0.0004 0.9817
Table C6: PHM parameter estimation continued, variable “Experience” is eliminated
The only variable with a p-value larger than 0.1 is Experience, and eliminating it returns us to the
model shown above in Table C4.
Once again, we go back to the model represented in Table C1 and this time, the first
variable we eliminate is Experience-V2. The result is shown in Table C7:
Parameter Estimate p-Value
Scale 9.509 -
Shape 1.036 0.6179
ExpSoc -0.0012 0
ExpX1 0.0326 0
ExpX2 0.0330 0
ExpV1 0.0153 0
Experience -0.0053 0.8005
ExpAna 0.00007 0.6172
Table C7: PHM parameter estimation continued from Table C1, variable “ExpV2” is eliminated
We have two choices of variables to eliminate. We first eliminate Experience and Table C8
shows the result:
Parameter Estimate p-Value
Scale 10.52 -
Shape 1.036 0.6206
ExpSoc -0.0012 0
ExpX1 0.0324 0
ExpX2 0.0329 0
ExpAna 0.00004 0.6655
Table C8: PHM parameter estimation continued, variable “Experience” is eliminated
We can see that Experience-Analytical is the only variable with a p-value larger than 0.1.
Eliminating it results in Table C4 which we have already obtained.
We go back to Table C7 and, this time, eliminate Experience-Analytical first. This results in a
table identical to Table C6, eventually leading back to the model shown in Table C4.
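The elimination loop applied throughout this appendix can be sketched as follows. Here `p_values_for` is a hypothetical stand-in for refitting the PHM in EXAKT and reading off the resulting p-values; the actual refitting is done inside that software.

```python
def backward_eliminate(variables, p_values_for, threshold=0.1):
    """Backward elimination sketch: repeatedly drop the variable with the
    largest p-value above the threshold, refit, and stop once every
    remaining variable is significant at the threshold."""
    current = list(variables)
    while current:
        p_values = p_values_for(current)   # refit the model on `current`
        worst = max(current, key=lambda v: p_values[v])
        if p_values[worst] <= threshold:
            break                          # final model reached
        current.remove(worst)
    return current
```

Note that exploring "all possible paths", as done above, means branching whenever several variables exceed the threshold, rather than always dropping the single worst one as this sketch does.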
Group 2, the Social group:
We follow the same procedure described for Group 1 to assemble all variables
related to the Social Interaction score of the operator. The result is shown in Table C9,
where all p-values are significant.
Parameter Estimate p-Value
Scale 3.496 -
Shape 1.038 0.5909
Social -0.0422 0.0059
SocAna 0.0003 0.0305
SocX1 0.0340 0
SocX2 0.0315 0
SocV1 0.0124 0.0022
ExpSoc -0.0011 0
Table C9: PHM parameter estimation, final model for the “Social” group
Group 3, the Analytical group:
This is similar to our description for group 2. All variables related to the Analytical skill are
grouped together and the final model we obtain is shown in Table C10:
Parameter Estimate p-Value
Scale 96.69 -
Shape 1.008 0.9157
Analytical 0.0586 0.0242
ExpAna -0.0008 0
AnaX1 0.0361 0
AnaX2 0.0349 0
AnaV1 0.0141 0.0011
SocAna -0.0007 0.0138
Table C10: PHM parameter estimation, final model for the “Analytical” group
Group 4, the group with all binary terms:
In Group 4, there are six main effects considered. These consist of two variables, X1 and X2,
representing the three shifts; two variables, V1 and V2, representing the three parts of the
week; and two variables, Y1 and Y2, representing the three machines under analysis. The
pairwise interactions between the shift variables and the parts of the week are considered:
the effect of the night shift on the first day of the week may differ from that of the first day
shift of the week. Interactions with machines, however, do not make intuitive sense and are not
considered. The same procedure described for Group 1 is followed, and the final model we
obtain is shown in Table C11:
Parameter Estimate p-Value
Scale 343.6 -
Shape 1.126 0.1249
X1 1.906 0.0004
Y1 -0.2874 0.1669
Y2 -1.076 0
V1 -3.986 0
V2 2.374 0.0028
X1V1 6.145 0
X2V1 9.127 0
X1V2 -6.429 0
X2V2 -4.06 0.0041
Table C11: PHM parameter estimation, final model for the “binary terms” group
Final model:
The variables included in the final models from the four groups, represented by Tables C4,
C9, C10, and C11, are all put together to start the building of a final model. The same
procedure described for Group 1 is followed, where all possible paths leading from
variables with p-values larger than 0.1 are considered. The final model achieved is shown in
Table C12:
Parameter Estimate p-Value
Scale 129.2 -
Shape 1.107 0.1839
X1 3.291 0
Y1 -0.7177 0.0068
Y2 -1.375 0
V2 3.328 0.0001
X1V1 5.397 0
X2V1 8.257 0
X1V2 -7.543 0
X2V2 -4.876 0.0008
ExpSoc -0.0052 0
SocAna 0.0044 0
SocV1 -0.0555 0
Analytical -0.2805 0
Experience 0.2954 0
Table C12: PHM parameter estimation, final model, with all four groups combined
This model has 13 variables, and EXAKT calculates a maximum likelihood estimator of 618.85
for it. Based on these figures, the AIC is calculated as follows:

AIC = -2 ln(L-hat) + 2k,

where
L-hat: maximum likelihood estimator for the model
k: number of variables in the model

In our case, k = 13. The resulting AIC value can be compared to the AIC
values of models we obtain from other approaches, in order for us to choose a final model for
our analysis.
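The AIC comparison described above can be sketched with a small helper. As a hedge: whether EXAKT reports ln(L-hat) or -2 ln(L-hat) is not restated here, so the helper takes the -2 ln(L-hat) figure directly as an assumption; the candidate tuples are illustrative.

```python
def aic(neg_two_log_likelihood, k):
    """AIC = -2 ln(L-hat) + 2k, where k is the number of model variables."""
    return neg_two_log_likelihood + 2 * k

def best_model(candidates):
    """Pick the candidate (name, -2 ln(L-hat), k) with the smallest AIC."""
    return min(candidates, key=lambda c: aic(c[1], c[2]))
```

A model with a slightly worse likelihood but far fewer variables can therefore still win the comparison, which is the point of using AIC rather than likelihood alone.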
APPENDIX D: Goodness of Fit of Regression Models for Predicting Output
In Section 4.4.1, we present a set of regression equations for our case study. This appendix
discusses our statistical analysis for validating the model fit of the regression equations.
To check the validity of the regression models, we perform two steps. First, we look at
the R2 value as an indicator of model fit. Values for the Driven, Drive, and Ring
lines are 0.740, 0.449, and 0.564, respectively. Factors such as linearity of residuals,
homoscedasticity, independence of errors, and the influence of outliers are then assessed for
goodness of model fit. We analyze linearity and homoscedasticity visually, using graphs
that plot standardized residuals against standardized predicted values. A
sample of such a plot for the Driven line is presented in Figure D1. This plot shows what
we expect from a good model fit: the points are centered around zero on the y-axis;
there is no particular non-linear relationship between the outcome and the predictor; and there is no
change in variance, or spread of the points in the vertical direction, along the x-axis. None
of the graphs for any of the three product lines indicates non-linearity or heteroscedasticity.
For each product line, we also look at normal P-P plots of the residuals.
A sample of such a plot for the Driven line appears in Figure D2. This graph complies
with the normality assumption, as all data points are aligned with the straight line joining (0,0)
and (1,1). In terms of indicators of multicollinearity, the Variance Inflation Factor (VIF)
values appearing in Table D1 indicate that there are no problems: the largest VIF is smaller
than 10, and the average VIF is not substantially greater than 1 (Bowerman and O'Connell,
1990). We use the Durbin-Watson statistic to test the assumption of independent
errors. The values for the Driven, Drive, and Ring models are all close to 2, which, given our
sample size and the number of predictors, indicates compliance with the assumption.
Finally, we use Cook’s distance to measure the overall influence of individual cases on the
model. The maximum Cook’s distances for the three models are all significantly less than 1,
indicating there are no influential observations. All the aforementioned checks indicate a
good model fit of our regression models.
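The Durbin-Watson check can be reproduced directly from a residual series. This is a plain restatement of the standard formula, not code from the study itself:

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: values near 2 suggest uncorrelated errors,
    values toward 0 or 4 suggest positive or negative autocorrelation."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator
```

A strongly alternating residual series, for example, pushes the statistic well above 2, which is why values close to 2 for all three product lines indicate compliance with the independence assumption.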
The second step taken is to use data-splitting to cross-validate the model. We randomly
select 80% of the data set and develop another regression model which includes the same
predictor variables but with variable coefficients estimated from the reduced data set. This
second model is then applied to the remaining 20% of the data and the production volumes
are predicted. This predicted set is compared to the actual production volumes of the
remaining 20% of the data set. The two sets are compared to calculate a Pearson's
correlation coefficient; for the Driven, Drive, and Ring lines, the correlation coefficients are
0.879, 0.680, and 0.773, respectively. We go one step further in the data splitting for
additional validation: for the Driven line, we also perform a 70-30 and a 60-40 data split.
The Pearson's correlation coefficient remains high, at 0.874 and 0.850 for the 70-30 and 60-40
splits, respectively. The data set is randomized; the parts of the randomized data used for the 70-
30 and the 60-40 splits are different. For the 70-30 split, we take the first 651 rows of
the original 930 rows to build our model. We then apply this model to the remaining 279
rows, and compare the set of the predicted values to the actual output values. For the 60-
40 split, we take the last 558 rows of the original 930 rows to build our model and apply our
model to the remaining 372 rows.
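The split-and-correlate validation can be sketched as follows. The row counts mirror the 70-30 split described above; `pearson` is the standard formula, not SPSS output:

```python
import math

def pearson(predicted, actual):
    """Pearson correlation between model forecasts and observed output."""
    n = len(predicted)
    mp, ma = sum(predicted) / n, sum(actual) / n
    cov = sum((p - mp) * (a - ma) for p, a in zip(predicted, actual))
    sp = math.sqrt(sum((p - mp) ** 2 for p in predicted))
    sa = math.sqrt(sum((a - ma) ** 2 for a in actual))
    return cov / (sp * sa)

def split_rows(rows, train_fraction):
    """Block split: the first fraction trains the model, the rest validates."""
    cut = round(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]
```

With 930 rows and a 0.7 training fraction, this reproduces the 651/279 row split used for the Driven line.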
Figure D1: Scatter Plot of Standardized Residuals for checking assumptions of
random errors and homoscedasticity
Note on Figure D1: The reason for the "discrete" look of the residuals is that skill scores
are for three individuals, assessed three times in the nine-month duration of the study.
Each individual's skill score remains constant for the quarter.
Figure D2: Normal P-P Plot of Standardized Residuals for checking
assumption of normality (y-axis: standardized differences between observed
data and values predicted by the model; x-axis: standardized forms of values
predicted by the model)
Line Largest VIF Average VIF Durbin-Watson Cook’s Distance
Driven gear 1.546 1.55 2.018 0.011
Drive gear 3.51 3.51 2.064 0.032
Ring gear 1.359 1.359 2.008 0.036
Table D1: Indicators of Multicollinearity, Independence of Errors, and Influential Cases
APPENDIX E: Model Properties
We present some properties of the recursive formula developed in Section 5.2 that can
reduce the calculations required over the planning horizon.
Property 1: At each stage n = i + mD, where m = 1, 2, ..., the probability of any state with a = i and
d = 0 can be calculated as follows:

p(i, i, 0 | n = i + mD) = [q(0)]^m p(i, i, 0 | n = i), m = 1, 2, ...

In stage n = i + mD, a = i indicates the machine must be operational in the last i stages,
and have been non-operational at all other previous ones. For this scenario, the machine
must fail immediately after starting at n = 0, with an initial failure probability q(0), go
through the repair period, and fail immediately again. The failures are repeated m times in
total before the machine becomes operational for i stages (Figure E1). Failure cannot occur
during the last i intervals; otherwise, a cannot equal i. The probability that the machine is
operational in the last i stages is p(i, i, 0 | n = i).
Figure E1: Relationship between p(i, i, 0 | n = i + mD) and p(i, i, 0 | n = i). Panel 1a: stage
n = i + mD, with m initial failures and repair periods, followed by i operational intervals.
Panel 1b: stage n = i, with i operational intervals and no failures.
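Property 1 can be evaluated in one line once q(0) and the probability of the uninterrupted run are known; the function and argument names here are illustrative, not from the model code:

```python
def prob_state_after_m_failures(q0, m, p_operational_run):
    """Property 1 sketch: m immediate failures (each with probability q0,
    each followed by a full repair period), then an uninterrupted run of
    operational stages with probability p_operational_run."""
    return (q0 ** m) * p_operational_run
```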
Property 2: At any stage n, the probability of any state with a = 0 and 1 <= d <= D - 1 can be derived
as follows:

p(i, 0, d | n) = p(i, 0, D | n - (D - d)).
Since d >= 1, the machine is in a repair period at stage n. This indicates that in the
previous stage, n - 1, the state was (i, 0, d + 1), since one less time interval had elapsed on repairing the
machine. This relationship between "d" and "d + 1" will go on over all previous stages
until d = D. In that case, the failure has just occurred, leading to p(i, 0, d | n) = p(i, 0, D | n - (D - d)).
Once d = D, there are multiple paths possible from the previous stage. In the
previous stage, the machine may be operational and fail just before n, or it may fail
immediately after the completion of a repair period. Therefore, the recursive formula has to
be used for d = D.
Property 3: At any stage n, the probability of any state with 1 <= a < i and d = 0 is calculated as:

p(i, a, 0 | n) = [Prod_{j=0}^{a-1} (1 - q(i - a + j))] q(i - a) Sum_b p(i - a, b, 0 | n - a - D).

The machine has been running for the last a stages; however, since a < i, there must be
at least one previous failure. Given the total number of intervals n, the number of
operational intervals must be i - a at the point before the occurrence of the last failure
(Figure E2). The value of d must also be zero whether the failure occurred after an
operational period or the completion of a repair. The value of a, however, is bounded
between zero and n - D - 1, since at least one failure and one full repair period of length D
must precede the current run. The probability of failure in this scenario is affected
by the expertise gained by the operator, having worked i - a intervals.
Figure E2: Depiction of a possible history of state (i, a, 0) where a < i
APPENDIX F: Model and Property Generalization with Initial Conditions
Similar to Appendix E, in this Appendix we present some properties of the recursive formula
developed in Section 5.2. The work presented here, however, assumes a set of initial
conditions for the machine.
If we consider the initial conditions at n = 0 to be operator expertise i0 and machine working age a0,
then all probabilities will be conditional on these initial conditions. As a result, the recursive
formula expressed in Section 5.2 becomes:
p(i, a, d | n, i0, a0) =

p(i, 0, d + 1 | n - 1, i0, a0), for a = 0, 1 <= d <= D - 1
Sum_b q(i + i0, b) p(i, b, 0 | n - 1, i0, a0), for a = 0, d = D, 1 <= i <= n - 2
q(n - 1 + i0, n - 1 + a0) p(n - 1, n - 1, 0 | n - 1, i0, a0), for a = 0, d = D, i = n - 1
q(i0, a0) p(0, 0, 0 | n - 1, i0, a0), for a = 0, d = D, i = 0
[1 - q(i - 1 + i0, a - 1 + a0)] p(i - 1, a - 1, 0 | n - 1, i0, a0), for d = 0, a = i
[1 - q(i - 1 + i0, a - 1)] p(i - 1, a - 1, 0 | n - 1, i0, a0), for d = 0, a < i
The first line represents the repair scenario. The next three lines represent the special cases
for the failure scenario. The last two lines represent the survival scenario.
Similar to Appendix E, we present properties, conditional on the initial conditions i0
and a0, that can aid us in calculating the state probabilities in future stages without the
need to use the one-step recursive function.
Property 1:

p(n, n, 0 | n, i0, a0) = Prod_{j=0}^{n-1} [1 - q(i0 + j, a0 + j)].

Based on a = i = n, we can conclude that the machine has been operational in every time
interval since the start. If the planning horizon is started with operator expertise at i0 and a
machine age of a0, then at each operational interval, one time unit is added to these two
parameters. The probability of survival for the first interval is 1 - q(i0, a0), for the second it is
1 - q(i0 + 1, a0 + 1), and so on, until the last probability of survival,
1 - q(i0 + n - 1, a0 + n - 1).
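Property 1's survival product can be computed directly; here `q` is a hypothetical stand-in for the PHM-derived per-interval failure probability as a function of operator expertise and machine working age:

```python
def prob_no_failure(q, n, i0, a0):
    """Probability of surviving all n intervals when the horizon starts
    with operator expertise i0 and machine working age a0; both grow by
    one unit per operational interval."""
    p = 1.0
    for j in range(n):
        p *= 1.0 - q(i0 + j, a0 + j)
    return p
```

With a constant per-interval failure probability, this reduces to the familiar geometric survival term (1 - q)^n.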
Property 2:
At each stage n = i + mD, where m = 1, 2, ..., the probability of any state with a = i and d = 0 can be
calculated as follows:

p(i, i, 0 | n = i + mD, i0, a0) = q(i0, a0) [q(i0, 0)]^(m-1) p(i, i, 0 | n = i, i0, 0), m = 1, 2, ...

We consider a stage where n = i + mD. In this stage, a = i indicates the machine must be
operational in the last i stages, and have been non-operational at all other previous ones.
For this scenario, the machine fails immediately after starting at n = 0, with an initial
failure probability q(i0, a0), goes through the repair period, and fails immediately again,
this time with probability q(i0, 0). The subsequent failures are repeated m - 1 times before the
machine becomes operational for i stages (Figure F1). Failure cannot occur during the last i
intervals; otherwise, a cannot equal i. The probability that the machine is operational in the
last i stages is p(i, i, 0 | n = i, i0, 0), because at the start of this interval, expertise still remains
at i0 but working age is reset to zero.
Figure F1: Relationship between p(i, i, 0 | n = i + mD, i0, a0) and p(i, i, 0 | n = i, i0, 0).
Panel 1a: stage n = i + mD, with m initial failures and repair periods, followed by i operational
intervals. Panel 1b: stage n = i, with i operational intervals and no failures.
Property 3: At any stage n, the probability of any state with a = 0 and 1 <= d <= D - 1 can be simply
derived as follows:

p(i, 0, d | n, i0, a0) = p(i, 0, D | n - (D - d), i0, a0).

This property follows directly from the repair scenario of the recursive function. Once
d = D, we are no longer following from just one state in the previous stage. In the
previous stage, the machine may be operational and fail just before n, or it may fail
immediately after the completion of a repair period. Therefore, the recursive formula has to
be used for d = D.
Property 4: At any stage n, the probability of any state with 1 <= a < i and d = 0 can be
calculated as follows:

p(i, a, 0 | n, i0, a0) =
[Prod_{j=0}^{a-1} (1 - q(i - a + i0 + j, j))] Sum_b q(i - a + i0, b) p(i - a, b, 0 | n - a - D, i0, a0)
[Prod_{j=0}^{a-1} (1 - q(i - a + i0 + j, j))] q(i - a + i0, i - a + a0) p(i - a, i - a, 0 | n - a - D, i0, a0)

The machine has been running for the last a stages; however, since a < i, there must be
at least one previous failure. Given the total number of intervals n, the number of
operational intervals must be i - a at the point before the occurrence of the last failure
(Figure F2). The value of d must also be zero whether the failure occurred after an
operational period or the completion of a repair. The value of a, however, is bounded
between zero and n - D - 1, since at least one failure and repair period must precede the
current run. The probability of failure in this scenario is affected
by the expertise gained by the operator, having worked i - a intervals. The second line
covers the case where exactly one failure occurred at the end of the initial i - a intervals.
All other scenarios are covered by the first line.
Figure F2: A possible depiction of the history of state (i, a, 0 | n, i0, a0) where a < i
Effect of Initial Conditions on Expected Output
In Section 5.2, the expected output is expressed using the assumption that i0 = 0 and
a0 = 0 when n = 0. In the presence of some previous level of operator learning i0,
regardless of machine working age, the output formula can be expressed as follows:

E[TO] = Sum_{i=0}^{N} C(i) P(i), where C(i) is the output accumulated over i operational intervals.

For expertise level i, we have:

C(i) = Sum_{j=0}^{i-1} O(i0 + j),

with O(.) the per-interval production output at a given expertise level. If we define
P(i) = Sum_{a,d} p(i, a, d | N, i0, a0), then the expected
total output for the entire horizon is as follows:

E[TO] = Sum_{i=0}^{N} [Sum_{j=0}^{i-1} O(i0 + j)] Sum_{a,d} p(i, a, d | N, i0, a0).
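A minimal sketch of this expected-output computation, under stated assumptions: `output_at(i)` stands in for the per-interval output at expertise level i (which would come from the Chapter 4 regressions), and `level_probs[i]` for the probability of finishing the horizon with exactly i operational intervals; both names are hypothetical.

```python
def expected_total_output(output_at, level_probs, i0=0):
    """Expected total output over the horizon: weight the output
    accumulated over i operational intervals (starting at expertise i0)
    by the probability of ending the horizon with exactly i intervals."""
    total = 0.0
    for i, prob in enumerate(level_probs):
        accumulated = sum(output_at(i0 + j) for j in range(i))
        total += prob * accumulated
    return total
```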
APPENDIX G: Data Set Used for Empirical Studies
We present a sample of the data set that we have used to perform the case study analyses
in Chapters 3, 4, and 5. The data set has been analyzed using multiple tools, approaches,
and software programs.
We now describe the various analyses performed at different stages of this
dissertation:
There is production data on the four machines: Driven, Drive, Pinion, and Ring. We also
have the personnel record of which operators were on shift at each time and date. We have
the expert assessment reports of the operators’ skills. The production data for the Pinion
machine is incomplete and is therefore ignored. The three remaining machines have a
combined total of 3,049 records, two records per machine for the 9-month duration. The
records include operational periods as well as failures; there are 119 failures and 11
suspensions, making a total "event" sample size of 130. The assessment data on the 12
operators are complete for the nine months of the study, with the assessments repeated
quarterly.
The production data set described above is used in various ways throughout this
dissertation. A summary of the usages is provided in Table G1. Some of the analyses cover
certain parts of the data set only; others consider the entire set. In addition, we use different tools and
approaches to analyze the data. We start the analysis in Chapter 3 by considering the Ring
machine only and the three operators working on it. We use a PHM as a tool to calculate
the hazard rate, considering machine age and a few HR factors. We then seek to validate
the PHM by using another tool, logistic regression, on the same segment of the data set.
The data specific to the Ring machine, as well as the three operators assigned to it, is
analyzed by logistic regression to estimate the probability of machine failure. In the latter
part of Chapter 3, we analyze the entire data set using a PHM again. The three machines are
combined, along with the nine operators who are assigned to these three machines on the
three shifts. Indicator variables are used to distinguish amongst the machines and an
additional variable, day-of-the-week, is considered. This variable was not originally a part of
the data set but could easily be added afterwards based on the date field that was a part of
the production data. The PH modeling is done using the software EXAKT; we set up the data
set accordingly. An example of this data set analyzed by EXAKT is shown in Table G2; the full
data set is presented on a CD in the back cover of the dissertation document.
Similar to the early part of Chapter 3’s empirical study analysis, the data set is once again
analyzed on a per-machine basis in Chapter 4. The entire nine months for the Driven, Drive,
and Ring gear machines are used. Chapter 3 focuses on the HR factors leading to machine
failures whereas Chapter 4 focuses on HR factors resulting in the quantity of production
output. The perspective we take in Chapter 3 is that a lower level of skill is likely to lead to a
higher hazard rate; the perspective in Chapter 4 is that a lower level of skill results in a
lower production output. We do not consider the machine failures the operators cause,
only how many units they produce per unit time when the machine is operational.
The tool we use is multiple regression where we predict the production output based on
HR factors. We set up the data set slightly differently and use the SPSS software to perform
the multiple regression analysis; an example is shown in Table G3. Since we do not assume
the skill levels to be static, we use learning curves to represent the dynamic nature of the
skill components over the planning horizon. The learning curve equations are obtained from
the multiple expert assessments for each operator.
To validate the regression equations we obtain, we split the randomized data in various
ways and compare model forecasts of the production output with actual output produced
by the system. This is described in detail in Appendix D. In addition, we use the data set to
obtain a learning curve for each skill component for every operator.
The analysis of the data set in Chapter 5 is similar to the latter part of Chapter 3 where
the data set is considered in its entirety and the additional factor of day-of-the-week is
added. The final PHM obtained in Chapter 3, using the entire data set for all three
machines, is used to calculate the probability of failure at each time interval. The learning
curves obtained in Chapter 4 are used to supply the PHM with the appropriate factor values.
The regression equations in Chapter 4 are used to forecast the production output once we
calculate the expected number of operational intervals. But the overall framework of
Chapter 5 is a Markov chain analysis. We discretize the planning horizon and use the PHM
and learning curves at each time interval. At the end of the planning horizon, we use the
regression equations to calculate the expected production output over the entire planning
horizon.
In addition to using the data set for developing the various PH models presented in
Chapter 3, we also use it for model validation. As described in Section 5.3.1, we develop
a PHM based on the data set for the period January 1 to August 15, use the obtained model
to forecast the risk for the period August 18 to October 9, and compare the forecast to
actual results from the data set.
Our usage of the data set as described above is summarized in Table G1:
Chapter/Section | Specific usage | Statistical model | Approach/Framework
3.6.3 - 3.6.6 | Per-machine basis (Ring machine only) | PHM; Logistic regression (Appendix) | Analyzing risk of failure
3.6.7 | Entire data set | PHM; Logistic regression (Appendix) | Analyzing risk of failure
4 | Per-machine basis | Multiple regression; Learning curves | Production output forecasting
5 | Entire data set | PHM; Multiple regression; Learning curves | Markov chain
Table G1: Usage of data set in the empirical works presented throughout the dissertation
Data Set sample: EXAKT usage for analysis in Chapter 3:
Machine Working Age Experience Social Analytical X1 X2 Y1 Y2 V1 V2
Driven 0 42.86 70.83 27.12 0 1 1 0 1 0
Driven 8 100.00 77.08 78.67 0 0 1 0 1 0
Driven 16 42.86 41.15 23.05 1 0 1 0 1 0
Driven 24 42.86 70.83 27.12 0 1 1 0 0 0
Driven 32 100.00 77.08 78.67 0 0 1 0 0 0
Driven 40 42.86 41.15 23.05 1 0 1 0 0 0
Driven 48 42.86 70.83 27.12 0 1 1 0 0 0
Driven 56 100.00 77.08 78.67 0 0 1 0 0 0
Driven 64 42.86 41.15 23.05 1 0 1 0 0 0
Driven 72 100.00 77.08 78.67 0 1 1 0 1 0
Driven 80 42.86 41.15 23.05 0 0 1 0 1 0
Driven 88 42.86 70.83 27.12 1 0 1 0 1 0
Driven 96 100.00 77.08 78.67 0 1 1 0 0 0
Driven 104 42.86 41.15 23.05 0 1 1 0 0 0
Driven 112 42.86 70.83 27.12 1 0 1 0 0 0
Driven 120 100.00 77.08 78.67 0 1 1 0 0 0
Driven 128 42.86 41.15 23.05 0 0 1 0 0 0
Driven 136 42.86 70.83 27.12 1 0 1 0 0 0
Driven 144 100.00 77.08 78.67 0 1 1 0 0 0
Driven 152 42.86 41.15 23.05 0 0 1 0 0 0
Driven 160 42.86 70.83 27.12 1 0 1 0 0 0
Driven 168 100.00 77.08 78.67 0 1 1 0 0 1
Driven 176 42.86 41.15 23.05 0 0 1 0 0 1
Driven 184 42.86 70.83 27.12 1 0 1 0 0 1
Driven 192 42.86 41.15 23.05 0 1 1 0 1 0
Driven 7.5 42.86 70.83 27.12 0 0 1 0 1 0
Driven 15.5 100.00 77.08 78.67 1 0 1 0 1 0
Driven 23.5 42.86 41.15 23.05 0 1 1 0 0 0
Driven 31.5 42.86 70.83 27.12 0 0 1 0 0 0
Driven 39.5 100.00 77.08 78.67 1 0 1 0 0 0
Driven 47.5 42.86 41.15 23.05 0 1 1 0 0 0
Driven 55.5 42.86 70.83 27.12 0 0 1 0 0 0
Driven 63.5 100.00 77.08 78.67 1 0 1 0 0 0
Driven 71.5 42.86 41.15 23.05 0 1 1 0 0 0
Table G2: Data set sample for Chapter 3 analysis, using the EXAKT software
Description of indicator variables: X1, X2: representing three shifts; Y1, Y2: representing three machines; V1,
V2: representing start, middle and end of the week.
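A sketch of how such indicator pairs encode a three-level factor; the specific level-to-(X1, X2) mapping here, with the third level as the reference category, is an assumption for illustration and is not taken from the data set:

```python
def three_level_indicators(level):
    """Encode a three-level factor (e.g. shift 1/2/3) as two binary
    indicator variables, with level 3 as the assumed reference category."""
    assert level in (1, 2, 3)
    return (1 if level == 1 else 0, 1 if level == 2 else 0)
```

Two indicators suffice for three levels because the reference level is identified by both indicators being zero, which avoids the collinearity a third indicator would introduce.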
Data Set sample: SPSS usage for analysis in Chapter 4:
Machine Shift Production Experience Social Analytical X1 X2
DRVN 311 100.00 77.08 78.67 0 1
DRVN 185 42.86 70.83 27.12 1 0
DRVN 286 100.00 77.08 78.67 0 1
DRVN 175 42.86 41.15 23.05 0 0
DRVN 240 57.14 42.19 35.22 0 0
DRVN 307 100.00 75.00 92.33 0 1
DRVN 215 57.14 42.19 35.22 0 0
DRVN 242 57.14 67.71 45.94 1 0
DRVN 258 57.14 67.71 45.94 1 0
DRVN 307 100.00 75.00 92.33 0 1
DRVN 389 100.00 67.71 100.00 0 1
DRVN 278 57.14 40.10 48.87 0 0
DRVN 327 71.43 70.83 63.64 1 0
DRVN 347 71.43 70.83 63.64 0 0
DRVN 319 100.00 67.71 100.00 0 0
DRVN 261 57.14 40.10 48.87 1 0
DRVN 326 100.00 67.71 100.00 0 1
DRVN 265 57.14 40.10 48.87 1 0
DRVN 297 71.43 70.83 63.64 0 1
DRVN 248 42.86 41.15 23.05 1 0
DRVN 282 100.00 77.08 78.67 0 0
DRVN 312 100.00 77.08 78.67 0 0
DRVN 167 42.86 41.15 23.05 1 0
DRVN 229 42.86 70.83 27.12 0 1
DRVN 191 42.86 70.83 27.12 0 1
DRVN 318 100.00 75.00 92.33 0 1
DRVN 216 57.14 42.19 35.22 0 0
DRVN 231 57.14 67.71 45.94 1 0
DRVN 233 57.14 67.71 45.94 1 0
DRVN 204 57.14 42.19 35.22 0 0
DRVN 295 100.00 75.00 92.33 0 1
DRVN 213 57.14 42.19 35.22 0 0
DRVN 243 57.14 42.19 35.22 0 0
DRVN 340 100.00 67.71 100.00 0 0
DRVN 243 57.14 40.10 48.87 0 1
DRVN 296 71.43 70.83 63.64 0 0
DRVN 357 100.00 67.71 100.00 1 0
Table G3: Data set sample for Chapter 4 analysis, using the SPSS software