

System Performance Analysis Considering Human-related Factors

by

Ashkan Corey Kiassat

A thesis submitted in conformity with requirements for the degree of Doctor of Philosophy

Graduate Department of Mechanical and Industrial Engineering

University of Toronto

© Copyright by Ashkan Corey Kiassat, 2013


System Performance Analysis Considering Human-related Factors

Ashkan Corey Kiassat

Doctor of Philosophy

Department of Mechanical and Industrial Engineering

University of Toronto

2013

ABSTRACT

All individuals are unique in their characteristics. As such, their positive and negative

contributions to system performance differ. In any system that is not fully automated, the

effect of the human participants has to be considered when one is interested in the

performance optimization of the system. Humans are intelligent, adaptive, and learn over

time. At the same time, humans are error-prone. Therefore, in situations where human and

hardware have to interact and complement each other, the system faces advantages and

disadvantages from the role the humans play. It is this role and its effect on performance

that is the focus of this dissertation.

When analyzing the role of people, one can focus on providing resources to enable the

human participants to produce more. Alternatively, one can strive to make errors less

frequent and less impactful. The focus of the analysis in this dissertation is the

latter.

Our analysis can be categorized into two parts. In the first part of our analysis, we

consider a short term planning horizon and focus directly on failure risk analysis. What can


be done about the risk stemming from the human participant? Any proactive steps that can

be taken will have the advantage of reducing risk, but will also have a cost associated with

them. We develop a cost-benefit analysis to enable a decision-maker to choose the optimal

course of action for revenue maximization. We proceed to use this model to calculate the

minimum acceptable level of risk, and the associated skill level, to ensure system

profitability. The models developed are applied to a case study that comes from a

manufacturing company in Ontario, Canada.

In the second part of our analysis, we consider a longer planning horizon and are

focused on output maximization. Human learning, and its effect on output, is considered. In

the first model we develop, we use learning curves and production forecasting models to

optimally assign operators, in order to maximize system output. In the second model we

develop, we perform a failure risk analysis in combination with learning curves, to forecast

the total production of operators. Similar to the first part of our analysis, we apply the

output maximization models to the aforementioned case study to better demonstrate the

concepts.


To the one who inspired me to embark on this journey


ACKNOWLEDGEMENTS

The past four years have been a fantastic journey and certainly one of the best periods of

my life. Having been away from school for 11 years, there were many difficulties along the

way, academic and otherwise. But getting to the end is even sweeter because of all the

challenges. I made the decision to start this journey with much hesitation and uncertainty; I

look back and consider my decision to be one of the best things I have done in my life.

Professor Andrew K. S. Jardine, you encouraged me to join the lab and to pursue this

dream, right from the very first email I got from you. Thank you for having me at C-MORE

and the support you have provided along the way. I have very much enjoyed the freedom

you granted me along the way to pursue my own research and teaching interests.

Dr. Dragan Banjevic, your extraordinary support throughout the last four years needs an

enormous acknowledgement. You are amazingly sharp in many subject matters and this,

combined with your good heart and your great willingness to help, makes for the best

support a PhD student can ask for. Your assistance has been absolutely essential along the

way. Thank you for always being there.

My great friend, Dr. Nima Safaei, you are brilliant; thank you for sharing your knowledge

with me. You have guided me and helped me mature in performing research and writing

papers. We have become great friends over the last four years and I value this friendship

very much. I wish for this friendship and our research collaborations to continue for years to

come.


Professor Mark Chignell, I value your wisdom, your big-picture thinking, and the fact that

you have always challenged me to be a better scientist. You have always made time for me,

despite your extremely busy schedule. Your great support has been beyond just serving on

my committee. Thank you for the consulting opportunities and your continuous support in

my search for an academic position. Your kindness is much appreciated.

Professor Ann Armstrong, you have always made sure my research remains people-centric.

In the end, I want the human resources of an organization to make a positive contribution

to the success of the organization. You taught me to think of the people in a positive light,

to ensure my work contributes to the nurturing of the human resources, rather than

thinking of them as a nuisance and a source of error. Thank you for providing this alternate

mindset. You also provided me with my first teaching opportunity. Thank you for the trust.

You helped me find my calling in life.

Dr. Bob Platfoot, my external examiner, thank you for taking an in-depth look at my thesis

and providing great feedback. Your positive criticism and comments to reinforce my

findings and assumptions have strengthened my work.

Professor Roy Kwon, thank you for participating in the oral examination procedure. You

came in at the late stages and made time to become familiar with my work. Your comments

are much appreciated.

To my friends at C-MORE, in particular my new sisters Janet and Lorna, thank you for

helping me keep my sanity. I very much value our friendships and will always remember

your support and acts of kindness.


To my mom, dad, and brother, I feel blessed to have you. Thank you for your support; I am

glad to celebrate this achievement with you and to see your pride. My spirit seeks

challenges and I always aim for the top. I have these values because of you.

To my courageous aunt, Shahraz, so much of what I am and what I have is because of you.

You have always been there for me, no matter what. Thank you for all your kindness

throughout my entire life.

To Nirvana, eshgham, now and always, thank you for being you, thank you for pushing me

to do this, to make the best of myself. You were the inspiration that got me going on this

path. Having found my calling in life, and the subsequent happiness for years to come, is

because of you.


PREFACE

The following papers have resulted from the work discussed in this dissertation:

1. Kiassat, C., Safaei, N., (January 2012). Optimizing Maintenance Policies When Human-

Related Factors Are Included in the Proportional Hazards Model. European Journal of

Operational Research, second review.

2. Kiassat, C., Safaei, N., Banjevic, D. (September 2012). Effects of Operator Learning on

Production Output: A Markov Chain Approach. Journal of the Operational Research

Society, second review.

3. Kiassat, C., Konstandinidou, M., (May 2010). Recognizing Significant Human-Related

Factors Affecting System Reliability. Production & Operation Management Society

(POMS) conference, Vancouver, BC, Canada. Abstract # 015-0109.

4. Centrone, D., Kiassat, C., Garetti, M., Banjevic, D., Jardine, A., (May 2010). Proportional

Hazard Model: A Valuable Methodology for Sustainable Manufacturing. Proceedings of

Maintenance for Sustainable Manufacturing (M4SM) conference, Verona, Italy. 51-56.

5. Kiassat, C., Safaei, N., (September 2009). Integrating Human Reliability Analysis into a

Comprehensive Maintenance Optimization Strategy. Proceedings of World Congress on

Engineering Asset Management (WCEAM), Athens, Greece. 561-566.

6. Kiassat, C., Banjevic, D., Safaei, N. Using the Profitability Threshold Calculated from a

Risk of Failure Model to Determine the Minimum Level of Human-related Factors.

(pending submission, extension of paper under second review at European Journal of

Operational Research).


Contents

ABSTRACT
ACKNOWLEDGEMENTS
PREFACE
LIST OF TABLES
LIST OF FIGURES
ABBREVIATIONS
1. INTRODUCTION
2. QUANTIFICATION OF HUMAN-RELATED FACTORS
2.1. Literature review on quantification techniques
2.2. Usage of Critical Incident Technique
2.2.1. Selection of Experts
2.2.2. Data Collection Tools
2.3. Concluding remarks
3. FAILURE RISK ANALYSIS
3.1. Introduction
3.2. Literature Review
3.3. Evaluating Intervention Methods for Human-Related Risk
3.4. Developing an Evaluation Model for Intervention Methods
3.5. Determining the minimum skill level
3.6. Failure Risk Analysis – An Empirical Study
3.6.1. Quantification of skill and shift work
3.6.2. Analyzing the risk of failure
3.6.3. Odds estimates
3.6.4. Evaluation model for intervention methods
3.6.5. Expanded data set, additional factors
3.6.5.1. Discussion on the model’s terms
3.6.5.2. Procedure for developing the model
3.6.5.3. A more parsimonious PHM
3.6.5.4. Revenue Function and Discussion
3.7. Concluding Remarks and Future Work
4. OPTIMAL OPERATOR ASSIGNMENT
4.1. Literature review
4.2. Model Development
4.3. Optimization Model
4.4. Empirical Study
4.4.1. Predicting Output in Terms of Human-related Factors
4.4.2. Learning Curves
4.4.3. Revenue Model Using Regression Equations and Learning Curves
4.4.4. Optimization Model and Discussion
4.4.5. Optimal Solution Compared to Solutions of Other Methods
4.4.6. Sensitivity Analysis
4.5. Concluding Remarks and Future work
5. EFFECTS OF OPERATOR LEARNING ON PRODUCTION OUTPUT
5.1. Literature Review
5.2. Markov Chain Approach
5.3. Empirical Study
5.3.1. Model Validation
5.4. Concluding Remarks and Future Work
6. CONCLUSION
6.1. Central Ideas and Contributions
6.2. Concluding Remarks and Recap
REFERENCES
APPENDIX A1: Observational Study
APPENDIX A2: System Experts Assessing Operators
APPENDIX A3: Self-assessment Questionnaire
APPENDIX B: Discussion on Logistic Regression as a Validation Tool
APPENDIX C: Discussion on Obtaining the PHM
APPENDIX D: Goodness of Fit of Regression Models for Predicting Output
APPENDIX E: Model Properties
APPENDIX F: Model and Property Generalization with Initial Conditions
APPENDIX G: Data Set Used for Empirical Studies


LIST OF TABLES

3.1: Initial Shift-to-Shift and Operator-to-Operator Differences in Failure Occurrences
3.2: Summary of Estimated Parameters
3.3: Univariate and Bivariate Statistics on Skill Components
3.4: Summary of Estimated Parameters for Ring gear
3.5: Odds Estimates
3.6: Odds Estimates
3.7: Combinations of intervention methods to be considered
3.8: Values of and the corresponding function value
3.9: Value of for the four cases
3.10: original variables contained in covariates
3.11: Summary of Estimated Parameters
3.12: Model Selection Method and Results
3.13: Summary of Estimated Parameters
3.14: Summary of Estimated Parameters
3.15: Variables represented by the covariates in the function
4.1: Regression Equation Coefficients, significant at p < 0.01
4.2: Learning Curves of Operators’ Skill Components
4.3: Operators’ Expected Quarterly Revenues
4.4: Operators’ Skill Components
4.5: Comparing system revenue under various OA approaches
4.6: Operator Assignment for Different Gear Prices
4.7: System Revenue under various OA Scenarios
4.8: System Revenue under different production times
5.1: Regression Equation Coefficients, significant at p < 0.01
5.2: Learning curves of operators’ skill components
5.3: Expected production output for each operator on each machine
5.4: Comparing Alpha’s revenue under various OA approaches
5.5: Comparing obtained revenue under approaches of Chapters 4 and 5
5.6: Comparing model results with actual production volumes
B1: Summary of Estimated Parameters
B2: Summary of Estimated Parameters
C1: PHM parameter estimation, using all variables related to operator experience level
C2: PHM parameter estimation continued, variable “Experience” is eliminated
C3: PHM parameter estimation continued, variable “ExpAna” is eliminated
C4: PHM parameter estimation, final model for the “Experience” group
C5: PHM parameter estimation continued from Table C1, variable “ExpAna” is eliminated
C6: PHM parameter estimation continued, variable “Experience” is eliminated
C7: PHM parameter estimation continued from Table C1, variable “ExpV2” is eliminated
C8: PHM parameter estimation continued, variable “Experience” is eliminated
C9: PHM parameter estimation, final model for the “Social” group
C10: PHM parameter estimation, final model for the “Analytical” group
C11: PHM parameter estimation, final model for the “binary terms” group
C12: PHM parameter estimation, final model, with all four groups combined
D1: Indicators of Multicollinearity, Independence of Errors, and Influential Cases
G1: Usage of data set in the empirical works presented throughout the dissertation
G2: Data set sample for Chapter 3 analysis, using the EXAKT software
G3: Data set sample for Chapter 4 analysis, using the SPSS software


LIST OF FIGURES

1.1: General framework of topics in the dissertation
3.1: General framework of approach discussed in this chapter
3.2: Relationship between function and expected net revenue
3.3: Relationship between and , the left and right side of Eq. (7)
3.4: Relationship between terms of Eq. (7)
3.5: Combinations of variables of interest, impacting expected net revenue
3.6: value of variable of interest to result in profit
3.7: Courses of action under different scenarios
3.8: Optimal Strategy: Stop Machine
3.9: Minimum skill set required to achieve positive net revenue
3.10: Optimal strategy under various conditions
3.11: Minimum skill set required to achieve positive net revenue
4.1: Discussions in chapter 3 focused on failure risk analysis given the machine and operator characteristics and operator assignments to machines
4.2: Discussions in chapter 4 to focus on optimal operator assignment, given machine and operator characteristics
4.3: Quarterly production of two operators, with different projected learning curves
4.4: Learning curve for aggregate analytical skill scores of all operators over all weeks
4.5: Learning curve for aggregate social skill scores of all operators over all weeks
5.1: Typical state space for
D1: Scatter Plot of Standardized Residuals for checking assumption of random errors and homoscedasticity
D2: Normal P-P Plot of Standardized Residuals for checking assumption of normality
E1: Relationship between and
E2: Depiction of a possible history of state where
F1: Relationship between and
F2: A possible depiction of history of state where


ABBREVIATIONS

In order of appearance:

HR: Human-Related

DM: Decision maker

PSF: Performance Shaping Factors

CPC: Common Performance Conditions

THERP: Technique for Human Error Rate Prediction

HRA: Human Reliability Analysis

CREAM: Cognitive Reliability Error Analysis Method

MMI: Man-Machine Interface

CIT: Critical Incident Technique

PHM: Proportional Hazards Model

MR: Machine-Related

MTBF: Mean Time Between Failure

MTTR: Mean Time To Repair

AIC: Akaike’s Information Criterion

OA: Operator Assignment

LP: Linear Programming

VIF: Variance Inflation Factor


1. INTRODUCTION

Very few systems can operate without the need for human

interaction and decision making. This holds regardless of the industry, although the degree of

human involvement may differ from one industry to another. There are advantages and

disadvantages associated with human interaction. Humans are versatile and adaptive;

during non-routine or emergency situations, they can improvise and choose the best course

of action. However, humans can experience boredom, fatigue, distraction, panic, or simply

make the wrong decision. When one is looking for ways to improve the performance of a

human-machine system, serving the needs of the humans, such as training or motivational

programs, can be just as effective as providing superior technology or additional hardware.

Alternatively, when one is analyzing the risk of failure, considering the factors that affect

the rate and extent of human error may be as significant as the failure of various hardware

components.

The effect of humans on the performance of a system can be present in any industry, as

long as a process and a unit of output can be defined. An analyst may consider the role of

nurses and the quality of patient care, the operators on an automotive manufacturing

assembly line, or the drivers of haul trucks in a mining operation. In each case, a skilled and

motivated person, working under the right conditions, is likely to have a positive

contribution to the overall system performance. The opposite is also true: a person who

lacks the necessary skills, is unmotivated, or is working under less-than-ideal working

conditions, may diminish the performance of the system. The discussions in this dissertation


are more focused on the effect of humans on failure risk. As such, we focus on

some of the factors that can lead to human error, contributing to failure risk. We will

restrict the discussions to the role of those humans affecting the final system output, not

the managing decision makers or the support personnel. For instance, in the

aforementioned automotive manufacturing assembly line example, our focus is on the

machine operator, not the production supervisor, scheduler, or the maintenance

tradesman.

The work presented in this dissertation is better suited for industries such as

manufacturing, mining, and possibly certain areas of health care, where human error is

more frequent, but the consequences are not catastrophic. The same cannot be said for

certain aspects of the nuclear industry where human error is quite infrequent but upon

occurrence, the consequences may cost lives or extreme environmental damage. In such

cases, we may not be able to put a dollar value to the consequences and they are to be

avoided at all costs. But in the manufacturing example, where the error leads to the

machine breaking down for an hour, or producing fewer parts over the next shift, being

proactive can be gauged by the amount of savings or additional revenue. A certain amount

of risk is therefore tolerable. Determining the maximum amount worth spending to eliminate

that risk becomes part of a cost-benefit analysis. Given the high-probability, low-consequence

nature of the environments suitable for our work, the cost-benefit analysis has a financial

focus. We aim to maximize the net revenue of the system, rather than adopting a safety or

accident-prevention focus, which would make sense in a low-probability, high-consequence

environment.
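The kind of financial comparison described above can be illustrated with a minimal sketch. It is not the model developed in Chapter 3: the intervention names, probabilities, and dollar figures are hypothetical, and the expected net revenue is assumed to be revenue minus intervention cost minus the probability-weighted cost of a failure.

```python
from dataclasses import dataclass


@dataclass
class Intervention:
    name: str            # hypothetical label for a course of action
    cost: float          # assumed cost of applying the intervention
    failure_prob: float  # assumed probability of failure after intervening


def expected_net_revenue(revenue, failure_cost, option):
    """Revenue minus intervention cost minus the expected cost of a failure."""
    return revenue - option.cost - option.failure_prob * failure_cost


def best_intervention(revenue, failure_cost, options):
    """Return the option with the highest expected net revenue."""
    return max(options, key=lambda o: expected_net_revenue(revenue, failure_cost, o))


def max_worth_of_risk_reduction(failure_cost, p_before, p_after):
    """Most that is worth paying to move the failure probability from p_before to p_after."""
    return (p_before - p_after) * failure_cost


# Hypothetical options and figures, for illustration only.
options = [
    Intervention("do nothing", 0.0, 0.30),
    Intervention("assist operator", 150.0, 0.10),
    Intervention("replace operator", 400.0, 0.05),
]
choice = best_intervention(revenue=2000.0, failure_cost=1000.0, options=options)
print(choice.name)
```

Under these assumed figures, eliminating a 30% failure risk entirely would be worth at most 300, so any intervention costing more than that cannot pay for itself.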


The aforementioned cost-benefit analysis is the focal point of the work we present in

Chapter 3. In Chapter 5, we consider machine production over a planning horizon as a

stochastic process and aim to forecast the production output over the period. In industries

such as the nuclear industry, where human error leading to hardware failure is extremely

improbable or infrequent, a stochastic process is used to predict risk. The system output

over the planning horizon would then be a matter of how many resources we allocate to

the process. We will most likely achieve the output we expect to get. Therefore, the work

we present here is suitable for an industry, such as manufacturing, where human errors

occur frequently, and each error has a non-catastrophic consequence, making the system

appropriately represented by a stochastic process.

In the discussion thus far, we have focused on human-machine systems where human

error is frequent, with non-catastrophic consequences. But our work is not solely about

human error. We can also consider a context where there are varying skill levels among the

human participants and we are interested in how well each individual performs their

assigned tasks. In the aforementioned healthcare example, we may have a group of nurses,

each of whom can provide a certain level of service and attend a number of patients. If skill

levels can be compared and there are some nurses who accomplish more tasks per shift

than others, then we may be able to employ a personnel scheduling technique to utilize less

manpower.

Human error is a broad area of study, and many factors can affect

its frequency and magnitude. The factors we focus on in this dissertation are


expertise and working conditions, as pertaining to fatigue. Both of these factors, among

others, affect human performance. The challenges in considering human performance in

general, and expertise in particular, are twofold. The first difficulty is measuring the Human-

Related (HR) factors, such as expertise. For a variable to be used in a business decision

model, it must be quantifiable. Traditionally, HR factors have been described qualitatively.

But if a factor, such as expertise, results in additional production, or less equipment

downtime, that factor ought to be expressed quantitatively. The conversion of this

traditionally qualitative attribute to a quantitative value is a challenge. Although we do

not make a direct contribution to this quantification process in this dissertation, we

dedicate Chapter 2 to it, as it plays an important role.

The second difficulty is in the use of HR factors in decision models. While a decision

maker, herein referred to as DM, should be familiar with measurements such as uptime,

availability, and throughput, the same cannot be said about the productivity of the human

participant (herein referred to as operator) interacting with the system. Even after an HR

factor is quantified, a DM may not know how to best use the factor in the decision making

process.

Main Contributions:

The most important finding of this dissertation is the significant effect of

human-related factors on the performance of a human-machine system. Performance cannot

be analyzed without considering the factors related to the humans interacting with the

machinery. Chapters 3, 4, and 5 cover the main contributions of this dissertation by


discussing performance models that incorporate HR factors to help a DM choose a path that

optimizes system performance. This optimization may be based on any number of factors,

such as revenue or system availability. Chapters 3, 4, and 5 provide further detail

supporting the following list of contributions:

1. We include HR factors as covariates in a proportional hazards model (Chapter 3).

2. We develop a model to enable the DM to perform a cost-benefit analysis to choose the

revenue-maximizing intervention method for reducing the probability of failure

stemming from the operators (Chapter 3).

3. We introduce a method to determine a threshold for system profitability based on the

probability of failure and the expected net revenue (Chapter 3).

4. We devise a method to calculate the minimum level for the HR factors included in the

proportional hazards model in order to ensure system profitability (Chapter 3).

5. We create a methodology to optimally assign operators to machines based on the

sensitivity of the machine to HR factors as well as the operators’ current and forecasted

characteristics (Chapter 4).

6. We use a Markov chain approach to forecast production output, considering operator

learning (Chapter 5).

A more detailed discussion of the list of contributions is provided in Chapter 6.

The general framework of the work presented in this dissertation appears in Figure 1.1:

Once the main HR factors are identified and quantified, they can be used as variables in

the decision-making process to optimize system performance. The first type of analysis

considered in Chapter 3 has a short-term scope. We develop a model to measure the failure

risk stemming from HR factors and then provide a cost-benefit analysis for choosing

amongst the various intervention methods to reduce risk. Chapters 4 and 5 examine a long

planning horizon. Chapter 4 focuses on the optimal assignment of the human participants

to the tasks for which they are best suited. The goal is to assign the personnel so as to maximize

total revenue. This is done while forecasting the output and learning curves of the various

human participants over the planning horizon. In Chapter 5, we draw on the results of Chapters

3 and 4 to perform an analysis that provides a decision maker with an output estimate for

each human participant over a planning horizon. This analysis has many uses,

such as cost-benefit analysis of training programs, personnel assignment, inventory

management, and work scheduling.

Figure 1.1: General framework of topics in the dissertation

[Figure box labels: Human-related Factors; Quantification; Analysis of failure risk: short term intervention; Optimal operator assignment; Long term planning; Production forecasting: risk of failure and operator learning; Performance of human-machine systems]

To end this introductory chapter, we define several terms which we have used in this

chapter and will continue to use throughout this dissertation. These terms may be generally

understood but need to be defined for our specific context. They are as follows:

Failure: The condition when the machine is unavailable for production due to an

unplanned event. Within the context of failures caused by the operators, failure is the

consequence of any operator-related mistake that takes the machine out of production.

Risk of failure: In many cases, risk is thought of as consequence multiplied by the

probability of occurrence. In our case, however, we simply mean the probability of

failure.

Error: an operator’s incorrect performance of an action required for a particular task.

Intervention: a proactive step taken by a decision maker to reduce the probability of

machine failure.

Operator expertise: 1) the ability to operate the machine; 2) the ability to troubleshoot

when a problem is at the operator level and the presence of mind to ask for

maintenance tradesman’s assistance when the problem needs further technical

expertise.

System: short for human-machine system; consists of human resources as well as

physical assets and hardware, working together to produce the unit of output that is the

objective of the business establishment.

2. QUANTIFICATION OF HUMAN-RELATED FACTORS

Many human-related factors are subjective. A machine is either running or not. But a

person’s ability to perform a task has various levels. In addition, no two people are exactly

the same. However, if we are to use human-related factors in mathematical models for

performance evaluation of systems, we need to turn the qualitative factors into

quantitative ones. Where experts are reliable and readily available, it is quite prevalent in

the literature to find performance evaluation and modeling using expert judgement (Kariuki

and Lowe, 2007; Landy and Trumbo, 1975). However, this is certainly not the only method.

In the next section, we present an overview of some common existing methods.

2.1. Literature review on quantification techniques

Many factors influence human performance in a complex human-machine system

(Cacciabue, 2000); these factors can be both internal and external to the human operator. A

global view of these factors must include a wide range such as:

1) The human characteristics, such as physical, psychological and mental conditions,

stress and fatigue levels;

2) Working environment and the equipment state, such as operational conditions,

design, maintenance, and availability and reliability of equipment;

3) Managerial and organizational factors, including the safety culture and policy,

management commitment, procedures and training, risk assessment, and incident

analysis. Other important factors relate to the cultural environment, such as the national

culture and the societal values.

The field of human reliability analysis originated as a means of performing probabilistic safety

assessments. It was developed as an approach that considers all possible accident

scenarios in order to probabilistically evaluate the overall system safety (Kim and Jung,

2003). To perform a human reliability analysis, an analyst must identify those factors that

are the most relevant and influential in the jobs studied. The labels used for these factors

differ among different approaches; they are most often called Performance Shaping Factors

(PSF), or other close variations, such as Performance Influencing Factors or Performance

Influencing Context (Kim et al., 2004). In some recent methodologies, their name changes

completely to Common Performance Conditions (CPC) or Error Forcing Context. The basic

difference in these methodologies is the way those factors are used: they characterize the

circumstances under which human actions or tasks take place and they appear very early in

the process of quantification in identifying the PSFs which affect the final outcome of the

Human Reliability Analysis.

Considerable effort has been devoted to defining potential PSFs in different approaches. The Technique

for Human Error Rate Prediction (THERP) is one of the most widely used approaches in

Human Reliability Analysis (HRA) (Swain and Guttman, 1983). It was conceived mainly for

the nuclear industry but it has been applied in other industrial contexts as well. The list of

PSFs in THERP is quite exhaustive. However, in the quantification of THERP, and depending

on the application, a very limited number of PSFs is actually used, the most common of

them being the available time, the level of stress, the task type and the level of experience.

Since THERP is the most widely used method and, in many cases, the one that served as the

starting point for many other methods, its categorization is described in detail.

The categorization of PSFs in THERP includes external and internal factors. External PSFs

are characterized as follows:

1. The situational characteristics:

architectural features,

quality of working environment,

work hours and work breaks,

shift rotation and night work,

availability/adequacy of special equipment/tools,

manning parameters,

organizational structure and actions by others,

rewards, recognition and benefits.

2. The task and equipment characteristics:

perceptual requirements,

motor requirements,

control-display relationships,

anticipatory requirements,

interpretation requirements,

decision making, complexity,

frequency and repetitiveness,

task criticality,

long and short term memory requirements,

calculation requirements, feedback,

dynamic versus step by step activities,

team structure,

man-machine interface.

3. The job and task instructions:

written and non-written procedures,

written or oral communications,

cautions and warnings,

work methods,

plant policies.

4. The psychological stressors:

suddenness,

duration of stress,

task speed,

task load,

high jeopardy risk,

threats,

monotonous work,

conflicts,

distractions.

5. The physiological stressors:

duration of stress,

fatigue, pain,

hunger,

temperature extremes,

radiation,

G-force extremes,

oxygen insufficiency,

vibration,

disruption of circadian rhythm.

Internal PSFs are categorized into the following factors: previous training and

experience, state of current practice or skill, personality and intelligence variables,

motivation and attitudes, knowledge of required performance standards, gender

differences, physical condition, influence of family, and group identification.

In addition to the external and internal PSFs, there may be some ergonomic problems

such as poor design and layout of controls and displays, poor labeling of controls and

displays in the control room, inadequate indication of plant status, presentation of

non-essential information, and inadequate labeling outside the control room.

Other Human Reliability methods use frameworks based on the assessment of

interactions and the quantification of their impact on operators’ actions and performance.

A representative methodology of that type is Systematic Human Action Reliability

Procedure, by Hannaman and Spurgin (1984). The objective of this framework is to help in

defining the types of interactions that are important to risk or performance analysis and to

enable the analyst to incorporate them into the system analysis task.

Another family of methods is based on behavioural science and the Rasmussen model

that classifies operators’ behaviour into Skill-, Rule- and Knowledge-based (Rasmussen,

1982). Representative methodologies are the Generic Error Modeling System (Reason, 1987)

and the Systematic Human Error Reduction and Prediction Approach (Embrey, 1986), which

offer a behavioral classification of human errors. These are techniques that are based on a

series of simple questions, usually presented in flowchart form, that help determine

whether behaviour is likely to be skill-, rule-, or knowledge-based.

Of the many methods that have been developed over the years, the one that is best

suited for our type of analysis is the Cognitive Reliability and Error Analysis Method (CREAM)

(Hollnagel, 1998). Unlike first-generation human reliability techniques that focused on

error analysis, CREAM’s focus is on performance prediction. This method can be specifically

tailored to the contextual situation (Stanton et al., 2005).

In the place of PSFs, the CREAM method uses a group of factors, or CPCs, to define sets

of possible error modes and probable error causes. The CPCs provide a comprehensive and

well-structured basis for characterizing the conditions under which the performance is

expected to take place. By using CREAM with its nine CPC families, one can focus more on

the plant-related and people-related factors. The performance prediction process continues

with the use of the taxonomy of Kim and Jung (2003). This work helps the analyst to

recognize important parameters that should be taken into account in each CPC category.

The taxonomy of Kim and Jung allows the analyst to take into account all the possible

performance shaping factors known from the literature and to combine them into a single

and global list.

The CPCs include factors that relate to the person (adequacy of training and experience,

time of the day – Circadian rhythm), to the working context (working conditions, adequacy

of Man-Machine Interface), to the company (adequacy of organization, availability of

procedures and plans), to the specific task (number of simultaneous goals, available time)

and to the team work (crew collaboration quality). The CPCs are briefly presented as

follows:

Adequacy of organization: defines the quality of the roles and responsibilities of

team members, additional support, organization communication systems, safety

management system, instructions and guidelines for externally oriented activities,

role of external agencies.

Working conditions: describes the nature of the physical working conditions such as

ambient lighting, glare on screens, noise from alarms, interruptions from

the task.

Adequacy of Man-Machine Interface (MMI) and operational support: defines the

MMI in general, including the information available on MMI and control panels,

computerized workstations, and operational support provided by decision aids.

Availability of procedures and plans: describes procedures and plans and includes

operating and emergency procedures, familiar patterns of response heuristics, and

routines.

Number of simultaneous goals: enumerates the number of tasks a person is

required to pursue or attend to at the same time (i.e., evaluating the effects of

actions, sampling new information, assessing multiple goals).

Available time: depicts the time available to carry out a task and corresponds to how

well the task execution is aligned with the process dynamics.

Time of day: denotes the time of day (or night) and describes the time at which the

task is carried out, in particular whether or not the person is adjusted to the current

time (circadian rhythm).

Adequacy of training and experience: describes the level and quality of training

provided to operators to teach new technology, or refresh old skills. It also refers to

the level of operational experience.

Crew collaboration quality: describes the quality of the collaboration between

crewmembers, including the overlap between the official and unofficial structure,

the level of trust, and the general social climate among crewmembers.
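
As a rough illustration of how the nine CPCs listed above might be recorded for one assessment, the following sketch stores an expert's judgement of each CPC and counts how many are judged to reduce or improve performance reliability. The three effect labels, the function name, and the example ratings are assumptions made for this illustration; they are not data from our study, and the sketch does not reproduce the full CREAM procedure.

```python
from enum import Enum


class Effect(Enum):
    """Assumed three-level effect of a CPC on performance reliability."""
    IMPROVED = "improved"
    NOT_SIGNIFICANT = "not significant"
    REDUCED = "reduced"


# The nine CPC families as listed in the text.
CPCS = (
    "Adequacy of organization",
    "Working conditions",
    "Adequacy of MMI and operational support",
    "Availability of procedures and plans",
    "Number of simultaneous goals",
    "Available time",
    "Time of day",
    "Adequacy of training and experience",
    "Crew collaboration quality",
)


def tally_effects(assessment):
    """Return (number of CPCs rated REDUCED, number rated IMPROVED)."""
    missing = [cpc for cpc in CPCS if cpc not in assessment]
    if missing:
        raise ValueError(f"unrated CPCs: {missing}")
    reduced = sum(1 for cpc in CPCS if assessment[cpc] is Effect.REDUCED)
    improved = sum(1 for cpc in CPCS if assessment[cpc] is Effect.IMPROVED)
    return reduced, improved


# Hypothetical assessment: everything neutral except three CPCs.
example = {cpc: Effect.NOT_SIGNIFICANT for cpc in CPCS}
example["Available time"] = Effect.REDUCED
example["Time of day"] = Effect.REDUCED
example["Adequacy of training and experience"] = Effect.IMPROVED

counts = tally_effects(example)
```

A tally of this kind characterizes the context as a whole, which is one reason the approach is machine-centric rather than operator-centric.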

2.2. Usage of Critical Incident Technique

Despite the abundance of literature on HRA methods, the scope of their application is

mostly in high-risk systems, such as nuclear power plants (Elmaraghi et al., 2008). In

addition, most HRA methods were originally developed and used for probabilistic safety

assessment of accidents. There is a lack of research that applies HRA methods in assessing

the probability of errors for direct workers in a manufacturing context (Bubb, 2005). This

gap in the manufacturing context, along with the common practice in the realm of

performance measurement to depend on judgmental indices, leads us to rely on expert

knowledge in assessing the performance of operators. It should be noted that the usage of

performance measurement techniques being discussed is for continuous variables, such as

operator skill. There are other human-related factors, such as being on a night shift (or not)

or certain days of the week, that do not require quantification and can simply be

represented by using indicator variables. At this point, we should define the term “expert”

as we use it in this chapter and throughout the rest of this dissertation. Experts are people

within the system who are aware of the aims and objectives of a given job and who see

people perform the job on a frequent basis. They are in a position to accurately judge the

performance of others.

The HRA methods discussed can provide an analyst with the probability of an event, such

as an explosion or a breakdown. This can be equivalent to what we are pursuing: the failure

of a machine. But this is machine-centric, not operator-centric. The method would provide

the probability of the breakdown of the machine, given the context. The operators working

on the machine are considered as part of the context, along with procedures, MMI, and

many other factors previously discussed. An expert measures the characteristics of the

company, such as “working conditions” or “availability of procedures”, to get an estimate

for the probability of error of a typical operator. These methods may not be sufficient for

analyzing the risk of failure of an individual operator.

We are not always interested in the risk of failure. At times, we may wish to forecast the

production output, in which case we would be interested in quantifying how

well the operator can run the machine. We may also be interested in the relative ability of one

operator compared to another. In short, the HRA methods provide us with a value for the

probability of machine failure, whereas we are interested in measuring the ability of the

operator in performing the task.

Lastly, it should be mentioned that, even with the HRA methods, experts are involved.

Experts may be used to develop the standard tables, which lead to the probability figures.

Experts develop anchor points within the categories to be assessed to drive out subjectivity.

And experts are used to perform the actual assessments of system characteristics.

In the discussions throughout this dissertation, we use an approach based on the Critical

Incident Technique (CIT) to quantify the technical and behavioural components of operator-

related factors. CIT consists of a set of procedures that are used to collect observations of

the human behaviour deemed to have a critical significance on the phenomenon being

analyzed. Each observed incident must meet certain pre-defined criteria. The CIT analyst

uses the set of observed incidents to solve practical problems (Davis, 2006).

In our analysis, CIT is an appropriate technique to use because it enables us to quantify

technical and behavioural components of factors related to the operators (Levine et al.,

1980; Levine et al., 1983). For us, a critical incident can be defined as a task that ensures the

success or failure of machine operation. Each feature of the technical and behavioural

component may affect the phenomenon differently. We wish to observe each mode and

need to have procedures in place to standardize how we perform the observations. We

need to have pre-defined criteria for assessing how well the operator performs the tasks

expected of him/her.

CIT has various stages: determining what constitutes a critical incident, making

observations from the participants, identifying the significant issues, decision-making on

possible solutions, and evaluation. In our analysis, we only make use of the first three

stages. We use statistics-based models, such as the Proportional Hazards Model, logistic

regression, and Markov chains, to perform the analysis and propose solutions for the

system, replacing the latter stages of CIT. Flanagan, the founder of CIT, mentions having

expert observers as a mode of collecting data (Butterfield et al., 2005; Flanagan, 1954). He

also mentions supervisors and experts in the field as possible observers. One of the ways

Flanagan advocated for data collection is questionnaires filled out by experts. The tools we

use when applying the CIT are questionnaires, observational studies, and technical tests.

The system experts we use, supervisors and manufacturing engineers, complete

questionnaires on technical and behavioural questions about the operators. System experts

also perform observational studies on the operators. Finally, the operators are also involved

as they complete a test that evaluates their technical knowledge. These tools are discussed

next in more detail.

2.2.1. Selection of Experts

Flanagan states that CIT requires observers who are

aware of the aims and objectives of a given job and who see people perform the job on a

frequent basis (Butterfield et al., 2005). System experts are interviewed about their

observations of the critical requirements of the job.

If the analyst decides to include the operators during the application of the CIT, the role

of the operators should be limited to technical tests. Studies have found that self-ratings

were more lenient than supervisory ratings (Conway and Huffcutt, 1997) or external

observations (Davis et al., 2006). Research using self-reports also runs into the problem of

social desirability, so labelled because questionnaire items may prompt responses that will

present the person in a favourable light. As a result, it may be advisable to limit any

assessment type questions to a factual/technical type only. This limits the tests taken by

the operators to those with right-or-wrong or factual answers and excludes any

behavioural or performance-prediction questions. Furthermore, similar to self-ratings, peer

ratings also appear to be more lenient than supervisory ratings (Borman, 1974). Thus, it

may be inadvisable to ask operators to rate the performance of their colleagues.

Individuals with more knowledge of the particular job’s details have been found to more

validly predict future performance (Wagner and Hoover, 1974). The management level

closest to the operators can be considered to be experts intimately familiar with the

human-machine system.

Since different raters may have different perspectives on performance that influence

their ratings (Borman, 1974), it may be better to use multiple experts, thereby reducing the

problem of same source variance. Problems arise when measures of two or more variables

are collected from the same respondents and the attempt is made to interpret any

correlation among them. This is the well-known problem of common method variance.

Because both measures come from the same source, any defect in that source

contaminates both measures, presumably in the same fashion and in the same direction.

Podsakoff and Organ (1986) discuss a method for dealing with common method variance

by separation of measurement, that is, collecting the data at different times, in

different locations, or through different media. Thus, the data collection reported later in this

document was performed twice within a month, at different times of the shift.

The assessment process begins with consulting the system experts to identify the top HR

factors affecting system performance. We do not seek an all-inclusive list; Parker et al.

(2001) argue that a universal list of factors may be infinite. Instead, we

start with the initial selection made by the system experts and treat it as an open-ended

process in which additional factors may be identified, prioritized, and analyzed for a

progressively more comprehensive analysis. It would nevertheless be possible to identify

categories of variables to be adapted and applied differently according to context.

The top HR factors identified can be checked against the list of “key human-centered

factors affecting worker performance” in the comprehensive review performed by Baines et

al. (2005).

2.2.2 Data Collection Tools

Similar to a study by Glick et al. (1986), we use three tools in our job analysis method.

These tools are: observational studies, questionnaires, and self-assessments. Examples of

these tools, as applied to our empirical study, appear in Appendices A1, A2, and A3,

respectively. We use multiple experts to answer questionnaires on performance and

behaviour prediction. We use a different set of experts to perform observational studies on

operators’ technical skills.

Whether one uses questionnaires, observational studies, or any other data collection

tools, there are a number of guidelines that should be followed in writing the individual

questions/items. Statements in the body of the questionnaires and the self-test are to be

kept simple, as short as possible, and with a language familiar to target respondents.

Keeping the questionnaire short is an effective means of minimizing response biases caused

by boredom or fatigue. We should avoid leading questions, as well as those that are

negatively worded, as they may result in biased responses. Issues related to the

homogeneity of items within each variable of measurement (Harvey et al., 1985), as well as

the number of response categories (Ornstein, 1998; Jenkins and Taber, 1977; Lissitz and

Green, 1975), have to be carefully considered in the design.

When the experts are rating an operator, the forced-choice format is used. A study

by Ornstein (1998) refers to other works and states the closed-form question to be superior

to the open form. In this format, the rater is required to choose, from among a set of

alternative descriptors (normally four items), the subset that is most characteristic of the

ratee.

To provide the same frame of reference for the various assessors, we use anchoring.

Using the forced-choice format for the responses, we provide a numerical value to

complement the subjective statements provided in the response. This reduces the

subjectivity for the respondent when the response may otherwise be open to different

interpretations. Rather than just stating “low production level” as a response category, one

can state “low production level, less than 100 pieces per hour”. Studies have found that

scale reliability improves with increased anchoring (Lam and Klockars, 1982; Bendig, 1952a,

1952b, 1953).
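
To make the idea of anchoring concrete, the sketch below defines a four-option forced-choice item in which each subjective label carries a numerical anchor. Only the first anchor (less than 100 pieces per hour) comes from the example above; the remaining labels, cut-off values, and function names are hypothetical and would have to be set by the system experts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnchoredOption:
    score: int      # numerical value recorded for the response
    label: str      # subjective statement shown to the rater
    lower: float    # inclusive lower bound, pieces per hour
    upper: float    # exclusive upper bound, pieces per hour


# Four response categories; all cut-offs except the first are hypothetical.
OPTIONS = (
    AnchoredOption(1, "low production level", 0.0, 100.0),
    AnchoredOption(2, "moderate production level", 100.0, 150.0),
    AnchoredOption(3, "high production level", 150.0, 200.0),
    AnchoredOption(4, "very high production level", 200.0, float("inf")),
)


def describe(option):
    """Return the response text with its numerical anchor appended."""
    if option.lower == 0.0:
        return f"{option.label}, less than {option.upper:g} pieces per hour"
    if option.upper == float("inf"):
        return f"{option.label}, {option.lower:g} pieces per hour or more"
    return (f"{option.label}, from {option.lower:g} to under "
            f"{option.upper:g} pieces per hour")


def option_for_rate(pieces_per_hour):
    """Return the option whose anchor range contains the observed rate."""
    for option in OPTIONS:
        if option.lower <= pieces_per_hour < option.upper:
            return option
    raise ValueError("rate must be non-negative")
```

With the anchors written into the response text, two experts who observe the same production rate are directed to the same category.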

The test/retest method is used to estimate the reliability (each section of the analysis is

performed twice). For the various sub-sections, namely operator self-tests, expert

questionnaires, and observational studies, we compare the two separate measurements

and compute a correlation coefficient.
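
The test/retest comparison can be sketched as follows, assuming that the correlation in question is the Pearson product-moment correlation between the first and second rounds of scores. The scores shown are hypothetical and serve only to illustrate the calculation.

```python
import math


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    # Sums of cross-products and squared deviations about the means.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical questionnaire scores for five operators in two rounds.
round_1 = [3.0, 4.0, 2.0, 5.0, 4.0]
round_2 = [3.0, 5.0, 2.0, 4.0, 4.0]
reliability = pearson_r(round_1, round_2)
```

A value close to 1 indicates that the two rounds rank the operators consistently; the same function can be applied separately to the self-tests, the expert questionnaires, and the observational studies.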

2.3. Concluding remarks

We describe a general method to quantify HR factors, many of which are often

subjective characteristics. The approach, based on the Critical Incident Technique, is

certainly not the only method available, but it is one that works well in our application of

quantifying the technical and behavioural aspects of human performance when reliable

experts are readily available.

There are many issues at play when using the proposed method, all of which may play a

major role in the results. It is important to select the appropriate subjects as the experts

who will be performing the assessments. The type of tool to use can also play a major role.

Examples of tools introduced in this chapter have been questionnaires and observational

studies. The method of delivery and data collection can be quite important. However, one

of the most important considerations is the design of the items used within the tools, such as

the actual questions (and answers) on the questionnaires. Careful design is necessary for

the wording of the questions, such as avoiding negatively worded or leading questions. The

design of the answers remains important whether we want to ask open-ended questions or

use a closed-answer design to force the experts to select one of the pre-determined

answers. Lastly, there are techniques to use in the design of the tools to reduce common

method bias. Great care should be exercised when devising the method behind the

quantification process.

3. FAILURE RISK ANALYSIS

Consider the scenario where a set of operators with varying personal characteristics

have been assigned to a set of machines. In this chapter, we model changes in the

performance of these machines, given the effect of skill and fatigue, caused by shift work

and different days of the week. We analyze the risk of failure of the machines using the

Proportional Hazards Model (PHM), a common tool in maintenance optimization. The most

common usage of the PHM involves the presence of machine-related factors only. Once we

enhance the usage of this model to include human-related factors as covariates, it would be

natural to wonder whether the model may have any significant managerial impact. In cases

where the human decision maker is responsible for a significant part of the overall risk, we

discuss examples of proactive intervention measures a DM may take to mitigate risk. In

addition, we develop a revenue model that provides a cost-benefit analysis for each

intervention measure considered.

Connection to Previous Chapter

Many HR factors, such as skill, are easily described qualitatively. But if an analyst is to

utilize a mathematical model, such as the PHM, to calculate the risk of failure, the HR factor

needs to be expressed quantitatively. The methods described in Chapter 2 can help the

analyst achieve this quantification. This is a necessary first step before HR factors can be

used in mathematical and statistical models to help the DM improve system performance.

Main Contributions of this Chapter

1. We include HR factors as covariates in a proportional hazards model.

2. We develop a model to enable the DM to perform a cost-benefit analysis to choose the

revenue-maximizing intervention method for reducing the probability of failure

stemming from the operators.

3. We introduce a method to determine a threshold for system profitability based on the

probability of failure and the expected net revenue.

4. We devise a method to calculate the minimum level for the HR factors included in the

proportional hazards model in order to ensure system profitability.

3.1. Introduction

The performance of a human-machine system is a function of the performance of the

hardware, as well as the correct operation of the hardware by the operators. One indicator

of performance is reliability, and to assess reliability, we can perform a failure risk analysis.

There are many reliability and failure risk analysis models that deal with machinery.

However, there are few that incorporate the effect of the human operators on uptime and

overall performance.

The first step in our analysis is to have a model to incorporate both Machine-Related

(MR) and HR factors. This can provide us with an all-encompassing assessment of failure

risk. To achieve this, we use the PHM, a commonly used tool to model the time of failure of

equipment (Jardine et al., 1989; Vlok et al., 2002).
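
As a minimal sketch of how MR and HR factors can enter the same model, the code below evaluates a proportional hazards model in which a baseline hazard is multiplied by an exponential function of the covariates. A Weibull baseline is assumed here, as is common in maintenance applications. The covariate names, the coefficient values, and the shape and scale parameters are hypothetical and are not estimates from our data.

```python
import math


def phm_hazard(t, beta, eta, coefficients, covariates):
    """Hazard rate at working age t under a Weibull-baseline PHM.

    h(t; z) = (beta / eta) * (t / eta) ** (beta - 1) * exp(sum_i gamma_i * z_i)
    """
    if t <= 0 or beta <= 0 or eta <= 0:
        raise ValueError("t, beta and eta must be positive")
    baseline = (beta / eta) * (t / eta) ** (beta - 1)
    linear = sum(coefficients[name] * covariates[name] for name in coefficients)
    return baseline * math.exp(linear)


# Hypothetical coefficients: one MR covariate and two HR covariates.
gamma = {"vibration": 0.8, "operator_skill": -0.5, "night_shift": 0.3}

# The same machine and operator on a day shift and on a night shift;
# night_shift is an indicator variable (0 or 1).
day = {"vibration": 1.2, "operator_skill": 0.7, "night_shift": 0}
night = dict(day, night_shift=1)

h_day = phm_hazard(500.0, 1.5, 1000.0, gamma, day)
h_night = phm_hazard(500.0, 1.5, 1000.0, gamma, night)
```

Because the covariates act multiplicatively, the ratio of the two hazards depends only on the night-shift coefficient and equals exp(0.3) regardless of the working age of the machine.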

Once we evaluate the risk facing the system, we may choose to intervene and reduce or

eliminate it. Any intervention method pursued should have the benefit of reducing failure

risk; it would also have the disadvantage of incurring a direct cost. Given the trade-off

between risk reduction and direct cost, we develop a revenue model to perform a cost-

benefit analysis for choosing the best intervention method. Given the machine factors, such

as working age; operator factors, such as skill level; and the direct cost of the intervention

method as well as its risk reduction factor, the best intervention method is selected as the

one that results in the highest system revenue. In the absence of an analytical method for

choosing among intervention methods, subjectivity and personal biases enter into the

decision-making process, distancing it from being an evidence-based process. Providing a

systematic tool to a DM to choose the optimal intervention method is the main

contribution of this chapter.

In addition to the cost-benefit analysis, the revenue model can serve in determining the

minimum requirements of various factors. The threshold for making a profit can be

determined and a minimum set of values for the factors exceeding this threshold can be

calculated. For example, the minimum skill level for a new operator can be calculated prior

to certifying him/her to operate a machine. Not achieving this minimum skill level may put

the DM in a situation where, in the absence of a better operator, shutting the machine

down would be the best option.

Figure 3.1 summarizes the aims of this chapter, discussed in this introductory section:

[Figure 3.1 flowchart boxes: "Quantify operator skill"; "Enhance PHM that includes HR covariates and interactions between covariates"; "Develop revenue model, using the PHM, to make a trade-off between risk-reduction and cost"; "Compare outcome of various decisions and choose intervention method resulting in highest net revenue"; "Calculate minimum level of HR factors that ensure risk threshold is not exceeded and system can be profitable"]

Figure 3.1: General framework of approach discussed in this chapter

3.2. Literature Review

There has been much research on assessing human reliability and incorporating it into the overall risk analysis. But the literature is sparse when it comes to performance measurement models that incorporate human-related risk assessments. Barroso and Wilson (1999) consider a manufacturing environment and focus on estimating the overall effect of human reliability. However, their approach is not risk-based, but rather focuses on identifying sources of human error and reducing them. Horberry et al. (2010) discuss human factors and their effects on operations and maintenance in a mining context but do not attempt failure prediction. Similarly, Kolarik et al. (2004) develop a model to monitor and predict an operator’s performance using a fuzzy logic-based assessment. But the purpose of their work is solely to provide a human reliability assessment, without providing any methods for risk reduction. Blanks (2007) discusses the need for improving reliability prediction, paying special attention to human error causes and prevention, but does not mention any predictive techniques for human reliability.

Zimolong and Trimpop (1994), and Dhillon and Liu (2006) focus on the maintenance

workforce performing repair work at times when machines are not being used for

production purposes. Reer (1994) discusses human reliability in emergency situations. Our

discussion focuses on the production workforce during the operation of the machines. Our

emphasis is not on decreasing the Mean Time to Repair but on improving the Mean Time to

Failure. A further distinguishing feature of our work is its proposal for managerial

interventions, or proactive measures, to deal with the risk stemming from the operators in

the human-machine system. Our approach is novel, and we have not found any similar previously published work in the literature. Vrignat et al. (2012) discuss an approach in which they draw observations from the process to generate an availability indicator that a DM can use to plan actions dynamically. The authors also mention the PHM as a tool. Our work also helps the DM to plan actions dynamically. A major difference is that, in our case, the observations from the process include HR factors. Castanier et al. (2003) discuss a continuously deteriorating machine where each maintenance operation makes sense at various stages. The DM can choose to run as normal, perform preventive repairs, or perform a preventive replacement. Each has the benefit of improving the system by a certain amount and each has the cost of taking the system out of production for a certain duration. There is a certain element of cost-benefit analysis in this work that is similar to ours, but there is no mention of human-related factors. Neither Vrignat et al. (2012) nor Castanier et al. (2003) focuses more deeply on the specific causes of degradation. All causes leading to machine degradation are combined together. The cost-benefit analysis is not related to the specific risk culprits.

There are approaches, such as that of Stewart and Grout (2001), with a focus on error-

proofing techniques and physical devices to prevent human mistakes. Burkolter et al. (2009)

propose personnel selection criteria to minimize risk during the operation. Karaulova and

Pribytkova (2009) acknowledge the human role within the overall reliability analysis but

simply make generalized comments, such as better ergonomic design or improved human-

machine interface, as means of risk reduction. Finally, while Blumenfeld and Inman (2009)

note the impact of inferior operator skill on quality and performance, their scope is limited

to the comparison of systems with or without production management devices. Our scope

is different from all of these in that we propose intervention methods to deal with the HR

risk in the short-term planning horizon. In addition, we supply the DM with a cost-benefit

analysis of intervention methods. There is no other work in the literature that enables the

DM to choose the optimal course of action in maximizing revenue by minimizing failure risk

stemming from the operators in a human-machine system.

3.3. Evaluating Intervention Methods for Human-Related Risk

The PHM is a common tool for failure risk analysis and this is especially true when the

PHM is parametrized using the Weibull baseline (Jardine and Buzacott, 1985). One of the

reasons for the frequent use of the Weibull form of the PHM in this context is the versatility

of the hazard function obtained by varying the scale and the shape parameters (Jardine et

al., 1989). The PHM relates the time of an event, such as failure, to a number of explanatory

variables known as covariates (Lugtigheid, 2004). Several factors, including the equipment

age or specific system characteristics, may influence the equipment’s hazard rate, which is

the rate of transition out of a non-failed state to a failed state.

The hazard rate can be affected by factors specific to the machine, the environment, or

the human operators within the human-machine system. Traditionally, the PHM has been

used with quantifiable MR covariates. However, this usage can be expanded by including

non-MR type factors (Centrone et al., 2010; Kiassat and Safaei, 2009).

The general form of the Weibull PHM is defined as follows:

$$h(t) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta-1} \exp\left(\sum_{i} \gamma_i z_i(t)\right). \qquad (1)$$

The hazard rate, h(t), is proportional to the (instantaneous) conditional probability of failure at time t. The first part of the equation, called the baseline hazard function, is sensitive to the age of the equipment. It contains two parameters, β and η, the shape parameter and the scale parameter, respectively. In the absence of covariates, the scale parameter is the characteristic life, that is, the age by which approximately 63% of units have failed. When the model includes covariates, η becomes a balancing figure.

The second part includes the explanatory variables, $z_i(t)$, which influence the hazard rate. Each explanatory variable, also called a covariate, represents a monitored condition datum at the time of inspection, t, such as parts per million of iron in the oil sample taken on a particular day, or the skill score of an operator as measured on a certain date. The coefficient $\gamma_i$ of covariate $z_i(t)$ determines the covariate’s degree of influence on the overall hazard rate. The fact that the model accommodates the inclusion of HR covariates, in addition to MR factors, makes the PHM an excellent model for our analysis.
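As a minimal sketch of the Weibull PHM described above, the following Python function evaluates the hazard rate as the Weibull baseline multiplied by the exponential of the weighted covariates. The function name, argument names, and all numerical values are hypothetical and serve only as an illustration; they are not estimates from the case study.

```python
import math

def weibull_phm_hazard(t, beta, eta, coeffs, covariates):
    """Hazard rate: Weibull baseline times exp(sum of coefficient * covariate)."""
    # Baseline part: depends only on the working age t, shape beta, and scale eta.
    baseline = (beta / eta) * (t / eta) ** (beta - 1)
    # Covariate part: weighted sum of the monitored MR and HR covariates.
    risk_score = sum(g * z for g, z in zip(coeffs, covariates))
    return baseline * math.exp(risk_score)

# Hypothetical values: age of 100 hours, shape 2, scale 200 hours.
h_no_cov = weibull_phm_hazard(100.0, 2.0, 200.0, coeffs=[], covariates=[])
# Two hypothetical covariates: one raises the risk, one lowers it.
h_with_cov = weibull_phm_hazard(100.0, 2.0, 200.0,
                                coeffs=[0.5, -0.3], covariates=[1.0, 2.0])
```

With an empty covariate list the function reduces to the baseline hazard, which is a convenient sanity check when fitting or inspecting a model.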

We develop a PHM that includes HR factors as covariates so that we can calculate their

effect on failure risk. This can aid us in the calculation of expected uptime as well as the

probability of failure. Both are essential in developing the revenue model that determines

the optimal intervention method for mitigating risk stemming from the operators.

The PHM’s general form in Eq. (1) can be expressed differently (Eq. 1a). The PHM

covariates have a general form and can be the original variables of a measurement, or a

function of the original variables, including their interactions.

, (1a)

where

;

represents the vector of the original variables and

represents the covariate coefficients. The vector dimension, n, may be different from the

number of covariates, m, because there may be additional interaction terms. In some cases,

a covariate may represent an original variable on its own, such as

. In other cases, a covariate may represent an interaction term, such as

.
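The distinction between original variables and covariates can be sketched as follows. The mapping below is a hypothetical example with three original variables (n = 3) and four covariates (m = 4), one of which is an interaction term; the coefficients are illustrative values, not fitted ones.

```python
def build_covariates(x):
    """Map hypothetical original variables (x1, x2, x3) to four covariates."""
    x1, x2, x3 = x
    # The first three covariates are original variables on their own;
    # the fourth is an interaction between x1 and x2 (an assumed example).
    return [x1, x2, x3, x1 * x2]

def risk_score(coeffs, covariates):
    """Weighted sum of covariates, i.e. the exponent of the covariate part."""
    return sum(g * z for g, z in zip(coeffs, covariates))

original = (2.0, 3.0, 0.5)            # hypothetical measurements
covariates = build_covariates(original)
psi = risk_score([0.1, -0.2, 0.4, 0.05], covariates)
```

Here the number of covariates exceeds the number of original variables only because of the added interaction term.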

3.4. Developing an Evaluation Model for Intervention Methods

We can now proceed to develop a general mathematical model, showing expected risks,

costs, and revenues for the next planning horizon, and express it as a revenue function. It

can be maximized to provide the optimal course of action, resulting in maximum

profitability of the system. The PHM discussed in Section 3.3, and in particular, its covariate

portion, will be a fundamental part of the revenue function to be developed.

There are a few boundary conditions for our approach. We have a number of people

operating several machines; examples are operators of machines in a manufacturing

environment or drivers of haul trucks in a mining operation. The machines are not 100%

automated and random failures may stem from mechanical/electrical issues as well as

human error. A machine’s probability of failure due to MR and HR factors can be measured,

using the PHM or other appropriate failure analysis methods. Failure cause and downtime

duration can be captured, as well as people’s specific involvement with each machine.

Following the failures caused by the operators, repairs are expected to bring the machine

back to an as-good-as-new state (Gasmi et al., 2003). This assumption serves the purposes

of this chapter and is made to simplify the discussions. Furthermore, there is a DM in the

system who can intervene by using our proposed method to find the optimal trade-off

between risk and revenue.

The general form of the net revenue function associated with the proposed method is as

follows:

= Revenue–Cost

= – – ,

where

: net revenue over time interval [t, t+s] where s represents the length of interval and

represents machine age since the most recent repair. is the

length of the planning horizon.

: revenue scheme, per unit time, specific to intervention method

p : number of intervention methods

:

T: random variable representing failure time, measured from the most recent repair

=

: repair and maintenance cost associated with failure

=

: cost function of intervention method

The net revenue function depends on three factors: (1) the machine’s survival up to the

beginning of the planning horizon; (2) the set of PHM covariates, Z; and (3) the matrix of

decision variables, A, resulting from the available intervention methods. The dependency

on these three factors will be explained next.

The various decision variables affect one or more covariates. Their effects can be

expressed as the following matrix:

,

where the entry $a_{ij}$ is the effect of the ith decision variable on the jth covariate. The number of decision variables depends on the number of intervention methods available to the DM and is independent of the number of original variables.

We assume that the overall effect of different decision variables on each covariate is additive. For each covariate, $z_j$, the cumulative effect of the various decision variables is represented by $\sum_{i} a_{ij}$.

The function, ψ, is an update to the covariate part of Eq. (1a). It is a function of the

covariates and the decision variables and takes the following general form:

, (2)

where

is the type of decision variable that affects the hazard rate as a whole, not any individual

covariate(s) directly.
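The additive treatment of decision variables can be illustrated with a short sketch. It assumes, as one plausible reading of the description above, that the cumulative effect of the decision variables is added to each covariate before weighting, and that a separate term shifts the risk score as a whole. The exact form of Eq. (2) is not reproduced here, so the function and all values are hypothetical.

```python
def adjusted_risk_score(coeffs, covariates, effects, overall_effect=0.0):
    """Risk score after applying decision variables (assumed additive form).

    effects[i][j] is the assumed effect of decision variable i on covariate j;
    overall_effect acts on the risk score as a whole, not on any covariate.
    """
    m = len(covariates)
    # Column sums: cumulative effect of all decision variables on each covariate.
    cumulative = [sum(row[j] for row in effects) for j in range(m)]
    adjusted = [z + d for z, d in zip(covariates, cumulative)]
    return sum(g * z for g, z in zip(coeffs, adjusted)) + overall_effect

coeffs = [0.5, -0.4]                  # hypothetical covariate coefficients
covariates = [1.0, 2.0]               # hypothetical covariate values
effects = [[0.0, 0.5],                # decision 1 raises covariate 2
           [-0.2, 0.0]]               # decision 2 lowers covariate 1

psi_before = adjusted_risk_score(coeffs, covariates, [[0.0, 0.0]])
psi_after = adjusted_risk_score(coeffs, covariates, effects)
psi_after_overall = adjusted_risk_score(coeffs, covariates, effects, -0.1)
```

A lower risk score after the interventions corresponds to a lower hazard rate at every age.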

After replacing the covariate part of Eq. (1a) with the extended form of ψ, the final

form of Eq. (1a) is as follows:

. (3)

We can now calculate the conditional expected value of the net revenue function:

E

,

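In general, the expected value of the failure time is calculated as $E[T] = \int_0^\infty x f(x)\,dx$, where f is the probability density function of T, but in the case of $\min(T, a)$, the value has an upper bound of a. Therefore, $E[\min(T, a)]$ can be considered as two parts, a continuous part when $T < a$, giving us $\int_0^a x f(x)\,dx$, and a discrete part when $T \ge a$, giving us $a\,P(T \ge a)$. As a result, $E[\min(T, a)] = \int_0^a x f(x)\,dx + a\,P(T \ge a)$. By applying integration by parts to the previous formula, we arrive at $E[\min(T, a)] = \int_0^a P(T > x)\,dx$. We apply this concept to the uptime within the planning horizon to express the previous equation as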

the following:

E

, (4)

where is the conditional reliability:

.

The variable u is introduced to represent the time of failure and to emphasize that is a

function of the upper bound of the interval .
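The integration-by-parts step used for the expected uptime can be checked numerically. The sketch below assumes a plain Weibull failure time without covariates and hypothetical parameter values; it compares the two-part expression for the expected value of min(T, a) with the integral of the survival function over [0, a].

```python
import math

def simpson(func, lo, hi, n=2000):
    """Composite Simpson's rule with an even number of subintervals n."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3

BETA, ETA, A = 2.0, 100.0, 80.0       # hypothetical shape, scale, upper bound

def survival(x):
    """P(T > x) for a Weibull failure time."""
    return math.exp(-((x / ETA) ** BETA))

def density(x):
    """Weibull probability density function."""
    return (BETA / ETA) * (x / ETA) ** (BETA - 1) * survival(x)

# Continuous part plus discrete part.
two_part = simpson(lambda x: x * density(x), 0.0, A) + A * survival(A)
# Result of integration by parts.
by_parts = simpson(survival, 0.0, A)
```

The two values agree up to numerical integration error, and both are below the upper bound a, as expected.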

We make a simplifying assumption here to help us with the calculations. Each covariate changes over time, but since we consider short intervals, we can treat it as a step function that is constant over the short interval [t, t+s]. The result of this assumption is the

following:

, .

For the baseline cumulative hazard,

, and

, and

, and given and above, Eq. (4) is expressed as

follows:

, (5)

The expected uptime and the probability of failure are functions of Z and A.
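As an illustration of these two quantities, the sketch below computes the expected uptime and the probability of failure over [t, t+s]. It assumes the Weibull baseline cumulative hazard (age divided by the scale parameter, raised to the shape parameter) and a risk score psi held constant over the interval, in line with the step-function assumption. All names and values are hypothetical.

```python
import math

def conditional_reliability(u, t, beta, eta, psi):
    """P(T > u | T > t) with the risk score psi constant over [t, u]."""
    cumulative = (u / eta) ** beta - (t / eta) ** beta
    return math.exp(-math.exp(psi) * cumulative)

def expected_uptime(t, s, beta, eta, psi, n=1000):
    """Integral of the conditional reliability over [t, t+s] (Simpson, n even)."""
    h = s / n
    total = conditional_reliability(t, t, beta, eta, psi)
    total += conditional_reliability(t + s, t, beta, eta, psi)
    for k in range(1, n):
        weight = 4 if k % 2 else 2
        total += weight * conditional_reliability(t + k * h, t, beta, eta, psi)
    return total * h / 3

def failure_probability(t, s, beta, eta, psi):
    """Probability of failing within [t, t+s], given survival up to t."""
    return 1.0 - conditional_reliability(t + s, t, beta, eta, psi)

# Hypothetical machine: age 50 h, horizon 40 h, shape 2, scale 200 h.
uptime_low_risk = expected_uptime(50.0, 40.0, 2.0, 200.0, psi=0.0)
uptime_high_risk = expected_uptime(50.0, 40.0, 2.0, 200.0, psi=1.0)
p_low = failure_probability(50.0, 40.0, 2.0, 200.0, psi=0.0)
p_high = failure_probability(50.0, 40.0, 2.0, 200.0, psi=1.0)
```

A higher risk score shortens the expected uptime and raises the probability of failure, which is the mechanism through which the covariates and the decision variables enter the revenue calculation.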

Eq. (5) is the expected net revenue function and includes the revenue scheme, as well as the cost, associated with each intervention method. A DM can use this function to perform a cost-benefit analysis and choose the intervention method that results in the highest net revenue for the system.

It should be noted that in cases where the PHM only uses MR factors, we can calculate

an optimal threshold for the hazard rate based on the expected cost of preventive

replacement, expected cost of failure replacement, and expected cycle length (Vlok et al.,

2002). When the hazard rate exceeds this pre-determined threshold, it alerts the

maintenance decision-maker to take proactive steps, as there may be a high probability of a

functional failure. The nature of these proactive steps depends on whether the risk is

related to the machine hardware or the person operating it. In the decision model we

propose in this chapter, we do not use a threshold. We use a cost-benefit analysis that includes a “do nothing” decision, among others. If the DM’s intervention is beneficial to the system, it gets implemented. If the risk is low and insignificant, the “do nothing” approach is warranted.
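The comparison of decisions, including the “do nothing” option, can be sketched as follows. The sketch assumes that the expected net revenue equals the revenue rate times the expected uptime, minus the failure cost times the probability of failure, minus the direct cost of the intervention. The intervention names, their effects on the risk score, and all monetary values are hypothetical.

```python
import math

def expected_net_revenue(t, s, beta, eta, psi,
                         revenue_rate, failure_cost, direct_cost, n=1000):
    """Assumed form: revenue from uptime minus failure and intervention costs."""
    def rel(u):
        # Conditional reliability with a constant risk score over [t, t+s].
        return math.exp(-math.exp(psi) * ((u / eta) ** beta - (t / eta) ** beta))
    h = s / n
    total = rel(t) + rel(t + s)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * rel(t + k * h)
    uptime = total * h / 3                    # Simpson's rule
    p_fail = 1.0 - rel(t + s)
    return revenue_rate * uptime - failure_cost * p_fail - direct_cost

base_psi = 0.8                                # hypothetical current risk score
options = {                                   # hypothetical interventions
    "do nothing":         {"psi_change": 0.0,  "direct_cost": 0.0},
    "refresher training": {"psi_change": -0.6, "direct_cost": 400.0},
    "pair with expert":   {"psi_change": -1.0, "direct_cost": 1500.0},
}

results = {
    name: expected_net_revenue(50.0, 40.0, 2.0, 200.0,
                               base_psi + opt["psi_change"],
                               100.0, 3000.0, opt["direct_cost"])
    for name, opt in options.items()
}
best = max(results, key=results.get)          # decision with highest net revenue
```

Because “do nothing” is one of the candidates, an intervention is selected only when its risk reduction outweighs its direct cost.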

3.5. Determining the Minimum Skill Level

Working with Eq. (5), we can determine the minimum skill levels necessary for the

operators if we are to avoid planning horizons with non-positive net revenues.

From Eq. (5), we can see that when the risk increases, the expected net revenue decreases. More risk results in a greater number of failures, in turn leading to less net revenue. There should be a threshold risk level above which we will no longer be profitable (Figure 3.2). Therefore, if the risk exceeds this threshold, the expected net revenue is not positive. The first challenge is to find this threshold.

Figure 3.2: Relationship between function and expected net revenue

First challenge: finding the profit threshold

We consider the case , and set Eq. (5) to zero to calculate . Since we are

considering the expected profit for a fixed set of decision variables, we denote the revenue

and the direct cost functions as and , respectively. We use the original definitions of

and for the baseline cumulative hazard.

, (6)

This analysis can take two different scenarios. We will first consider the scenario where

there are no direct costs for decision variables, . The non-zero scenario will follow.

, (7)

Let

, and

.

One can think of as the normalized benefit of an initiative. Similarly, one can

think of as the normalized cost of an initiative.

The term can be treated as a variable; it is the risk value for a given scenario. Other

terms, t, s, CF, and are also constants.

The function starts at s, when , that is , and has an asymptotic

lower bound at zero when , that is . Negative values of are not

considered as the function represents risk. is a monotonically decreasing

function when increases.

starts at zero, when , that is , and has an asymptotic upper

bound at

as , that is

. It is a monotonically increasing function

when increases.

The monotone nature, and the fact that the lower bound of one function is below the

upper bound of the other, ensure a unique solution for . Figure 3.3 shows the

relationship between and . Values less than result in positive net

revenue.

Figure 3.3: Relationship between and , the left and right side of Eq. (7)
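The behaviour of the two sides of Eq. (7) can be illustrated numerically. The sketch below assumes that, with no direct cost, the left side is the expected uptime and the right side is the probability of failure scaled by the ratio of failure cost to revenue rate, both written as functions of the risk term w (the exponential of the risk score). These forms and all values are assumptions made for illustration.

```python
import math

T_AGE, HORIZON = 50.0, 40.0           # hypothetical age and planning horizon
BETA, ETA = 2.0, 200.0                # hypothetical shape and scale
REVENUE_RATE, FAILURE_COST = 100.0, 3000.0

def delta_h(u):
    """Increase in baseline cumulative hazard between T_AGE and u."""
    return (u / ETA) ** BETA - (T_AGE / ETA) ** BETA

def uptime_side(w, n=1000):
    """Expected uptime over the horizon for risk term w (Simpson's rule)."""
    h = HORIZON / n
    total = 1.0 + math.exp(-w * delta_h(T_AGE + HORIZON))
    for k in range(1, n):
        total += (4 if k % 2 else 2) * math.exp(-w * delta_h(T_AGE + k * h))
    return total * h / 3

def cost_side(w):
    """Failure cost per unit revenue rate times the probability of failure."""
    p_fail = 1.0 - math.exp(-w * delta_h(T_AGE + HORIZON))
    return (FAILURE_COST / REVENUE_RATE) * p_fail

grid = [0.0, 0.5, 1.0, 2.0, 5.0, 10.0, 50.0]
left = [uptime_side(w) for w in grid]
right = [cost_side(w) for w in grid]
```

On this grid the first function falls from s toward zero while the second rises from zero toward its upper bound, so the two curves cross exactly once.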

We now consider the scenario of having direct costs associated with decision variables,

.

The function is unchanged from scenario 1. The function , when , is

equal to

. This leads to two cases, depending on the relative size of the two terms. As

,

. This term does not play a role in finding a unique solution for .

o Case 1:

. This results in no solution for as shown in Figure 3.4a. We can

interpret this as a very large direct cost of intervention methods, one that

overshadows the benefits. As these intervention methods will be very costly, they will

not be adopted by the DM.

o Case 2:

. This ensures a unique solution for , as displayed in Figure 3.4b.

[Figure 3.3 axes: horizontal, value of the risk term; vertical, function value]

Figure 3.4: Relationship between terms of Eq. (7). Panel 3.4a, Case 1: no solution exists for the risk threshold. Panel 3.4b, Case 2: a unique solution exists for the risk threshold.
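The two cases can be sketched with a simple bisection search for the risk term at which the expected net revenue is zero. The sketch assumes the same form of the expected net revenue as in the discussion of Eq. (5), namely revenue rate times expected uptime, minus failure cost times probability of failure, minus direct cost, with w denoting the exponential of the risk score. All names and values are hypothetical.

```python
import math

def net_revenue(w, t, s, beta, eta, revenue_rate, failure_cost, direct_cost, n=400):
    """Assumed expected net revenue as a function of the risk term w."""
    def rel(u):
        return math.exp(-w * ((u / eta) ** beta - (t / eta) ** beta))
    h = s / n
    total = rel(t) + rel(t + s)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * rel(t + k * h)
    uptime = total * h / 3
    return revenue_rate * uptime - failure_cost * (1.0 - rel(t + s)) - direct_cost

def risk_threshold(t, s, beta, eta, revenue_rate, failure_cost, direct_cost=0.0):
    """Return the break-even risk term, or None when no solution exists."""
    if failure_cost <= 0.0 and direct_cost <= 0.0:
        raise ValueError("net revenue never reaches zero without any cost")
    if direct_cost >= revenue_rate * s:
        return None                   # Case 1: direct cost exceeds any revenue
    args = (t, s, beta, eta, revenue_rate, failure_cost, direct_cost)
    lo, hi = 0.0, 1.0
    while net_revenue(hi, *args) > 0.0:
        hi *= 2.0                     # expand until the revenue turns negative
    for _ in range(100):              # Case 2: bisection on a decreasing function
        mid = (lo + hi) / 2.0
        if net_revenue(mid, *args) > 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

w_free = risk_threshold(50.0, 40.0, 2.0, 200.0, 100.0, 3000.0)
w_paid = risk_threshold(50.0, 40.0, 2.0, 200.0, 100.0, 3000.0, direct_cost=500.0)
w_none = risk_threshold(50.0, 40.0, 2.0, 200.0, 100.0, 3000.0, direct_cost=5000.0)
psi_threshold = math.log(w_free)      # threshold expressed as a risk score
```

In this example a direct cost lowers the tolerable risk, and a direct cost larger than the maximum possible revenue over the horizon leaves no profitable risk level at all.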

Second challenge: finding the minimum skill level

Thus far, we have determined the threshold for the expected net revenue function to be

positive. Going a step further, we now determine the minimum skill level to ensure we do

not exceed that threshold and that the system is profitable.

If , we need to have , or

(8)

It should be noted that even though is related to the decision variables, it does not, by

definition, directly impact any variables or covariates. Since we are interested in solving

inequality 8 for a specific variable, such as analytical skill, we keep on the right side.

Consider an example where , and we are only interested in variables

x2 and x3. Let us also assume there is a total of four significant covariates in the PHM and

that only the first three contain our variables of interest, and : , ,

and , so inequality 8 reduces to the following:

If we express the above inequality as , then Figure 3.5 shows the

influence of and on expected net revenue. Points falling on the line represent zero

expected profit.

Figure 3.5: Combinations of variables of interest, impacting expected net revenue

If we assume and to be the type of variables whose improvement affects risk

negatively, for example, Analytical skill, then and are both negative. This will result in

the swapping of the positive and negative regions in Figure 3.5.

If we are interested in only one variable, the above diagram changes from a region of

interest to a single value (Figure 3.6). Consider the above example, but let us assume we are

only interested in x2.

If we express this inequality as , then if ,

. If ,

.

[Figure 3.5 plot: axes x2 and x3; regions labelled "Positive profit" and "Negative profit"]

Figure 3.6: Value of variable of interest to result in profit

Therefore, whether we are interested in a single variable, or several variables, we can

use the mathematical model we have developed to determine a minimum level for the

factor(s) of interest to ensure the system would always be in a profitable state.
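For the single-variable case, the sign handling can be sketched as follows. The sketch assumes the inequality has already been reduced to a coefficient times the variable of interest being no greater than a constant, where the constant collects the risk threshold and the contributions of all other terms. The function name and the numbers are hypothetical.

```python
def variable_bound(coeff, rhs):
    """Solve coeff * x <= rhs for x and report the direction of the bound."""
    if coeff == 0:
        raise ValueError("the variable does not influence the risk score")
    limit = rhs / coeff
    # Dividing by a negative coefficient reverses the inequality.
    return ("at most", limit) if coeff > 0 else ("at least", limit)

psi_threshold = -0.5                  # hypothetical break-even risk score
other_terms = 1.0                     # hypothetical contribution of other terms
rhs = psi_threshold - other_terms

# A skill-type variable: a higher value lowers risk, so its coefficient is negative.
kind, level = variable_bound(-0.5, rhs)
```

With a negative coefficient the result is a minimum level, which matches the interpretation of a minimum skill requirement for the operator.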

3.6. Failure Risk Analysis – An Empirical Study

The facts of the case study as they pertain to the quantification process are discussed

next. There is an automotive manufacturing company in Ontario, Canada, which has been in

business for at least the last six decades. This manufacturing plant has traditionally

produced either automobile engines or components for engines or transmissions. As a

result, the hourly workforce, hereafter referred to as operators, have either worked on the

engine assembly lines or have been involved with machining processes. The work on the

engine assembly line is manual, fast, and repetitive. Once the operator has mastered the

sequence of tasks, there is very little cognitive work or decision making involved. In

contrast, machining work is slower and much more cognitive. The majority of the operator’s

tasks involve periodically gauging the product at various stages of the process and making

offsets to the machines accordingly. The operator interfaces with the machine where tasks

such as calculating the amount of offset and entering the value can be highly cognitive.

Within this manufacturing plant, a new department has started its operations. We shall

refer to this department as Alpha and it will be the focal point of our analysis. This

department produces four types of precision gears, Driven, Drive, Pinion, and Ring, on four

independent machining lines. The flow of one machining line does not affect the other

three. The products are shipped to another plant to be assembled into vehicle

transmissions. The four gears produced at Alpha are among the most complicated gears of

the entire transmission; the manufacturing tolerances are extremely tight. To date, the

manufacturing plant’s bulk of activities have been in engine manufacturing; it has had very

little experience with gear manufacturing. Therefore, most operators have not had any

exposure to the general area of gear manufacturing in their previous years of experience

with the company.

All the machines in Alpha are new and are recent purchases. The focal point of our

analysis is one type of machine at Alpha, which we shall refer to as Kappa. There are four

Kappa machines, one for each product line. The Kappa machines are used to grind and hone

the surface of the gear teeth to the right dimensions, with extremely tight tolerances. The

four Kappa machines are almost identical; the only difference is the external tooling for the

gear they produce. These machines are far more complex than all other machines used in

the department. This complexity, along with the nearly 100% utilization rate of the Kappa

machines, translates to a much higher occurrence of breakdowns, compared to the other

types of machines used in Alpha. The Mean Time Between Failures (MTBF) of Kappa

machines is low, and their Mean Time To Repair (MTTR) is high. For example, for the Ring

machine, the MTBF and the MTTR are 132.3 and 7.3 hours, respectively. Given the fact that

the normal production week consists of 120 hours, with the Kappa machine running 24 hours per day Monday through Friday, and that each shift is 8 hours, the breakdown figures are almost as bad as losing one shift each week.
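The arithmetic behind this comparison can be sketched as follows, using the Ring machine figures quoted above. The sketch assumes that the MTBF measures uptime between failures, so that one failure cycle lasts MTBF plus MTTR; under that assumption the expected downtime is roughly six hours per 120-hour week.

```python
mtbf_hours = 132.3        # Ring machine, from the text
mttr_hours = 7.3          # Ring machine, from the text
week_hours = 120.0        # five days at 24 hours per day
shift_hours = 8.0

# Assumed cycle: one period of uptime followed by one repair.
availability = mtbf_hours / (mtbf_hours + mttr_hours)
downtime_per_week = week_hours * (1.0 - availability)
shifts_lost_per_week = downtime_per_week / shift_hours
```

Under this assumption the loss is a little under one full shift per week, consistent with the statement that the figures are almost as bad as losing one shift.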

Once the installation process was complete, staffing was started. The jobs were posted

and all operators within the plant were eligible to apply. No personnel selection process

was used due to union rules. The staffing was done gradually over several months as the

production output was ramping up. The plant currently employs about 3,000 hourly

employees. Due to layoffs over the last few years, the operator with the least amount of

seniority has about 18 years of experience with the company.

For the hourly workforce, the final staffing roster at Alpha consists of all-male personnel,

between 45 and 60 years old; all with at least 20 years with the company, but none with

any previous experience with the specific machines used in Alpha. The specifics of their

experiences within the plant differ: one or two operators have had some experience with

gear manufacturing; others have had no experience with gear manufacturing, but have

experiences with general machining processes; and finally some with no machining

experience at all, having worked only on engine assembly lines.

When the operators transferred into Alpha from other departments within the company, none received any machine-specific training at Alpha. There is an improperly implemented buddy system (Swanson and Sawzin, 1975) in place whereby each new operator spends two weeks paired with one of the more experienced operators and observes his activities. There are no guidelines for the trainer and there is no

certification for the trainee. After the initial two weeks, the novice operator is assigned his

own machine and is expected to learn the rest of the job on his own. This two-week

duration is constant for all new operators, regardless of their aptitude and learning rate.

There is no internal check on reaching a certain skill level prior to working on the machine

independently. This is an important point that will be discussed later in Section 3.6.6,

scenario 5.

In general, training in the manufacturing environment is more effective when a separate

training facility is available (Bluhm, 2001). In such a setting, the operators can learn by trial

and error without being afraid of causing damage or disruptions. At Alpha, the operators

get trained right on the production machines where the product demand is never reduced.

The operator performing the training is also operating the machine; this means the training

is second priority for the trainer as the expectation is still there for him to meet the

production quota. Furthermore, a “buddy system” is a weak form of training as there is no guarantee that the trainer has a sufficient skill set and is good at transferring knowledge. Simply having the knowledge does not qualify one to properly train another. These reasons make the training at Alpha inferior to structured on-the-job training (Sisson, 2001).

The products are in high demand and the weekly production quota is high. This is good

news for Alpha as it has led to the department being fully staffed to run three shifts per day,

around the clock. On each shift, there are four operators, one to run the Kappa machine on each of the four production lines. The operators are assigned to product lines; they do not switch lines for the duration of our study. This set-up, along with very low personnel turnover in

the department, means that as time goes on, each operator becomes more familiar with

the product he produces as well as the particular Kappa machine to which he is assigned.

This is likely to lead to an increase in expertise.

One simplifying assumption we make in the case study analysis in this chapter as well as

the subsequent ones is that all parts produced can be sold. Therefore, if improved operator

expertise results in lower machine breakdown and a higher production output, the system

gains additional revenue. We take a binary perspective where if a part is produced, it is of

acceptable quality. Parts that are of low quality are considered non-salable and are not

included in the production volume count. Therefore, in this sense, acceptable quality of

parts is incorporated into our data modeling. In some future work, we may consider

different counts for acceptable versus scrap parts produced by an operator and include this

as a factor in risk analysis or production forecasting.

Unlike the assignment to the product lines, the operators are not assigned to shifts.

There is a shift rotation on a weekly basis. Therefore, an operator who is on “Days” this

week will be on “Nights” next week and “Afternoons” the week after, before returning to

“Days” in three weeks. When the operators are on the afternoon or night shift, they may

experience inefficiencies due to disruptions in their Circadian rhythms. Circadian rhythm

refers to the fluctuations of one’s physiological conditions governed by the Earth’s day-and-

night cycle (Wickens et al., 2004).

3.6.1. Quantification of skill and shift work

System experts at Alpha identify the following HR factors as the most significant:

operator skill and the effect of shift work, or the disruption of the operator’s circadian rhythm.

1. Machine operator skill: This is the ability to operate the machine and to trouble-

shoot. We have decided to assess operator skill in terms of three components. To look

at expertise or skill as one variable may be an over-simplification, resulting in the loss

of some information. The study by Blau and Kahn (1996) expresses “skill” in terms of

two components, experience and education. Following their lead in breaking skill

down into components, we express operator skill using the following three

components:

a) Experience level: Similar to the model developed by Blau and Kahn, we adopt the

usage of “experience level” in our analysis. Familiarity with machining processes, in

general, and knowledge in gear manufacturing, in particular, are likely to assist an

operator with his daily activities.

b) Analytical skill: We indirectly adopt another term used by Blau and Kahn:

education. All operators in Alpha have a high-school diploma as the highest level of

formal education. Therefore, it is pointless to have education level as a

distinguishing factor. However, we adopt the idea behind formal education. A

positive result of education is the acquisition of technical or analytical skills. As these are normally gained through formal education or on-the-job training/learning, we adopt “analytical skill” as the second component of skill.

While experience level and analytical skill are slightly related, analytical skill

refers to expertise on specific machines, whereas experience level measures familiarity

with the general gear manufacturing environment. Argote et al. (1995) mention an

interesting study at Lockheed where even though 2,000 “green” employees were

put through a job-specific 4-week training program, the productivity suffered

greatly as a result of the inexperience of these employees. The authors go on to

mention the importance of experience in the general area, in addition to job-

specific training, when the tasks are complex.

c) Social interaction: Based on the study of Soller (2001), we can see a correlation

between overall skill and social skills in the form of interaction with peers and/or

supervisors. Interaction here refers to the willingness to ask for help from

others and to offer help to others. Operators can benefit when they share ideas

and give and receive help; help-giving can benefit even high achievers (Blumenfeld

et al., 1996). The correlation between operating the machine well and the

operator’s social skills makes intuitive sense. In a hypothetical scenario, two novice

operators have similar experience levels and initial analytical abilities. During the

first few weeks, they are both overwhelmed by their work and are unsure of many

of their machine operation tasks. One operator is shy and would try to

troubleshoot on his own. As such, the machine would likely remain down for a

longer period of time. The second operator is outgoing and can easily ask his more

skilled colleagues for help. Effective collaboration with peers has proven itself a

successful and uniquely powerful learning method (Brown and Palincsar, 1989). It

is likely that the performance of the latter will be better, at least in the initial

stages.

2. Shift work: Various studies have established links between common shift-work

decision variables and physiological factors related to fatigue (Knauth, 1996; Kostreva

et al., 2002; and Hsie et al., 2009). The shift-work decision variables found to be the

most important in influencing fatigue and Circadian rhythm disruptions are shift

duration, starting time, direction of shift rotation, and distribution of days off.

Disrupted Circadian rhythm may affect operating and/or troubleshooting abilities of

machine operators, thus diminishing performance, increasing the risk of error, and

reducing the detection of anomalies.

It is interesting to note that in addition to the general expectation of the detrimental

effects of shift-work, when the operators of Alpha are interviewed, we find that many

actually enjoy working the night shift. This is because there is only one person around

from management; that person is the production supervisor. The work environment is

relaxed, and there are no interruptions of any kind. Therefore, in the specific context of

Alpha, there are advantages and disadvantages to shift work. The overall effects should

certainly be investigated.

We think of machine operator expertise as a score, ranging between 0 and 100 and

quantify it using CIT. Shift work, on the other hand, can be represented by binary variables.

We can enquire whether or not an operator is working on a certain shift, and assign a “1”

for the affirmative answer, and “0” otherwise. In such cases, to differentiate between n

categories, we need n-1 binary variables. In the case of shift work, to capture the presence

of an operator on day-, afternoon-, or night-shift, we need two binary variables, x1 and x2,

where one can be set as baseline; x1= x2=0 represents the day shift where no adverse

effects of Circadian rhythm disruptions are felt by the operators.
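The n-1 indicator coding described above can be sketched as follows. This is a minimal illustration only: the function name `encode_shift` and the string labels for the shifts are assumptions introduced here, while the assignment of day, afternoon, and night to (0,0), (1,0), and (0,1) follows the convention used for (X1, X2) in Section 3.6.2.

```python
# Three shift categories need two indicator variables (n - 1 = 2).
# The day shift is the baseline and is encoded as all zeros.
SHIFT_CODES = {
    "day": (0, 0),        # baseline: x1 = x2 = 0
    "afternoon": (1, 0),  # x1 = 1
    "night": (0, 1),      # x2 = 1
}


def encode_shift(shift: str) -> tuple[int, int]:
    """Return the (x1, x2) indicator pair for a shift label (hypothetical helper)."""
    try:
        return SHIFT_CODES[shift]
    except KeyError:
        raise ValueError(f"unknown shift: {shift!r}") from None


if __name__ == "__main__":
    for name in SHIFT_CODES:
        print(name, encode_shift(name))
```

Using the day shift as the all-zero baseline means every estimated shift coefficient is read as a difference relative to the day shift.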

In quantifying operator expertise, we use three tools, namely questionnaires,

observational studies, and self-tests. This method is similar to a study done by Glick et al.

(1986), on the relationship between job characteristics and three attitudinal outcomes:

effort, general satisfaction, and challenge satisfaction. They accomplished this by obtaining

reports from three separate data sources: interviews with incumbents, card sorts by job

incumbents, and observations by trained observers.

As our system experts, we used production supervisors and manufacturing engineers.

The CIT requires observers who are aware of the aims and objectives of a given job and who

see people perform the job on a frequent basis (Butterfield, 2005; Flanagan, 1954).

Production supervisors and manufacturing engineers make up the management level

closest to the operators and can, therefore, be considered to be experts intimately familiar

with this human-machine system.

A production supervisor and a manufacturing engineer filled out questionnaires covering

questions on performance and behaviour prediction. We used another production

supervisor and another manufacturing engineer to perform observational studies on

operators’ technical skills as they relate to performance. During the observational study,

the expert spends two hours with each operator to observe his every move. The study takes

place when the machine is running and when there is an expectation of a tool change on

the machine. In this case, the expert is able to observe the operator during machine

operation, tool change, and product measurement. We use two different sources of

experts, manufacturing engineers and production supervisors, to gain different perspectives

on performance and to reduce the problem of same-source variance. To avoid, or at least

reduce, the problem of common method bias, we use two sources and repeat our data

collection twice within a month, at different times and locations.

Lastly, we asked the operators to fill out self-tests that cover three areas. The first one is

related to their experience level. It is important to know the knowledge background of the

individuals in terms of what they know about gear manufacturing. The second area of the

self-test is technical questions, asking very specific questions on the operation of the

machines, part quality, or tool change. The final part of the self-test covers some questions

regarding social interaction. These are meant to gauge the likelihood of the operator

seeking advice from others when stuck in an unfamiliar situation.

Operator assessment data include two experts filling out questionnaires, two experts

performing observational studies, and the operator himself taking a technical test. Each of

the expert assessments is performed twice. Therefore, in each quarter, there are four

questionnaires, four observational studies, and one technical test for each operator. There

are three quarterly assessments done in the duration of our empirical study, making it a

total of 27 assessment points per operator over the nine-month duration of the empirical

study. Given the total number of 12 operators who are assessed in the nine months, we

have a total of 324 operator assessment data points.
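The assessment counts quoted above can be checked with a few lines of arithmetic. The variable names are illustrative assumptions; the quantities themselves (two experts per tool, two rounds, one self-test, three quarters, 12 operators) are taken from the text.

```python
# Per operator and per quarter
experts_per_tool = 2   # one production supervisor and one manufacturing engineer
rounds = 2             # each expert assessment is performed twice

questionnaires = experts_per_tool * rounds          # 4
observational_studies = experts_per_tool * rounds   # 4
self_tests = 1

per_quarter = questionnaires + observational_studies + self_tests

quarters = 3           # nine-month study, assessed quarterly
operators = 12

per_operator = per_quarter * quarters
total_points = per_operator * operators

if __name__ == "__main__":
    print(per_quarter, per_operator, total_points)
```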

Questionnaires, observational studies and self-tests for this case study are designed

specifically for this application but we have followed all the guidelines discussed in Section

2.2. These include the questionnaire/study length, design of the questions, and the types of

responses.

We recognize that a higher operator skill level

does not necessarily translate into a higher productivity level (Bendoly and Prietula, 2008).

Operators may be unmotivated to apply themselves to the limits of their ability. From the

perspective of the operator, this motivation is needed as working close to the highest skill

level may not be preferable. However, skill level does affect productivity in two ways: 1) by

defining the maximum performance possible, thus moderating the relationship between

effort levels and actual performance, and 2) by moderating the relationship between

intrinsic motivation factors and effort level. There is work, such as Hancock (1986) in human

factors, and Bendoly and Prietula (2008) in operations management, that discusses an

inverted U relationship between the level of effort and the desire to apply the effort. The

inverted U shape is warranted by the combination of two effects: 1) stimulation provided by

a higher effort, avoiding boredom and monotony; and 2) consequence of a higher level of

effort, such as discomfort and fatigue. This combination of positive and negative effects

enables us to find the most preferred level of effort at the top of the inverted U shape.

Bendoly and Prietula (2008) state that higher skill levels can affect the inverted U

relationship by shifting up the most desirable effort level. As an operator’s skill is improved,

the most desirable effort level increases.

We are not analyzing skill in combination with the important factor of motivation. In

Section 3.7, we identify motivation as an HR factor to consider in future work. Despite

this fact, since we have established that higher skill positively affects operator productivity,

our discussions are still valid. Our emphasis on training and higher skill has a positive

contribution to the productivity of a human-machine system.

3.6.2. Analyzing the risk of failure

A set of operators with varying skill levels and physical tolerances to disruptions caused

by shift-work have been assigned to a set of machines. We aim to model the performance

of these machines, given the effect of skill and shift work. We perform a failure risk analysis

on the machines using the PHM. The first step is to determine if any human-related

covariates are significant and should be added to the PHM. If there are significant HR covariates,

we proceed to ascertain the managerial impact. We consider two intervention measures

the manager may take in this context and develop a revenue model that provides a

cost-benefit analysis for each intervention measure.

In developing the PHM, we use a software tool, called EXAKT, designed specifically for

proportional hazards modeling in industrial settings (Jardine et al., 1997; Jardine and

Banjevic, 2005). This software is developed by the Centre for Maintenance Optimization

and Reliability Engineering (C-MORE) to model the data and determine the significance of

the covariates in the development of a PHM.

We want to investigate the important question of whether the HR factors identified by

the experts play a major role in Alpha’s equipment MTBF. The initial observations seem to

indicate a significant shift-to-shift difference in terms of the frequency and duration of

machine failures, resulting in a difference in production output. In addition, some operators

seem to operate the machines better than others. These operators are more in tune with

the machine and can detect abnormalities through sound and/or product measurement

trends. The results of our preliminary analysis on the failure data for the Ring line Kappa

machine and the three operators assigned to it during the first three months of operation

are shown in Table 3.1. There seems to be shift-to-shift and operator-to-operator

differences when we compare the number of failures to the number of shifts worked. There

is quite a difference among the three operators working on this machine. The worst

operator has had three times as many failures as the best one (9 compared to 3). In terms

of shift-to-shift differences, two shifts are identical and much higher than the third shift (7

failures compared to 4). The sample size is small at this point; however, there is evidence to

warrant a closer investigation using more data.

| Operator | Shift 1 | Shift 2 | Shift 3 | Total | Percentage |
|---|---|---|---|---|---|
| 1 | 22, 1 * | 20, 0 | 20, 2 | 62, 3 | 4.8 |
| 2 | 19, 3 | 22, 1 | 21, 2 | 62, 6 | 9.7 |
| 3 | 21, 3 | 20, 3 | 22, 3 | 63, 9 | 14.3 |
| Total | 62, 7 | 62, 4 | 63, 7 | | |
| Percentage | 11.3 | 6.4 | 11.1 | | |

* Total number of shifts, followed by failure occurrences

Table 3.1: Initial Shift-to-Shift and Operator-to-Operator Differences in Failure Occurrences
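The marginal totals and failure percentages in Table 3.1 can be recomputed from the cell counts. The sketch below is illustrative; the data structure and function names are assumptions, and the numbers are copied from the table. Computed percentages may differ from the printed ones in the last digit because of rounding.

```python
# (shifts worked, failures) for each operator (key) on shifts 1, 2, 3, from Table 3.1
counts = {
    1: [(22, 1), (20, 0), (20, 2)],
    2: [(19, 3), (22, 1), (21, 2)],
    3: [(21, 3), (20, 3), (22, 3)],
}


def failure_percentage(shifts: int, failures: int) -> float:
    """Failures per 100 shifts worked."""
    return 100.0 * failures / shifts


# Row totals: one (shifts, failures) pair per operator
operator_totals = {
    op: (sum(s for s, _ in cells), sum(f for _, f in cells))
    for op, cells in counts.items()
}

# Column totals: one (shifts, failures) pair per shift
shift_totals = [
    (sum(counts[op][j][0] for op in counts), sum(counts[op][j][1] for op in counts))
    for j in range(3)
]

if __name__ == "__main__":
    for op, (s, f) in operator_totals.items():
        print(f"operator {op}: {s} shifts, {f} failures, {failure_percentage(s, f):.1f}%")
    for j, (s, f) in enumerate(shift_totals, start=1):
        print(f"shift {j}: {s} shifts, {f} failures, {failure_percentage(s, f):.2f}%")
```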

Based on the historical data and failure times, we take the initial step of obtaining an

age-based hazard function for the Kappa machines. We analyze the first three months of

data, where there are 187 shifts, 18 failures, and 2 calendar (also called administrative)

suspensions (one due to a production stoppage, and another one at the end of the data

collection period). The full dataset, including the segment used for this initial analysis, as

well as the upcoming analysis for the rest of Chapter 3, and subsequent Chapters, appear in

Appendix G. We also provide a description of the various usages of the data set for the

various parts of this dissertation.

Failure is a consequence of any operator-related mistake that takes the machine out of

production. The failure mode is not captured beyond this. We treat the product line as a

control variable. For the sake of simplicity, we are showing the model for only one of

Alpha’s four product lines, Ring gear. This will sufficiently demonstrate the points we are

making about the inclusion of HR factors into the PHM and the managerial impact

discussion that follow.

The initial model focuses on the age of the machine and contains no covariates:

$$h(t) = \frac{0.9761}{123.505}\left(\frac{t}{123.505}\right)^{-0.0239}, \qquad (9)$$

where time is measured in hours since the last restart. The working age of the machine is

reset to zero after each failure that takes the machine out of production.

The shape parameter, β, in Eq. (9) is not significantly different from 1. This is confirmed

by β’s p-value as shown in Table 3.2. This, in turn, tells us the machine is currently

experiencing a constant hazard rate. Therefore, there is no evidence that the failures are

age-dependent. We are analyzing a complex system as a whole and not its individual

components. Since there can be many failure modes playing a role, assuming there are no

dominant ones, then any single repair, or component replacement, will not affect the

failure risk of the entire system.

The other parameter of this age-based model, the scale parameter, η, provides the

characteristic life of about 123 hours for this machine. This is the working age by which the

cumulative probability of failure reaches 63.2%.

| Parameter | Estimate | Standard Error | P-Value |
|---|---|---|---|
| Scale | 123.505 | 31.54 | |
| Shape | 0.9761 | 0.1763 | 0.89 |

Hypothesis: Shape parameter = 1 not rejected, based on 5% significance level

Table 3.2: Summary of Estimated Parameters
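As a rough check on the age-based model, the sketch below evaluates the Weibull hazard with the estimates from Table 3.2, confirms that the cumulative failure probability at the characteristic life is 1 - 1/e (about 63.2%), and recomputes a p-value for the hypothesis that the shape parameter equals 1. The function names are assumptions, and the two-sided normal (Wald-type) approximation is assumed here for illustration; it is not confirmed to be the exact calculation EXAKT performs.

```python
import math

ETA = 123.505     # scale parameter in hours (Table 3.2)
BETA = 0.9761     # shape parameter (Table 3.2)
SE_BETA = 0.1763  # standard error of the shape estimate (Table 3.2)


def weibull_hazard(t: float, beta: float = BETA, eta: float = ETA) -> float:
    """h(t) = (beta / eta) * (t / eta) ** (beta - 1), with t in hours since last restart."""
    return (beta / eta) * (t / eta) ** (beta - 1)


def weibull_cdf(t: float, beta: float = BETA, eta: float = ETA) -> float:
    """Cumulative probability of failure by working age t."""
    return 1.0 - math.exp(-((t / eta) ** beta))


def two_sided_p_value(estimate: float, null_value: float, std_error: float) -> float:
    """Two-sided p-value under a normal approximation (assumed for illustration)."""
    z = (estimate - null_value) / std_error
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    print(f"hazard at 40 h: {weibull_hazard(40.0):.5f} per hour")
    print(f"F(eta): {weibull_cdf(ETA):.3f}")
    print(f"p-value for shape = 1: {two_sided_p_value(BETA, 1.0, SE_BETA):.2f}")
```

Because the exponent on t is close to zero, the hazard barely changes with working age, which is the constant-hazard behaviour discussed above.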

The above analysis is repeated in EXAKT while fixing the value of Beta to 1. In this case,

hours and this value represents the mean life of the machine.

Since the failures are random and follow an exponential distribution, in the absence of

covariates, the model cannot help with decision making. We will enhance this analysis by

developing the hazard function into a PHM that fits the data (verified by a K-S test that

does not reject the fit) and has significant covariates. We will then be in a position to use the PHM to predict

failures, aiding the decision maker in enhancing maintenance activities.

We analyze the historical failure data captured for all shifts over the last nine months

on a per-machine basis. Data from various Kappa machines are not aggregated and are for

the Ring gear line only. However, to investigate the effect of HR factors, we aggregate

same-machine data for the assigned operators. The data set has 966 records, consisting of

53 failures and 4 calendar suspensions. We match the shift failure data and whether or not

the failure was human-related, with operator attendance records. As a result, the data

include the working age of the machine, whether or not a failure occurred, a shift identifier,

and the operator’s three skill component scores.

The first two input variables as potential covariates are the two indicator variables,

(X1,X2), representing the three shifts. Day shift, afternoon shift, and night shift are

represented by (0,0), (1,0), and (0,1) respectively. The next three input variables, X3, X4, and

X5, are the three skill components. Each is expressed as a score out of 100. Skill assessments

are repeated quarterly, or three times in our nine-month analysis of Alpha, resulting in nine

sets of skill components. For informational purposes, univariate and bivariate statistics on

the three skill components, including mean and standard deviation, as well as correlations,

are shown in Table 3.3. The values for the skill components shown here are for the same

three operators whose failure counts were shown in Table 3.1. However, the values in Table

3.3 are based on the entire nine months of analysis, and are therefore based on 9 sets of

data points for each operator.

| Skill Component | Variable | Min | Max | Mean | Standard Deviation | Pearson Correlation: Analytical | Pearson Correlation: Experience | Pearson Correlation: Social |
|---|---|---|---|---|---|---|---|---|
| Analytical | X3 | 19.76 | 95.85 | 66.15 | 23.65 | 1.00 | 0.85 | -0.33 |
| Experience | X4 | 14.29 | 100.00 | 69.72 | 27.50 | | 1.00 | -0.71 |
| Social | X5 | 53.13 | 88.54 | 66.05 | 12.33 | | | 1.00 |

Table 3.3: Univariate and Bivariate Statistics on Skill Components

We make a simplifying assumption for the skill scores. We acknowledge that each skill

score is unlikely to remain static over time and that on-the-job-learning is likely to have a

positive effect on operator expertise. We assume this dynamic nature to occur in a step-

wise manner. The planning horizon we consider is short-term (one 8-hour shift) and as a

result, assuming a constant skill score throughout the planning horizon makes sense.

Therefore, the covariates in the PHM for this interval are time-independent.

In addition to the five aforementioned variables, we also consider the pair-wise

interactions. There are no established guidelines for the levels of interactions to consider;

however, in our case, considering pair-wise interactions is sufficient in terms of interpreting

the results. In general, when it comes to considering interactions beyond the second level,

the analyst should make a judgement based on the context. If higher level interactions

make intuitive sense, they should be considered as well.

To develop our model, we primarily use the backward selection method but complement

this method with Akaike’s Information Criterion (AIC) to help avoid bias in the model

fitting process (Burnham and Anderson, 2004). It provides the analyst with a trade-off

between accuracy and complexity. The AIC score for each model uses the maximum

likelihood value of the model as well as the number of variables included in the final

model. At each stage, we consider all variables with p-values above 0.1 to be potential

candidates for elimination. Initially, this results in a large decision tree. But many of the

possible branches end with the same set of variables. At the end of the process, of the six

unique models, we choose the one with the lowest AIC.
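The final selection step can be sketched with the standard definition AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood. The candidate labels, log-likelihoods, and parameter counts below are purely hypothetical placeholders chosen for illustration; they are not values from the Alpha data set or from EXAKT output.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike's Information Criterion: 2k - 2 ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood


# HYPOTHETICAL candidates: (label, maximized log-likelihood, number of parameters)
candidates = [
    ("model A", -212.4, 10),
    ("model B", -213.1, 8),
    ("model C", -216.9, 6),
]

# Choose the candidate with the lowest AIC
best = min(candidates, key=lambda c: aic(c[1], c[2]))

if __name__ == "__main__":
    for label, loglik, k in candidates:
        print(f"{label}: AIC = {aic(loglik, k):.1f}")
    print("selected:", best[0])
```

In this made-up example the middle model wins: it gives up a little likelihood relative to the largest model but saves two parameters, which is the accuracy-versus-complexity trade-off described above.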

We obtain the following PHM:

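$$h(t, Z) = \frac{1.043}{119.3}\left(\frac{t}{119.3}\right)^{0.043} \exp\left(21.35\,X_2 + 0.0366\,X_1 X_5 - 0.0003\,X_3 X_4 + 0.222\,X_2 X_3 - 0.227\,X_2 X_4 - 0.291\,X_2 X_5\right), \qquad (10)$$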

where t is measured in hours. Table 3.4 shows the EXAKT output resulting in the PHM

estimation.

| Covariate | Parameter | Estimate | Standard Error | P-Value |
|---|---|---|---|---|
| - | Scale, η | 119.3 | 60.8 | |
| - | Shape, β | 1.043 | 0.106 | 0.69 |
| z1 = X2 | γ1 | 21.35 | 6.225 | < 0.01 |
| z2 = X1 X5 | γ2 | 0.0366 | 0.0093 | < 0.01 |
| z3 = X3 X4 | γ3 | -0.0003 | 0.0001 | < 0.01 |
| z4 = X2 X3 | γ4 | 0.222 | 0.0574 | < 0.01 |
| z5 = X2 X4 | γ5 | -0.227 | 0.060 | < 0.01 |
| z6 = X2 X5 | γ6 | -0.291 | 0.0845 | < 0.01 |

Hypothesis: Shape = 1 tested, Gamma (cov) = 0 tested, based on 5% significance level

Table 3.4: Summary of Estimated Parameters for Ring gear

In working with the variables, we have not standardized them and therefore, they take

on values of differing magnitudes. For example, z1 represents the night shift and takes on a

value of 1, when we are analyzing the night shift. In contrast, z3 represents the interaction

of two skill components, both of which are expressed as a score out of 100. The coefficients

of z1 and z3 in turn reflect the different magnitudes and compensate for them.

A simple hazard function, such as the one expressed earlier in Eq. (9), fits the data set

(confirmed by a non-rejection of a Kolmogorov-Smirnov (K-S) test, p-value 0.41). But since

the failures are random and follow an exponential distribution, in the absence of covariates,

the model cannot help with decision making. We are able to enhance this analysis by using

the PHM to predict failures, aiding the decision maker in enhancing maintenance activities.

The PHM obtained has significant covariates and fits the data. This statement about the

model fit is based on the non-rejection of a K-S test, p-value 0.08. This p-value is greater

than the rejection threshold of 5%, although a higher value would indicate a more convincing fit.

In the discussions that follow in Sections 3.6.6 and 3.6.6.3 and the equations obtained,

Eqs. (14), (15), and (16), the p-values are all 0.2 or greater. The same decision-making

process followed in this Section is followed in those two Sections, but based on a larger

data set, with models that have a better fit, according to K-S tests, with higher p-values.

As general commentary on the use of the K-S test for model fit, this is the

most common test done for the PHM. In using EXAKT to build the model, the software uses

the Wald test to test if the parameters are significant. It uses the K-S test to check for model

fit for the entire model. This usage of the K-S test is combined with the Cox-Snell residuals

test. Another possible test that may be used as an alternative to the K-S test is the

Schoenfeld residuals test (Crowder, 2012).

The validity of the failure prediction by the obtained PHM can be confirmed by using

another tool, for example logistic regression, to achieve the same goal of predicting the risk

of machine failure. This is discussed in more detail in Appendix B; the highlight of this

discussion is that the two methods generally prompt the DM to take similar actions in the

face of operator-related risk.

3.6.3. Odds estimates

Odds estimates are calculated as the ratio of the probability of an event’s occurrence to the probability of its

non-occurrence, $p/(1-p)$, where $p$ is the probability of the event. In our context, odds estimates can provide a DM with an

idea of the degree of labour intensiveness, or the level of automation, of the system. A

system that is highly automated will likely experience a low impact of human-related risk;

therefore, odds ratios close to 1 indicate that HR factors are unlikely to play a

significant role in system performance. In such an automated system where MR factors are

the dominant ones, financial resources may be better spent on capital expenditures than

human resource initiatives, such as training programs. The DM will not have to have a large

focus on mitigating the risk stemming from the operators. Therefore, the odds ratios can

serve as a good initial point.

The PHM developed for the system under analysis can be used to calculate the odds for

an operator, given a set of factors, such as a specific shift and a certain machine age. The

PHM is then used to calculate the odds for a second operator using the same set of factors but

changing the specific HR factor being analyzed, such as the skill of the second operator. The

odds ratio can then be obtained by calculating the ratio of the odds of the two operators.

Odds estimates are calculated for Alpha to provide the DM with an idea on the role of

operator skill as well as the effect of disrupted Circadian Rhythm (shift-work). The odds

estimates are given in Tables 3.5 and 3.6. Considering the dayshift and a machine age of 40

hours, we compare operators with highest and lowest skill components. The corresponding

odds ratio, comparison of the odds of the two operators, is 21. This points to a large

difference where the worst operator is 21 times more likely to cause a failure compared to

the best operator. In turn, this points to the large role operator skill can play in this system.

It can also point to possible large gains that may be realized as a result of operator training.

| Condition | Odds | Odds Ratio vs. Worst operator | Odds Ratio vs. Best operator |
|---|---|---|---|
| Worst operator, skill components: (20, 14, 73) | 0.063 | 1 | 21 |
| Best operator, skill components: (96, 100, 53) | 0.003 | | 1 |

Table 3.5: Odds Estimates
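The odds comparison can be approximated from the PHM parameters in Table 3.4. The sketch below rests on two assumptions that should be read with care: first, that the event probability is the chance of a failure during one 8-hour shift, conditional on the machine having survived to a working age of 40 hours with covariates held fixed; second, that the rounded coefficients printed in Table 3.4 are adequate inputs. Because the coefficient of X3·X4 is printed to a single significant figure, the result reproduces the order of magnitude of Table 3.5 rather than its exact values. All function names are illustrative.

```python
import math

ETA, BETA = 119.3, 1.043   # scale (hours) and shape, Table 3.4

# Rounded covariate coefficients as printed in Table 3.4
G_X2, G_X1X5, G_X3X4 = 21.35, 0.0366, -0.0003
G_X2X3, G_X2X4, G_X2X5 = 0.222, -0.227, -0.291


def linear_predictor(x1, x2, x3, x4, x5):
    """Sum of gamma_i * z_i for the six covariates of the fitted PHM."""
    return (G_X2 * x2 + G_X1X5 * x1 * x5 + G_X3X4 * x3 * x4
            + G_X2X3 * x2 * x3 + G_X2X4 * x2 * x4 + G_X2X5 * x2 * x5)


def failure_probability(age, horizon, covariates):
    """P(failure in (age, age + horizon] | survival to age), covariates constant."""
    baseline = ((age + horizon) / ETA) ** BETA - (age / ETA) ** BETA
    return 1.0 - math.exp(-baseline * math.exp(linear_predictor(*covariates)))


def odds(p):
    return p / (1.0 - p)


# Day shift (x1 = x2 = 0); skill components (analytical, experience, social) from Table 3.5
worst = (0, 0, 20, 14, 73)
best = (0, 0, 96, 100, 53)

odds_worst = odds(failure_probability(40.0, 8.0, worst))
odds_best = odds(failure_probability(40.0, 8.0, best))
odds_ratio = odds_worst / odds_best

if __name__ == "__main__":
    print(f"worst: {odds_worst:.4f}, best: {odds_best:.4f}, ratio: {odds_ratio:.1f}")
```

On the day shift only the X3·X4 term is active, so the gap between the two operators is driven entirely by the product of analytical skill and experience.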

Using the mean skill component values in Table 3.3, we consider the average operator

and calculate odds estimates across three shifts. We keep machine working age constant at

40 hours. Odds ratio between afternoon and day shift is about 12. This value drops to 5 and

2.5 when we compare afternoon- to night-shift, and night- to day-shift, respectively. The

odds and the odds ratios are shown in Table 3.6.

| Condition | Odds | Odds Ratio vs. Day shift | Odds Ratio vs. Afternoon shift | Odds Ratio vs. Night shift |
|---|---|---|---|---|
| Average operator, Day shift | 0.015 | 1 | | |
| Average operator, Afternoon shift | 0.183 | 12.2 | 1 | |
| Average operator, Night shift | 0.038 | 2.5 | 0.21 | 1 |

Table 3.6: Odds Estimates
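The odds ratios in Table 3.6 follow directly from the three reported odds. The short sketch below recomputes them; the dictionary and function names are assumptions, and the odds values are copied from the table.

```python
# Odds of failure for the average operator, by shift (Table 3.6)
shift_odds = {"day": 0.015, "afternoon": 0.183, "night": 0.038}


def odds_ratio(numerator: str, denominator: str) -> float:
    """Ratio of the odds on one shift to the odds on another."""
    return shift_odds[numerator] / shift_odds[denominator]


if __name__ == "__main__":
    print(f"afternoon vs day:   {odds_ratio('afternoon', 'day'):.1f}")
    print(f"night vs day:       {odds_ratio('night', 'day'):.1f}")
    print(f"night vs afternoon: {odds_ratio('night', 'afternoon'):.2f}")
    print(f"afternoon vs night: {odds_ratio('afternoon', 'night'):.1f}")
```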

On a macro level, the odds estimates above provide the DM with valuable

information. The first is the potentially large gain from training programs. The large odds ratio of

0.063/0.003=21 between the worst and the best operator can present itself as an

opportunity to reduce the risk in the system significantly by simple training programs

targeted towards operators at the lower end of the skill spectrum. Another benefit a DM

may gain from the above information is for preliminary planning purposes. The DM may

decide to assign additional maintenance people, or the more skilled maintenance staff, to

the afternoon shift to deal with the additional risk that seems to be present on this shift

compared to the other two. Overall, the large odds ratio for both operator skill and the

effect of shift work presents the DM with the knowledge that HR factors may play a

significant role in system performance.

3.6.4. Evaluation model for intervention methods

We propose two intervention methods for Alpha: reduction of the production rate and

addition of a highly-knowledgeable person (hereafter referred to as a Guide) on shift. An

example of such a Guide would be a company representative for the machine

manufacturer. The two aforementioned intervention methods are not the only HR

intervention methods available to a DM. We use these two approaches as examples

because they may be applicable to many systems including the considered case study.

Given these two intervention methods, the possible scenarios for running the machine are:

(1) running the system at the regular production rate with no Guide; (2) running the system

at a fraction of the regular production rate to provide the operator with a longer decision

time; (3) adding a Guide to assist in proper task completion by the operator; and (4) using

both approaches together, adding a Guide and running the system at a reduced rate.

Each intervention method considered by the DM has a decision variable associated with

it in the mathematical formulation. As an example, the decision variable associated with

adding a Guide is a binary variable that will be equal to 1 when the Guide is present and

zero otherwise. Each value of the decision variable will affect expected uptime and

probability of failure differently.

We follow Eq. (5) to obtain the expected value of Alpha’s revenue function. The terms of

Eq. (5) are updated and defined as follows for Alpha:

: a binary variable to determine whether or not we change the production rate;

when production rate is unchanged.

: a binary variable to determine whether or not a Guide is present

: production rate, ranging between 0, stopped machine, and 1, machine running at regular

rate of 100%

: production volume per hour

P: profit per part

, where is the cost of having a Guide on a shift.

For the net revenue part of the function, the expected uptime hours are multiplied by

the production-per-hour of the operator, the profit-per-part, and the production rate.

Production-per-hour is calculated as a moving average of the hourly production of the last

M-shifts of a particular operator. For the second part of the function, an operator’s average

failure cost, CF, is calculated by taking the total cost of all HR failure modes that occur

during this operator’s last M shifts, divided by the total number of failure occurrences

during the same period. The third part of the function is the additional cost of a Guide per

shift.
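The two operator-specific inputs described above, production-per-hour and the average failure cost over the last M shifts, can be sketched as follows. The record layout, field names, and sample figures are hypothetical, and the moving average is assumed here to be a simple mean of per-shift hourly production; the text does not specify the exact averaging scheme.

```python
from dataclasses import dataclass, field


@dataclass
class ShiftRecord:
    """One shift worked by an operator (hypothetical record layout)."""
    hourly_production: float                            # parts per hour on this shift
    failure_costs: list = field(default_factory=list)   # cost of each HR failure


def production_per_hour(history, m):
    """Mean hourly production over the operator's last m shifts."""
    recent = history[-m:]
    if not recent:
        return 0.0
    return sum(s.hourly_production for s in recent) / len(recent)


def average_failure_cost(history, m):
    """Total HR failure cost over the last m shifts divided by the number of failures."""
    costs = [c for s in history[-m:] for c in s.failure_costs]
    return sum(costs) / len(costs) if costs else 0.0


# HYPOTHETICAL history, oldest shift first
history = [
    ShiftRecord(30.0),
    ShiftRecord(32.0, [3000.0]),
    ShiftRecord(34.0),
    ShiftRecord(33.0, [5000.0]),
]

if __name__ == "__main__":
    print(production_per_hour(history, 3), average_failure_cost(history, 3))
```

Shifts without failures contribute nothing to the cost average, so the result is a cost per failure occurrence rather than a cost per shift.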

The ψ function associated with Alpha’s revenue function is as follows:

. (11)

Explaining the terms related to : Terms are taken directly from the decision variable

matrix. The two terms and represent the effects of the second decision variable,

the addition of a Guide, on the third and fourth covariates. The other covariates are not

affected and the reasons will be discussed shortly. The term is a function of two

variables: . The term is previously defined. The next variable, ,

measures the effect of the presence of a Guide on an operator’s skill. When there is a Guide

available for the operator to consult with, the operator is likely to commit fewer mistakes.

This has the equivalent effect of the operator having a higher analytical skill score. The

presence of a Guide will not affect the experience level or social interaction components of

skill. As a result, only covariates and , which are functions of Analytical skill, are

multiplied by . One way to determine the correct value of k is to have an initial value

based on expert knowledge and then refine this value with more data. Shifts with and

without Guides can be compared to determine the reduction in hazard rate. This can be

equated to a gain in analytical skill if all else remains fixed.

Explaining the terms related to : The ψ-function in Eq. (11) is also dependent on ;

for Alpha, this decision variable is a function of production rate, r, and has the form

. The rate, r, ranges between zero, for a stopped machine, and 1, for running the

machine at the regular rate of 100%. When we consider Eq. (3), the form of ensures the

overall hazard rate is zero if . A stopped machine has no risk of failure. The value for

determines the effect of the reduction in hazard rate when the production rate is reduced.

Empirical evidence can be used to determine . To this end, various production rates can be

implemented while keeping other factors constant. As this approach may be impractical,

expensive, or difficult, we may use expert knowledge elicitation, similar to the approach

proposed by Zuashkiani et al. (2009). In this case, a rate is proposed, such as 50%, and the

corresponding reduction in hazard rate is estimated by the experts. The hazard reduction

value is then used to calculate the correct value associated with .
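The elicitation step can be illustrated with a minimal sketch. It assumes, purely for illustration, that the rate term is a power of the production rate, r^c, which equals 1 at the regular rate and 0 for a stopped machine; the exponent name c and the function name are hypothetical and are not taken from the thesis. Under that assumption a single expert judgement determines c:

```python
import math


def rate_exponent(rate, hazard_reduction):
    """Return c such that rate**c == 1 - hazard_reduction.

    Assumption (not from the source): the hazard is scaled by rate**c,
    so that the scaling is 1 at full rate and 0 for a stopped machine.
    """
    if not 0.0 < rate < 1.0:
        raise ValueError("rate must lie strictly between 0 and 1")
    if not 0.0 <= hazard_reduction < 1.0:
        raise ValueError("hazard_reduction must lie in [0, 1)")
    return math.log(1.0 - hazard_reduction) / math.log(rate)


# Judgement used in the numerical example: running at 50% of the regular
# rate is believed to reduce the hazard rate by 15%.
c_half = rate_exponent(rate=0.5, hazard_reduction=0.15)

# Judgement used in Scenario 4: a 10% rate reduction reduces the hazard by 40%.
c_ninety = rate_exponent(rate=0.9, hazard_reduction=0.40)

print(f"c for (50% rate, 15% hazard reduction): {c_half:.4f}")
print(f"c for (90% rate, 40% hazard reduction): {c_ninety:.4f}")
```

A small exponent means the hazard is insensitive to slowing down, while a large exponent means even a modest slowdown removes much of the risk; this is what separates Scenario 4 from the other scenarios.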

Alpha’s general revenue function is expressed in terms of probabilities, expected values,

and the specifics of the PHM obtained. In the following numerical example, we analyze the

two modes of HR-interventions; they consist of the combinations of presence or absence of

a Guide (g = 1 or g = 0, respectively) and running production at full-speed, partial-speed,

or not running at all (r = 1, r = 0.5, and r = 0, respectively). This is presented in

Table 3.7. The scenario where we stop the machine and add a Guide is not considered as it

would never be the optimal course of action. We calculate the revenue for each of the

remaining five cases and choose the one with maximum revenue as the optimal course of

action.


Intervention | Full production rate, r = 1 | Half production rate, r = 0.5 | Stop machine, r = 0
Add Guide, g = 1 | √ | √ | X
Do not add Guide, g = 0 | √ | √ | √

Table 3.7: Combinations of intervention methods to be considered

Various hypothetical sets of inputs are considered in order to show the possibility of

different optimal solutions. All scenarios take place on the afternoon shift, {X1=1, X2=0}. The

machine’s working age is taken to be t=40 hours, and the analysis is done for an 8-hour

shift. The effect of adding a Guide is equivalent to a 50% rise in analytical skill, i.e. k=0.5.

The operator’s production count and failure cost are calculated based on the last 20 shifts

worked.

Scenario 1: A novice operator with low analytical skill of 40 is scheduled to operate the

machine. Failure cost is calculated to be an average of $4,000 per incident; the operator’s

average hourly production is 33; the cost of adding a Guide is $300 for one shift; profit per

piece is $40; and a 50% reduction in production rate results in a 15% reduction in hazard

rate. As can be seen from Figure 3.7, the optimal course of action is to add a Guide to help

the operator.

Scenario 2: The optimal course of action changes when the cost associated with failure is

halved from $4,000 to $2,000 and the cost of adding a Guide is doubled to $600 for a shift.

All other factors are kept the same as scenario 1. As can be seen in Figure 3.7, the optimal

course of action is to accept the risk and run at normal rate, with no Guide.


Scenario 3: Most input parameters are kept the same as scenario 1. The difference is the

presence of an operator with the high analytical score of 90 on shift. His average production

volume is 38 pieces per hour. Contrary to scenario 1, the optimal course of action is to run

at normal rate and add no Guides (Figure 3.7). This scenario shows us the difference a

skilled operator makes. All conditions are kept constant except Analytical skill which has

changed from 40 to 90. In this case, we need no Guide, thus saving $300 for the shift.

Scenario 4: Differences from scenario 1 are two-fold. Firstly, the product has a low

profit-per-piece of $5. Secondly, a 10% production rate reduction results in a 40% reduction

of hazard rate. In this scenario, the optimal course of action is to run at the partial rate of

90% (Figure 3.7).

Figure 3.7: Courses of action under different scenarios. Each panel plots revenue against the values of r, rate, and g, Guide: (r,g).

Scenario 1 (Optimal Strategy: Add a Guide): (1,0): 9199; (0.5,0): 4495; (0,0): 0; (0.5,1): 4512; (1,1): 9447

Scenario 2 (Optimal Strategy: Run as Normal): (1,0): 9442; (0.5,0): 4702; (0,0): 0; (0.5,1): 4337; (1,1): 9294

Scenario 3 (Optimal Strategy: Run as Normal): (1,0): 11064; (0.5,0): 5475; (0,0): 0; (0.5,1): 5288; (1,1): 10984

Scenario 4 (Optimal Strategy: Reduced Production Rate): (1,0): 724; (0.9,0): 790; (0,0): 0; (0.9,1): 651; (1,1): 661
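The decision rule used in Scenarios 1 to 4, computing the revenue of each admissible (r,g) combination and choosing the largest, can be sketched as follows. The revenue figures are the bar labels read from the panels of Figure 3.7; the dictionary layout and the function name `best_action` are illustrative choices, not part of the thesis.

```python
# Revenue for each (r, g) pair, as read from the bar labels of Figure 3.7.
REVENUE = {
    "Scenario 1": {(1, 0): 9199, (0.5, 0): 4495, (0, 0): 0, (0.5, 1): 4512, (1, 1): 9447},
    "Scenario 2": {(1, 0): 9442, (0.5, 0): 4702, (0, 0): 0, (0.5, 1): 4337, (1, 1): 9294},
    "Scenario 3": {(1, 0): 11064, (0.5, 0): 5475, (0, 0): 0, (0.5, 1): 5288, (1, 1): 10984},
    "Scenario 4": {(1, 0): 724, (0.9, 0): 790, (0, 0): 0, (0.9, 1): 651, (1, 1): 661},
}


def best_action(revenue_by_action):
    """Return the (r, g) pair with the largest revenue."""
    return max(revenue_by_action, key=revenue_by_action.get)


for name, table in REVENUE.items():
    r, g = best_action(table)
    print(f"{name}: run at r={r} with g={g}, revenue {table[(r, g)]}")
```

Because the stopped machine, (0,0), always yields zero revenue, it becomes the selected action only when every other combination has a negative expected revenue.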


Scenario 5: This is an exaggerated case where failure cost is portrayed so high

($120,000) as to make all options negative. All other input factors remain the same as

scenario 1. The optimal decision is to take the machine out of production for a shift until a

more skilled operator is available (Figure 3.8).

Figure 3.8: Optimal Strategy: Stop Machine. Revenue against the values of r, rate, and g, Guide: (1,0): -9777; (0.5,0): -11682; (0,0): 0; (0.5,1): -5201; (1,1): -2010.

This scenario complements the discussions in Section 3.5. We are in a situation where the operator skill is so low that the best course of action is to shut down the machine until a more skilled operator becomes available. This may not happen until the next shift, when the next operator comes in for duty. This translates into expensive capital equipment sitting idle: fixed costs still accrue while we are not making any revenue.

As mentioned in the case description, Alpha’s current departmental policy is for a new operator to be trained for two weeks prior to independently operating a machine. This arbitrary duration of two weeks has no scientific basis. It is unrealistic to expect one particular training duration to be appropriate for everyone, regardless of the trainer or the trainee. It may be more practical to expect the operator to possess a minimum skill set before being assigned to a machine on his own. This minimum skill level may come from our mathematical model developed in Section 3.5 so that the company will never run into a situation such as scenario 5 discussed above. Operator risk should never be high enough to justify a stopped machine as the best course of action.

Finding the profit threshold

The following analysis comes from Eq. (6), developed in Section 3.5:

For the two cases (r,g)=(1,0) and (r,g)=(0.5,0), this equation is revised to the following:

The intervention measures only include production rate changes but not the addition

of a Guide. Therefore, there are no direct intervention costs, .

For the remaining two cases (r,g) = (0.5,1) and (r,g) = (1,1), the equation is revised to

the following:

, (12)

The intervention measures include changes to the production rate as well as the

addition of a Guide.

We use the values provided in Scenario 5 to calculate the profit threshold for each of the

four cases:

Case 1: (r,g) = (1,0)


The integral has no closed form and as a result, we will use numerical methods to solve

for . We know that due to the monotonic nature of the function, there is only one

unique solution for that will make the whole function equal to zero. Table 3.8

displays the step-wise values used for and the resulting value for the function. We

use a simple bracketing technique where we find the root in the interval (a, b) if f(a)

and f(b) have opposite signs. We change the values until we get to within three decimal

places of zero. The end result is 0.1108.

Value of Value of function

0.9 -35

0.1 0.56

0.2 -4.6

0.15 -2

0.12 -0.5

0.11 0.04

0.112 -0.06

0.111 -0.01

0.1105 0.016

0.1107 0.005

0.1108 0.0004

0.11079 0.001

0.11081 -0.00007

Table 3.8: Values of and the corresponding function value
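The bracketing procedure described above can be sketched as a standard bisection search. The objective below is a stand-in, not the integral of the thesis: it assumes a hypothetical net-revenue function that decreases monotonically in the risk level, with made-up constants, and all names are illustrative.

```python
import math


def bisect_root(f, a, b, tol=1e-3, max_iter=200):
    """Find x in (a, b) with |f(x)| < tol, given f(a) and f(b) of opposite sign."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        mid = 0.5 * (a + b)
        fm = f(mid)
        if abs(fm) < tol:
            return mid
        # Keep the half-interval whose end points still bracket the root.
        if fa * fm < 0:
            b, fb = mid, fm
        else:
            a, fa = mid, fm
    return 0.5 * (a + b)


# Hypothetical stand-in for the revenue expression (values are made up):
REVENUE = 10.0        # revenue earned if no failure occurs
FAILURE_COST = 100.0  # cost incurred if a failure occurs
CUM_HAZARD = 1.0      # baseline cumulative hazard over the shift


def net_revenue(psi):
    """Expected net revenue; decreases monotonically as psi grows."""
    failure_probability = 1.0 - math.exp(-psi * CUM_HAZARD)
    return REVENUE - FAILURE_COST * failure_probability


psi_threshold = bisect_root(net_revenue, 0.0, 1.0)
print(f"threshold = {psi_threshold:.5f}, residual = {net_revenue(psi_threshold):.5f}")
```

Monotonicity is what guarantees the single sign change that the search relies on; without it, the bracket could contain several roots or none.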


For the remaining three cases, the only terms of Eq. (11) that differ from Case 1 are the

failure cost,

, and the intervention cost,

.

Case 2: (r,g)=(0.5,0)

and

.

We follow the same numerical methods procedure described for case 1 and obtain a threshold value of 0.05739.

Case 3: (r,g)=(0.5,1)

and

.

By using numerical methods, we obtain a threshold value of 0.02143.

Case 4: (r,g)=(1,1)

and

.

By using numerical methods, we obtain a threshold value of 0.04087.

Table 3.9 summarizes the values of for the four cases. Value of represents the

maximum value our function can take before we get into a “loss” scenario. Therefore,

when all else is equal, case 1 represents our best hope for running profitably since it allows

the largest amount of risk to be present. In a way, case 1 provides the most flexibility.


Case (r,g) Value of

(1,0) 0.11080

(0.5,0) 0.05739

(0.5,1) 0.02143

(1,1) 0.04087

Table 3.9: Value of for the four cases

Second challenge: finding the minimum skill level

We have established in Section 3.5 that we can have a positive net revenue when

. In our context, we are interested in an operator’s skill set, consisting of all

three of his skill components, that satisfies the inequality , where the values

for come from those calculated for the four cases above. In the context of our aim to

determine the minimum operator skill set, only case 1 needs to be analyzed further. This is

because case 1 has the highest value for . Therefore, the minimum skill set required for

the other cases will have to be even higher than this case in order to comply with the lower

risk level allowed under these cases.

We use the specific model developed for Alpha:

.

The covariates of interest are those that contain the variables analytical skill, ,

experience level, and , social interaction. The covariates that contain these three

variables are displayed in Table 3.10. In scenario 5, we consider the afternoon shift.

Therefore, X1 = 1 and X2 = 0.


Covariate Covariate definition

Table 3.10: original variables contained in covariates

Case 1: (r,g)=(1,0) and .

We take the natural logarithm of both sides; and since , we have

, (13)

Only those combinations of skill components, (X3, X4, X5), that satisfy the above

inequality will result in positive net revenue. Otherwise, the DM should choose to stop

the process to avoid an expected loss. This is displayed graphically in Figure 3.9.

Figure 3.9: Minimum skill set required to achieve positive net revenue (axes: X3, X4, and X5; regions labeled "Positive net revenue" and "No revenue")


Consider the case of novice operators. When an operator first transfers into the

department, his “experience” score is beyond the control of the DM. It is only dependent

on how much previous exposure the operator has had in the general area of gear

manufacturing. This leaves the DM with two variables, Analytical skill and Social Interaction.

If this particular operator is introverted and is assessed low on his social interaction skill,

then the DM should make sure the operator gets additional technical training to ensure a

high Analytical skill score to offset the other two low scores. The training needs of this

operator would be prioritized over another novice operator who is assessed high on his

social interaction score.

This is a demonstration of the practical aspect of the models we have developed. In

this example, the inclusion of HR factors is shown to directly affect decisions to ensure

profitability of the system in the short term.

3.6.5. Expanded data set, additional factors

In the analysis thus far, we only looked at one of the machining lines within Alpha and

considered the operator skill components as well as shift work. Our data set came from one

machine and included 966 records and 57 events. We can increase the size of our data set

by combining three machining lines. This increases the data set to 3049 records and 130

events. The additional data records reduce the variability in the data. In theory, increasing the size of the data set can have disadvantages, but these are only computational and apply to extremely large data sets, which is not the case with ours.


Each machine is treated as a control variable as we use two binary variables to represent

the three machines. In addition, we consider an additional factor: day of the week. We

divide the week into three segments, Monday, Friday, and the rest, and represent these

three with two binary variables. Our thought process here is that the first day may be less

productive and more error-prone than the rest of the week because the operators are just

returning from the weekend and have not gotten into full rhythm yet. This is in line with the

work performed by Williams (2004) where he finds the average time worked per person is

very similar on Tuesdays, Wednesdays, and Thursdays, but lower on Mondays, and lowest

on Fridays. This may be especially true in the case of those operators whose turn it is to

work on the night shift. Whilst the first night can result in the greatest impairment in

performance (Lamond et al., 2004), adaptation of sleep and performance can occur as the

week progresses.

Our reasoning for considering the last day of the week on its own is that fatigue builds up as the week goes on; by the last day of the week, an operator is more tired and, consequently, less effective, productive, and alert. An opposing argument is that an operator on the night shift adjusts to the night shift hours and performs better as the week goes on, in which case night shift performance on Fridays would be better than on Mondays.

The combinations of the night shift and the first or last shift of the week are examples of

interesting two-way interaction terms that are considered in this expanded analysis. The

resulting model is as follows:


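$$h(t; Z) = \frac{1.086}{0.063}\left(\frac{t}{0.063}\right)^{0.086} \exp\left(-0.128\,\text{Social} - 0.138\,\text{Analytical} + 2.485\,X_1 + 1.427\,X_2 - 1.561\,Y_1 - 1.813\,Y_2 + 6.402\,V_1 + 0.0017\,\text{Social}\cdot\text{Analytical} + 1.77\,X_2 V_1 - 0.1009\,\text{Social}\cdot V_1\right), \quad (14)$$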

where t is measured in hours. The variables X1 and X2 represent the three shifts; V1 and V2 represent the beginning, middle, and end of the week; and Y1 and Y2 represent the three machines. Table 3.11 adds further detail to Eq. (14).

Covariate | Parameter | Estimate | Standard Error | P-Value
- | Scale, η | 0.063 | 0.14 |
- | Shape, β | 1.086 | 0.08 | 0.27
z1 = Social | γ1 | -0.128 | 0.04 | < 0.01
z2 = Analytical | γ2 | -0.138 | 0.05 | < 0.01
z3 = X1 | γ3 | 2.485 | 0.47 | < 0.01
z4 = X2 | γ4 | 1.427 | 0.48 | < 0.01
z5 = Y1 | γ5 | -1.561 | 0.34 | < 0.01
z6 = Y2 | γ6 | -1.813 | 0.33 | < 0.01
z7 = V1 | γ7 | 6.402 | 1.56 | < 0.01
z8 = Social × Analytical | γ8 | 0.0017 | 0.0007 | 0.014
z9 = X2 × V1 | γ9 | 1.77 | 0.53 | < 0.01
z10 = Social × V1 | γ10 | -0.1009 | 0.0257 | < 0.01

Hypothesis: Shape = 1 tested, Gamma (cov) = 0 tested, based on 5% significance level

Table 3.11: Summary of Estimated Parameters
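To make the estimates in Table 3.11 concrete, the sketch below evaluates the hazard rate for a given operator and shift. It assumes the Weibull proportional hazards form, h(t) = (β/η)(t/η)^(β-1) exp(Σ γi zi), which is an assumption made for this illustration; the function name, argument names, and the operator scores of 60 are likewise hypothetical.

```python
import math

ETA, BETA = 0.063, 1.086  # scale and shape estimates from Table 3.11

# Coefficient estimates from Table 3.11, keyed by covariate.
GAMMA = {
    "social": -0.128,
    "analytical": -0.138,
    "x1": 2.485,            # afternoon shift
    "x2": 1.427,            # night shift
    "y1": -1.561,           # machine indicator Y1
    "y2": -1.813,           # machine indicator Y2
    "v1": 6.402,            # first day of the week
    "social*analytical": 0.0017,
    "x2*v1": 1.77,
    "social*v1": -0.1009,
}


def hazard(t, social, analytical, x1=0, x2=0, y1=0, y2=0, v1=0):
    """Hazard rate at working age t (hours), assuming a Weibull PHM."""
    z = {
        "social": social,
        "analytical": analytical,
        "x1": x1,
        "x2": x2,
        "y1": y1,
        "y2": y2,
        "v1": v1,
        "social*analytical": social * analytical,
        "x2*v1": x2 * v1,
        "social*v1": social * v1,
    }
    linear = sum(GAMMA[name] * value for name, value in z.items())
    baseline = (BETA / ETA) * (t / ETA) ** (BETA - 1.0)
    return baseline * math.exp(linear)


# Same hypothetical operator on the Ring line, mid-week, on each shift.
day = hazard(40.0, social=60, analytical=60)
afternoon = hazard(40.0, social=60, analytical=60, x1=1)
night = hazard(40.0, social=60, analytical=60, x2=1)
print(f"afternoon/day hazard ratio: {afternoon / day:.2f}")
print(f"night/day hazard ratio:     {night / day:.2f}")
```

Because the covariates enter through the exponential term, the ratio between two shifts does not depend on the working age or on the operator's scores when no interaction term is active.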

This model adequately represents the data set; when we perform a K-S test on the

model fit, the p-value is 0.2 and as a result, the model fit is not rejected. This model has the

lowest AIC compared to any other models obtained using the same set of main effects and

two-way interactions.

Given that there may be high correlation between some of the variables, such as that between experience and analytical skill, how do we ensure we do not have a multicollinearity problem? Once again, as explained with the significance testing of the individual variables, this is dealt with by EXAKT during the model building process. When two variables are highly correlated, neither may be significant when both are included, even though each one is significant on its own; this prompts a judgement call on which variable to exclude. This was the case with Eq. (14), where Experience and Analytical skill could not be used in the model together as they were highly correlated. We had seen early signs of this when we looked at the data for the Ring line (Table 3.3). As we will discuss next, only one of these two variables could remain in the PHM.

3.6.5.1. Discussion on the model’s terms

It is necessary to interpret the presence or absence of the terms in Eq. (14). Among

the nine main effects considered, seven are present and two are not:

o Experience: this variable is absent in the model. It is highly correlated with

Analytical. The partial correlation coefficient is 0.88; the effect of the third variable,

Social, is controlled for. The model can work by including either Experience or

Analytical, but not both. When both are included, neither is found to be significant.

Therefore, we use our expert judgement to include Analytical and exclude

Experience. We believe Analytical to be the more important variable. As an operator

becomes more experienced, he typically gains more knowledge, translating to

Analytical skill, or the ability to operate the machine. This is the general belief

according to the works in the literature (Ash and Levine, 1985; Quinones et al.,

1995). But this is not always the case, or it may have diminishing effects over time. More experience does not always result in higher abilities. Some early studies conclude that work experience is not as strong a predictor of job performance as previously thought (Fiedler, 1970). In a study by Hunter and Hunter (1984), a correlation coefficient of 0.18 was found between work experience and job performance.

o Social: This variable is not highly correlated with Analytical or Experience. The partial

correlation coefficients are 0.572 and -0.498 with Analytical and Experience,

respectively. Therefore, its presence in the model is not surprising. The covariate has

a negative coefficient, γ1, which makes intuitive sense: the higher the social skill, the

lower the risk posed.

o Analytical: This variable is found to be significant with a negative coefficient. This

makes intuitive sense: the higher the analytical skill, the less the risk posed.

Therefore, training programs can reduce the risk posed by operator error or inability

to operate machine.

o X1 and X2: These are the two binary variables representing afternoon and night shift.

They are both significant and both have positive coefficients. This indicates both to

be worse (riskier) than day shift. Interestingly, afternoon shift poses more risk than

night shift. As mentioned in Section 3.6.1, when the operators of Alpha were interviewed, many said they actually enjoy working the night shift because hardly any members of management are present. The work environment is quite relaxed and there are no interruptions of any kind. The better working conditions (from the perspective of the operators) may offset some of the adverse effects of the disrupted circadian rhythm.


o Y1 and Y2: These are the two binary variables representing the Driven and the Drive

machines. They are both significant and both have negative coefficients. Therefore,

the Driven and Drive lines run better than the Ring line.

o V1: This is a binary variable representing “first day of the week or not”. It is found to

be significant, with a positive coefficient. The first day of the week is worse than the

rest of it. An operator’s productivity may be reduced on Mondays as he/she needs

to be reoriented after two days away from the work process. Another possibility is

that operators may lack motivation on a Monday because this day is furthest from

the next available day of rest or leisure (Bryson and Forth, 2007).

o V2: This binary variable, representing “last day of the week or not” is absent in the

model. The original thought behind including this variable was that as the week goes

on, fatigue accumulates and by the last day of the week, the operator is more tired

and consequently, less productive or alert. An opposing thought would have been

the operator on the night shift gets adjusted to the night shift hours and performs

better as the week goes on. Therefore, the nightshift performance on Fridays would

be better than Mondays. Neither one of these thoughts seems to apply to Alpha.

3.6.5.2. Procedure for developing the model

In working with EXAKT to select a final PHM, there are numerous ways one could go

about creating a model. Our first attempt was similar to the approach we took earlier in

Section 3.6.3. We started with 35 variables (9 main effects and 26 two-way interaction

terms) and used the backward selection method for the most part. This approach normally

involves an iterative process where, at each stage, we eliminate the variable with the

largest p-value for its coefficient. We continued this process until we were left with only

significant variables. However, we combined this backward selection process with AIC

where we considered a few different paths and selected the model with the lowest AIC. The

best model we obtained from this approach had 14 covariates and an AIC of 36.4.

In another attempt, we considered grouping the variables as our initial step. We started

analyzing each group on its own, and followed a backward selection process to determine

the significant variables within the group. We then took the significant variables from each

group and combined them into a large list. But this large list was smaller than the 35

variables that we had begun with in the previously described approach. The best model we

were able to obtain from this approach was an improvement over the last model; it has 13

covariates and an AIC of 33.4. Step-by-step details of this procedure are provided in

Appendix C.

The final approach used to obtain the model represented by Eq. (14) involved the forward selection method. Each main effect was considered on its own first, and only one, V2, was found to be not significant. Experience was found to be significant; however, it is highly correlated with Analytical and both cannot be used in the model together. For the reasons mentioned above in Section 3.6.5.1, it was excluded from the model-building process. The interaction terms that did not involve the two eliminated main effects were then considered on their own, and 6 out of 9 were found to be significant. The ones found to be not significant were Analytical-X1, Analytical-X2, and Analytical-V1. We then used our expert knowledge of the system to decide on the order of importance of these remaining variables. We used a forward selection method to add the six interaction terms one at a time, starting with the most important one, considered to be “Social-Analytical”. Three out of the six interaction terms were found to be significant in the final model. The resulting model is an improvement over the last model obtained; it has 10 covariates and an AIC of 24.3. The aforementioned methods are summarized in Table 3.12.

Method Used | Number of Covariates | AIC
Start with all main effects and pairwise interactions. Use backward selection method, complemented with AIC. | 14 | 36.4
Place variables into groups of similarity; within each group, perform a backward selection method, complemented with AIC. Combine all selected variables from each group and use backward selection method, complemented with AIC. | 13 | 33.4
Consider each main effect on its own first. From the main effects that are found significant, consider the pairwise interaction terms individually. Prioritize the main effects and interaction terms and follow a forward selection method, starting with the most important main effect and ending with the least important interaction term. | 10 | 24.3

Table 3.12: Model Selection Method and Results

As can be seen, we have gone beyond the method described earlier in Section 3.6.2 and found ways to develop a better model. Although the model development in Section 3.6.2 is valid and attempts to drive out some bias, the methods described here result in a model with a lower AIC. Because AIC trades off accuracy against complexity, the lower AIC figure indicates an improvement over the method we have previously used.
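For reference, the trade-off mentioned above is visible in the standard definition of the Akaike information criterion. The symbols p (number of estimated parameters) and \hat{L} (maximized likelihood) are generic and are not notation taken from this thesis, and the text does not state which scaling EXAKT applies when it reports AIC, so the values in Table 3.12 should be compared with each other rather than with this formula directly.

```latex
\[
  \mathrm{AIC} = 2p - 2\ln\hat{L}
\]
```

The first term penalizes complexity and the second rewards fit, so removing a covariate lowers AIC whenever the resulting drop in log-likelihood is smaller than one unit.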


3.6.5.3. A more parsimonious PHM

The PHM in Eq. (14) contains 10 covariates. Given the fact that the data set contains 130

“events” which is not a large sample size, and in order to make this a more parsimonious

model, we repeat the analysis, this time ignoring the interactions. The result is the following

PHM, which now contains 7 covariates instead of the 10 appearing in Eq. (14):

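$$h(t; Z) = \frac{1.09}{3.029}\left(\frac{t}{3.029}\right)^{0.09} \exp\left(-0.06766\,\text{Social} - 0.02845\,\text{Analytical} + 2.442\,X_1 + 2.175\,X_2 - 0.8779\,Y_1 - 1.374\,Y_2 + 0.9752\,V_1\right), \quad (15)$$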

Table 3.13 adds further detail to this PHM.

Covariate | Parameter | Estimate | P-Value
- | Scale, η | 3.029 |
- | Shape, β | 1.09 | 0.2445
z1 = Social | γ1 | -0.06766 | < 0.01
z2 = Analytical | γ2 | -0.02845 | < 0.01
z3 = X1 | γ3 | 2.442 | < 0.01
z4 = X2 | γ4 | 2.175 | < 0.01
z5 = Y1 | γ5 | -0.8779 | < 0.01
z6 = Y2 | γ6 | -1.374 | < 0.01
z7 = V1 | γ7 | 0.9752 | < 0.01

Hypothesis: Shape = 1 tested, Gamma (cov) = 0 tested, based on 5% significance level

Table 3.13: Summary of Estimated Parameters

Given the sample size of N=130, one may question the testing of the variables for significance. This is a difficult question in PH modeling and there is no standard answer to it. The testing is certainly sensitive to the number of failures and suspensions out of the total number of events. Fewer failures result in larger standard errors, which may result in a variable not being found significant. Even with a low total number of events, or a low number of failures, modeling can still be done, but the low number affects the standard error of each variable. The testing of the model and the rejection of variables as significant depend on the sample size, and this is reflected in the model building process.

In this newly obtained PHM in Eq. (15), the shape parameter is not significantly different

from 1. Therefore, we repeat the modeling, and fix the shape parameter to 1. In the

resulting PHM, machine age does not play a role; hazard rate is dependent on the level of

the covariates only.

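$$h(Z) = \frac{1}{2.511} \exp\left(-0.06503\,\text{Social} - 0.02855\,\text{Analytical} + 2.49\,X_1 + 2.21\,X_2 - 0.8652\,Y_1 - 1.301\,Y_2 + 0.9767\,V_1\right). \quad (16)$$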

Table 3.14 adds further detail to the PHM in Eq. (16).

Covariate | Parameter | Estimate | P-Value
- | Scale, η | 2.511 |
- | Shape, β | 1 (fixed) |
z1 = Social | γ1 | -0.06503 | < 0.01
z2 = Analytical | γ2 | -0.02855 | < 0.01
z3 = X1 | γ3 | 2.49 | < 0.01
z4 = X2 | γ4 | 2.21 | < 0.01
z5 = Y1 | γ5 | -0.8652 | < 0.01
z6 = Y2 | γ6 | -1.301 | < 0.01
z7 = V1 | γ7 | 0.9767 | < 0.01

Hypothesis: Gamma (cov) = 0 tested, based on 5% significance level

Table 3.14: Summary of Estimated Parameters
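With the shape parameter fixed at 1, the hazard no longer depends on working age, and the chance of at least one failure during a shift follows from the exponential distribution. The sketch below assumes the hazard takes the form (1/η) exp(Σ γi zi) with the estimates of Table 3.14; that form, the function names, and the use of an eight-hour window are assumptions made for this illustration. The operator scores are those used in Scenario 1 of Section 3.6.5.4.

```python
import math

ETA = 2.511  # scale estimate from Table 3.14 (shape fixed at 1)

# Coefficient estimates from Table 3.14.
GAMMA = {
    "social": -0.06503,
    "analytical": -0.02855,
    "x1": 2.49,     # afternoon shift
    "x2": 2.21,     # night shift
    "y1": -0.8652,  # machine indicator Y1
    "y2": -1.301,   # machine indicator Y2
    "v1": 0.9767,   # first day of the week
}


def hazard_rate(**covariates):
    """Constant hazard per hour for the given covariate values (omitted ones are 0)."""
    unknown = set(covariates) - set(GAMMA)
    if unknown:
        raise KeyError(f"unknown covariates: {sorted(unknown)}")
    linear = sum(GAMMA[name] * value for name, value in covariates.items())
    return math.exp(linear) / ETA


def failure_probability(hours, **covariates):
    """Probability of at least one failure within `hours` under a constant hazard."""
    return 1.0 - math.exp(-hazard_rate(**covariates) * hours)


# First afternoon shift of the week on the Ring machine (Y1 = Y2 = 0).
p_shift = failure_probability(8.0, social=62.44, analytical=64.39, x1=1, v1=1)
print(f"probability of a failure during the shift: {p_shift:.3f}")
```

This ignores the effects of the production rate and of a Guide, which enter the revenue function through the decision variables discussed in the text.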

Of the original 9 main effects, 7 appear in Eq. (16). This is similar to Eq. (14) which

contained the same 7 main effect variables. Therefore, the discussion on model terms in

Section 3.6.5.1 is still valid. We test the goodness of fit of this model with a K-S test. The p-

value is 0.48 and as a result, the model fit is not rejected. Similar to our work in Section

3.6.2, we compare the results of this PHM with a logistic regression for validation purposes,


the details of which appear in Appendix B. The highlight of this discussion is that there is a

strong correlation between the two approaches and this serves us in our original purpose of

using one approach to validate the other.

3.6.5.4. Revenue Function and Discussion

Given the PHM in Eq. (16), the expected revenue function remains the same as in the

preceding Section (3.6.5). However, the ψ-function associated with Alpha’s revenue

function is as follows:

.

Terms in are taken directly from the decision variable matrix. The term represents

the effects of the first decision variable, the addition of a Guide, on the second covariate.

The presence of a Guide will only affect the analytical skill component. As a result, only

covariate which contains the original variable Analytical skill, is multiplied by .

Similar to the preceding Section, we consider various hypothetical sets of inputs to show

the possibility of different optimal solutions. All scenarios take place on the first afternoon

shift of the week, X1 = 1, X2 = 0, and V1 = 1. We analyze an eight-hour shift on

the Ring machine, Y1 = Y2 = 0, at a working age hours. Adding a Guide has an

effect equivalent to a 25% rise in analytical skill, k = 0.25. Operator’s production count and

failure cost are calculated based on his last 20 shifts.

Scenario 1: An operator with skill scores of {Analytical=64.39, Experience=72.49, Social=62.44} is scheduled to operate the machine. Average failure cost is $2,000 per incident; the operator’s hourly production average is 33; the cost of adding a Guide is $800 per shift; profit per piece is $40; and a 50% reduction in production rate results in a 15% reduction in hazard rate. As can be seen from Figure 3.10, the optimal course of action is to run at regular rate but to add a Guide to help the operator.

Scenario 2: The optimal course of action changes if the profit per piece is $10. All other factors remain the same as in scenario 1. As can be seen in Figure 3.10, the optimal course of action is to accept the risk and run at a normal rate, with no Guide.

Scenario 3: All input parameters remain the same as scenario 2, except the failure cost

which is increased to $7,000 per incident. As Figure 3.10 shows, the optimal decision is

to take the machine out of production for a shift until a more skilled operator is

available.

Figure 3.10: Optimal strategy under various conditions. Each panel plots profit against the values of r, rate, and g, Guide: (r,g). Scenario 1: run at rate, add a Guide. Scenario 2: run at rate, no Guide. Scenario 3: Stop Machine.

In scenario 3, we are in a situation where the best course of action is to shut down

the machine until a more skilled operator becomes available. This is obviously an

undesirable situation and the following discussion helps us avoid being in such a loss

scenario.

For the two cases (r,g)=(1,0) and (r,g)=(0.5,0), intervention methods include

production rate changes but not the addition of a Guide. Therefore, there are no direct

intervention costs, and .

For the remaining two cases (r,g)=(0.5,1) and (r,g)=(1,1), due to the presence of a

Guide, we have an additional cost, .

Using Eq. (6) and the values in scenario 3, we calculate the profit threshold for each case:

Case 1: (r,g)=(1,0) results in an integral which has no closed form; as a result, we use

numerical methods to solve for . We perform a numerical analysis with step-wise

values used for until we get to within three decimal places of zero for the function

value. The result is .


Case 2: (r,g)=(0.5,0): Following the same numerical methods procedure described for

case 1, .

Case 3: (r,g)=(0.5,1): There is no solution for . This is an example of the no-answer

scenario, described in conjunction with Eq. (7), and displayed in Figure 3.3a. Therefore,

given the circumstances of the scenario, such as the failure cost, and the particular

effects of r and g, this shift cannot be profitable, regardless of the operator skill set.

Case 4: (r,g)=(1,1): As in case 3, there is no solution for .

As established in Section 3.5, we can have a positive net revenue when .

We are interested in an operator’s skill set (Experience, Analytical, and Social) that satisfies

the inequality , where the values for come from those calculated for the

four cases above. Case 1 has a higher value for . Therefore, the minimum skill set

required for case 2 will have to be higher than that for case 1. Since it is the minimum skill set

that we are interested in calculating, we will not make an effort to calculate it for case 2.

Similarly, we will ignore cases 3 and 4 since we established there can be no skill set that

would yield profitability, given the conditions of cases 3 and 4.

In scenario 3, we consider the first afternoon shift of the week, on the Ring machine.

Therefore, and . Here, the function as applicable to the

current analysis is as follows, with further details provided in Table 3.15:

.

Covariate

Variable represented Social, Analytical,

Table 3.15: Variables represented by the covariates in the function

Given (r, ) = (1,0) and , we reach the following inequality:

.

Only those combinations of the skill components that satisfy the above inequality will result in positive net revenue. Otherwise, the DM will choose to stop the process to avoid an expected loss. An example of a skill set resulting in profit is (80,80); however, skill set (80,70) does not result in a profit and would prompt the DM to stop the machine. This is displayed graphically in Figure 3.11. Note also that when the analytical skill score takes on its maximum value of 100, the social interaction score is calculated to be 71 in order for the system to be profitable. The graph tells us that, regardless of the value of the analytical skill score, we will not be profitable when the social interaction score is below 71. Similarly, when the social interaction score is 100, the analytical skill score is calculated to be 31. This tells us the analytical skill score has to be greater than 31 for the system to be profitable, regardless of the social skill score.

[Figure 3.11 plot: both axes run to 100, with tick marks at 70 and 80; regions labeled “No profit” region, Infeasible region, and Profitable region.]

Figure 3.11: Minimum skill set required to achieve positive net revenue

The cost of failure in scenario 3 is quite high compared to the profit per part. Therefore, a high operator skill set is required to make a profit. Given this calculated minimum skill level, and if none of the other factors, such as the average failure cost, can be improved, the

management at Alpha must follow a regimented training procedure. Upon an initial

assessment of his social interaction, the operator must receive machine-specific training to

have a sufficiently high analytical skill score to enable him to be in the positive-revenue

region. Otherwise, there may be shifts when the machine sits idle due to an expected net

loss.

3.7. Concluding Remarks and Future Work

We have discussed a more comprehensive approach towards the usage of the PHM by

including human-related factors. At times, the human part of the human-machine system

may be a significant source of risk. As such, factors associated with the human operator

must be considered in the reliability analysis. One challenge in this process may be the

quantification of human-related factors. Thus far, the factors we have considered are the

operator’s experience level, social interaction abilities, and analytical skills, as well as the

effect of shift work. We have also analyzed the effect of the day of the week and its

interaction with the shifts.

The bulk of the discussions in chapter 3 focuses on developing a PHM with HR covariates.

Once this PHM is successfully obtained, we use it to propose a model that can provide a

decision-maker with a cost-benefit analysis to choose among various intervention methods

to reduce operator-related risk. The proposed revenue model makes use of the

proportional hazards model to estimate the expected machine uptime and the probability

of failure. With this model, the DM can also calculate the risk threshold, below which the

system is profitable, as well as the minimum levels for various human-related factors.

We provide a case study of a manufacturing company and use it to demonstrate the

usage of the revenue model. The factors found to be significant in our model are experience

level, social interaction, analytical skill, shift work, and days of the week. We demonstrate

the decision-making process resulting in the highest profit. Given machine-specific factors,

we calculate minimum levels of operator factors that result in system profitability.

The list of factors we have considered thus far is by no means an all-inclusive list. There

are many other human-related factors that can affect the performance of the operator, and

thereby the system. The inclusion of additional factors can be considered as future work.

These are factors in addition to skill, shift-work, and days-of-the-week. Examples are

motivation, seasonality, reward systems, and management-employee relationships.

Another valuable future work can be to analyze the combined effects of machine-related

as well as human-related factors in a PHM. The effects may be compared and the predictive

power of the model can also be analyzed after the addition of the category of factors. There

are no previous works in the literature that have implemented such analysis. If we do

obtain a PHM that contains both MR- and HR-covariates, the methods of intervention

would be different depending on whether the major source of risk stems from the machine

or the operator. Therefore, there has to be an additional step for risk source identification.

We currently do not distinguish among the failure modes. Performing a certain wrong

task may require a simple reset, causing a ten-minute downtime, whereas performing a

certain wrong sequence may result in physical damage, requiring a lengthy maintenance

and hours of downtime. Therefore, considering different consequences for the various

failure modes would make our analysis more realistic. Another factor that will make our

discussions more realistic is the consideration of production quality. We have taken a black-and-white perspective on quality: if the operator is slow or causes machine downtime, a part is not produced. But if the part is produced, it is of acceptable quality and can be sold. Our work will be improved if we can incorporate means to distinguish between rapid, error-prone work and an immaculate, quality-first mentality.

Another future work directly following our PHM work discussed in Section 3.6.5 may

include identifying additional HR intervention methods. We have identified two such

intervention methods, adding a Guide and reducing the production rate. These are not the

only intervention methods and many others may be possible, depending on the context. A

further future work may be to develop a deeper focus on improved estimation of the

effects of the intervention methods. This work would include a focus on accurately

predicting the relationship between the production rate change and the hazard rate

reduction. This work can also include discussions on calculating the effects of a Guide on the

skill of the operator.

4. OPTIMAL OPERATOR ASSIGNMENT

The human resources of an organization are among its most valuable assets. Optimal

human resource management can make a major contribution to improving the potential

productivity of the organization. In the specific context of operator assignment in skill-

based environments, expertise can be included as a factor whose improvement can

positively affect system performance.

As mentioned in the introductory chapter, even when quantified skill scores are derived

for the operators, a DM may not know how to best use these scores. This chapter provides

a framework for a DM to use operator skill as a decision variable in optimizing operator

assignment. There are four elements to this framework:

1. The first element is to develop the means to forecast the production output, in terms

of HR factors. The method we use is regression analysis.

2. The next element deals with the fact that skill scores are unlikely to be static over the

planning horizon being considered. A particular operator may be the best candidate

for assignment to a certain machine based on current conditions, but not over the

entire planning horizon. This dynamic nature needs to be captured and considered in

the analysis. To do so, we develop learning curves for the operators, based on

historic data.

3. We develop a revenue model and incorporate the learning curves into the

production output forecasting models.

4. We use the results of the revenue model in step 3 in the objective function of a

mathematical programming model which we use for optimal operator assignment.

Connection to Previous Chapters

In Chapter 3, we analyzed the performance of a system where operators have been

assigned to machines. We know the characteristics of the machines, including their

sensitivity to the various HR-factors. We also know the characteristics of each operator. The

question explored in chapter 3 was the following: given the operator assignments, the

sensitivity of each machine to HR factors, and the characteristics of each operator, how do

we best mitigate the risk of failure, stemming from the operators? This is portrayed in

Figure 4.1 where the arrows represent the assignments of the operators to the machines.

Figure 4.1: Discussions in chapter 3 focused on failure risk analysis given the machine and operator characteristics and operator assignments to machines

[Figure 4.1 diagram: machines M1 to Mn (same in operational procedures but different in technical characteristics) linked by arrows to operators O1 to On (different skills, tolerance to shift-work, and other characteristics).]

Our work in this chapter is similar to Chapter 3 in that we know the characteristics of the machines, including their sensitivity to the various HR-factors. In this chapter, we want to assign a group of operators to various machines in order to achieve the best system performance in terms of maximized revenue. The question explored in this Chapter is in a way the opposite of the one in Chapter 3: given the sensitivity of each machine to HR

factors, and the characteristics of each operator, how do we best assign the operators to the machines in order to maximize system revenue? This is portrayed in Figure 4.2.

Furthermore, the work presented in Chapter 3 is short-term in nature for the purposes of decision-making. In its application to a case study, the planning horizon considered is an eight-hour shift. The work in this Chapter is for long-term decision-making, and the empirical study that follows at the end of the Chapter has three months as its planning horizon.

Figure 4.2: Discussions in chapter 4 to focus on optimal operator assignment, given machine and operator characteristics

[Figure 4.2 diagram: machines M1 to Mn (same in operational procedures but different in technical characteristics) and operators O1 to On (different characteristics, with different forecasts/projections for these characteristics over the planning horizon); the assignments between them are shown as question marks.]

Main Contribution of this Chapter

We create a methodology to optimally assign operators to machines based on the sensitivity of the machine to HR factors as well as the operators’ current and forecasted characteristics.

4.1. Literature review

A literature review shows an abundance of previous work in personnel assignment. Much of this work focuses on forecasting the human resource requirements in order to

produce a certain amount of output (Kao and Lee, 1996; Philipose, 1993). Yang et al. (2003) focus on determining the level of engineering expertise required to ensure the desired production output. Li and Li (2000) consider staff skill as a factor in their analysis. Service quality and cost minimization are two objectives in their goal programming approach. However, both Yang et al. (2003) and Li and Li (2000) have a manpower planning scope, where they aim to forecast the required number of personnel. Our scope is the optimal assignment of the existing workforce: we maximize system revenue by using the expertise of the people to optimally assign them to the various jobs.

Wang (2005) divides the operations research techniques applied in workforce planning

into four major categories: Markov chain models, computer simulation models,

optimization models and supply chain management through System Dynamics. He breaks

down the optimization model category into linear programming, goal programming,

dynamic programming, and integer programming. Our approach makes use of the first of these, linear programming. Studies such as Haas et al. (2000) and Feiring (1993) use linear

or integer programming models for optimization in operator assignment problems. Zeng et

al. (2011) consider a manufacturing environment and optimize the operator assignment

using a Pareto utility discrete differential evolution algorithm. However, none has a scope

similar to ours in terms of using learning curves of skill components of the individual

operators. The aforementioned study by Yang et al. (2003) does acknowledge the role of

learning curves but assumes it to be negligible in their study.

Some studies, such as Malhotra et al. (1993), and Li and Cheng (1994), discuss the role of

learning on system performance. Others, such as Teyarachakul et al. (2011), consider the

role of learning on production scheduling. When it comes to considering the role of learning

on the process of workforce assignment, the literature is sparse. We were unable to find

similar work that directly uses learning curves during an optimization process for workforce

assignment. There are works, for example Nembhard and Norman (2002), followed by

Leopairote (2003), and Vidic (2008), that discuss the effects of work-sharing and job-

rotation on operator learning and forgetting. These studies come closest to our ideas and

approach. But these models aim to optimize operator learning and operator forgetting as a

result of work-sharing and job-rotation. Unlike our scope, the aforementioned works are

not motivated by revenue maximization. Nor do they use learning curves for multiple skill

components of individual operators. Therefore, our proposed methodology is unique in its

scope to enable the DM to optimally assign operators to machines based on current and

forecasted HR factors, with the aim of maximizing system revenue.

4.2. Model Development

We forecast machine output based on the factors affecting the operator working on the

machine. Learning curves are developed for the characteristics of the operator. Operator

factors as well as learning curves are then incorporated into a revenue model that

calculates the expected revenue of an operator, on a given machine, over a planning

horizon. Unlike the work presented in Chapter 3, the discussions in this Chapter are

deterministic. The possibility of machine failure is not explicitly considered but rather

incorporated in the average output. The operator assignment is solely performed based on

the machine sensitivity to HR factors, as well as the current and forecasted HR factors.

Predicting Output in Terms of Human-related Factors

Our objective is to maximize total revenue over the planning horizon. Revenue is directly

proportional to the profit per unit as well as the output of each production unit within the

system. In a manufacturing environment, we assume each machine to be a production unit

and we predict its output as a function of HR factors. The sensitivity of each machine to HR

factors, and the characteristics of the operator working on the machine, affects the output,

and therefore the revenue.

The first step in building a mathematical model to optimize operator assignment is to

develop a model to forecast the production output. One such model can be obtained using

regression analysis. In the regression equations we obtain, the dependent variable is the

hourly production output.

Operator characteristics, the independent variables in the regression equations, are considered on their own as main effects. But their interactions must also be considered if they make intuitive sense in the context of analysis. Therefore, the regression equations have a first part for variables appearing on their own and a second part for pairwise interactions. The regression equation has the following general form:

, (1)

where

: Estimated output of machine , per unit time, where , : number of

machines,

operator characteristic ,

: coefficients of the main effects, as applicable to machine ,

: coefficients of the interaction terms, as applicable to machine ,

and : Indices iterating through the various operator characteristics considered.

, , where l is the total number of operator characteristics

considered.

It is likely that a model will not have interaction terms past the second level. But if the usage of third-level, or higher, interaction terms is justified, they can be added to the model accordingly. In a case where third-level interactions need to be considered, Eq. (1) will be expressed as follows:

.
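For reference, one way to write the structure described for Eq. (1) and its third-level extension is shown below. The symbols are illustrative choices made here, not necessarily the original notation: $\hat{y}_j$ is the estimated output of machine $j$ per unit time, $x_i$ is operator characteristic $i$, $\beta$ denotes the intercept and main-effect coefficients, and $\gamma$ and $\delta$ denote the second- and third-level interaction coefficients.

```latex
% Second-level (pairwise) form, following the description of Eq. (1)
\hat{y}_j = \beta_{0j} + \sum_{i=1}^{l} \beta_{ij}\, x_i
          + \sum_{i=1}^{l-1} \sum_{k=i+1}^{l} \gamma_{ikj}\, x_i x_k ,
\qquad j = 1, \dots, n

% Extension with third-level interaction terms
\hat{y}_j = \beta_{0j} + \sum_{i=1}^{l} \beta_{ij}\, x_i
          + \sum_{i<k} \gamma_{ikj}\, x_i x_k
          + \sum_{i<k<q} \delta_{ikqj}\, x_i x_k x_q
```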

Learning Curves

Since operators learn, their skills can change over time. The differences in learning

among the operators need to be considered in the work assignments. Consider operator A,

who has slightly higher initial skill at time zero, and operator B, who has lower initial skill

but a steeper learning curve. Over the planning horizon, total production by operator B

could be higher than by operator A (Figure 4.3). The area under the curve represents total

production over the planning horizon; therefore, operator B should be assigned to the

higher priority machine for the system to gain the additional production output.
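The comparison between operators A and B can be illustrated numerically. The sketch below uses made-up power-form skill curves and a made-up linear output equation (none of these numbers come from the case study) and integrates hourly production over a 480-hour planning horizon, showing how a steeper learner can overtake a stronger starter.

```python
def trapezoid(f, a, b, n=4800):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

# Hypothetical skill curves: score as a function of hours m on the machine.
def skill_a(m):
    return 62.0 + 0.5 * m ** 0.4   # higher initial skill, slow learner

def skill_b(m):
    return 55.0 + 2.0 * m ** 0.4   # lower initial skill, steeper curve

def hourly_output(skill):
    """Hypothetical linear regression: units per hour from a skill score."""
    return 25.0 + 0.12 * skill

HORIZON = 480.0  # hours in the planning horizon

# Area under each hourly-production curve = total production.
total_a = trapezoid(lambda m: hourly_output(skill_a(m)), 0.0, HORIZON)
total_b = trapezoid(lambda m: hourly_output(skill_b(m)), 0.0, HORIZON)

print("A starts at", hourly_output(skill_a(0.0)), "units/hour")
print("B starts at", hourly_output(skill_b(0.0)), "units/hour")
print("Total A:", round(total_a), "Total B:", round(total_b))
```

Under these assumed curves, operator A produces more per hour at the start, yet operator B accumulates more output over the horizon, which is the situation the paragraph above describes.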

Figure 4.3: Quarterly production of two operators, with different projected learning curves

Revenue Model Using Regression Equations and Learning Curves

Over the course of the planning horizon, the revenue model for each machine must

include the expected hourly production of each operator over the period. This is achieved

by incorporating the learning curves into the regression equations. When using Eq. (1), rather than one input value for each characteristic variable, we use the integral of the learning curve for that particular characteristic. By calculating the integral of Eq. (1), with the learning curves built in as previously described, we effectively sum the production output over the planning horizon.

Output is forecasted over the planning horizon using regression equations which are

linear. But the learning curves embedded in these linear equations are non-linear. Skill

evolves and the evolution is represented by the learning curve equations. But regardless of

the form of the learning curves, they serve to provide the regression equation with one

value of skill at a particular instant in time. Therefore, there is no issue in embedding a non-

linear model within an analysis that is linear as a whole.

[Figure 4.3 plot: Hourly Production (31 to 37) against Time (hours), 0 to 480, with one production curve each for operator A and operator B.]

Over the next planning horizon, [ ], the expected revenue, U, of operator s

working on machine is represented by the following:

, (2)

: Sale price of units produced by machine ,

: Time variable within the planning horizon .

: Learning curve equation of operator s, a function of , and .
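A minimal numerical sketch of the revenue calculation in Eq. (2) follows. Only the structure comes from the text: the sale price multiplied by the integral, over the planning horizon, of the regression output evaluated at the learning-curve values. The function name, the coefficient dictionary, the learning curves, and the experience value are hypothetical placeholders, and the trapezoidal rule is one convenient choice of integration method.

```python
def expected_revenue(price, coeffs, experience, social_curve,
                     analytical_curve, t_start, t_end, n=4000):
    """Price times the integral of hourly output over [t_start, t_end].

    coeffs maps term names to regression coefficients; missing terms
    count as zero. E is treated as constant over the horizon, while S and
    A are read from the learning curves at each instant t.
    """
    def hourly_output(t):
        e = experience
        s = social_curve(t)
        a = analytical_curve(t)
        return (coeffs.get("intercept", 0.0)
                + coeffs.get("E", 0.0) * e
                + coeffs.get("S", 0.0) * s
                + coeffs.get("A", 0.0) * a
                + coeffs.get("ES", 0.0) * e * s
                + coeffs.get("EA", 0.0) * e * a
                + coeffs.get("SA", 0.0) * s * a)

    # Composite trapezoidal rule over the planning horizon.
    h = (t_end - t_start) / n
    total = 0.5 * (hourly_output(t_start) + hourly_output(t_end))
    for i in range(1, n):
        total += hourly_output(t_start + i * h)
    return price * total * h

# Hypothetical machine and operator, for illustration only.
example_coeffs = {"intercept": 30.0, "A": 0.08, "ES": 0.001}
revenue = expected_revenue(
    price=90.0,
    coeffs=example_coeffs,
    experience=85.0,
    social_curve=lambda t: 60.0,                              # flat curve
    analytical_curve=lambda t: min(100.0, 12.0 * t ** 0.27),  # power curve
    t_start=1440.0,
    t_end=1920.0,
)
print(round(revenue))
```

Because each learning curve only supplies a skill value at an instant, the regression itself stays linear in its inputs even when the curves are non-linear.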

4.3. Optimization Model

Revenue is maximized over the next planning horizon by using the revenue model expressed in Eq. (2) to assign operators to each machine. We take a mathematical programming approach within the context of assigning a number of operators to a set of machines, given a set of criteria and an objective function. We define this objective function as the total revenue over the period and our aim is to maximize it. As previously stated, we make the assumption that every unit produced can be sold. The binary decision variable equals one if a given operator is assigned to a given machine (making that machine's product) and is zero otherwise. This

model is a simple assignment problem where m jobs are assigned to m individuals

(Emrouznejad et al., 2012). In our case, each operator is assigned to one machine, and each

machine gets assigned the appropriate number of operators, .

max

s.t.

By solving the above linear programming problem, we can assign individual operators to

specific machines to optimize the departmental revenue over the planning horizon.
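The assignment model described above can be written out as follows. This is a sketch of the structure only; apart from $U$ and the operator index $s$, which appear in the text, the symbols are assumptions introduced here: $j$ indexes machines, $x_{sj}$ is the binary decision variable, and $n_j$ is the number of operators required by machine $j$.

```latex
\max \; \sum_{s} \sum_{j} U_{sj}\, x_{sj}

\text{s.t.} \quad
\sum_{j} x_{sj} = 1 \quad \forall s
\qquad \text{(each operator is assigned to one machine)}

\sum_{s} x_{sj} = n_j \quad \forall j
\qquad \text{(each machine receives its required number of operators)}

x_{sj} \in \{0, 1\} \quad \forall s, j
```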

4.4. Empirical Study

Once again, we use the case study of Alpha, introduced in Section 3.6. In addition to the

case study details previously provided, the following facts are pertinent to the discussions in

this chapter:

▫ Product demand is high and all gears produced can be sold. Therefore, the decision

maker should aim to produce the highest number of gears, while considering the sale

price of each gear.

▫ The machine operators are assigned to the machines on a “random” basis. The operators

have transferred into Alpha at different times and each one was assigned to a free

machine when they first joined. There was no particular policy in the assignment. They

do not switch machines in the duration of our study. There is no personnel turn-over in

Alpha during our analysis. As an example, the operator who is assigned to the Ring gear

machine works on the Ring gear machine every week.

▫ There are nine operators on the three shifts and the three Kappa machines. Operators

are assigned to machines, but not to shifts. They go through a weekly shift rotation.

▫ The Kappa machines are almost identical, the only difference being the external tooling

for the gear they produce. The operational procedures are the same and if a DM decides

to transfer an operator from one line to the next, the adjustment time for the operator is

negligible.

▫ There are no union rules prohibiting management from transferring operators from line

to line and the operators have no preference for a particular Kappa machine.

4.4.1. Predicting Output in Terms of Human-related Factors

The production data for one of the machines, Pinion, is incomplete. Therefore, it is

omitted from the output analysis. Its data is complete for the operator skill assessments;

therefore, it is included in the analysis of operator skills where overall averages are

calculated. Table 4.1 displays the coefficients of the variables that form the regression

equations for each of the machines. Based on Eq. (1), the general form of the regression

equation for Alpha is the following:

.

Coefficient estimates not significantly different from zero for all machines have been

omitted. Similar to our model building approach in Chapter 3, we primarily use a backward

selection model and complement it with AIC. Components of skill (Experience, E; Social, S;

Analytical, A) and their interactions are the only ones appearing in the regression models;

shift indicators are not found to be significant. To validate the obtained equations, we have

looked at R² values to assess model fit and checked for linearity of residuals, homoscedasticity, independence of errors, influential cases, and the absence of perfect multicollinearity. We also split the data to cross-validate the models. The analysis appears

in Appendix D. Details of the data set appear in Appendix G.

Driven 18.664 0 0.051 0.214 0 0 0

Drive 29.935 0 0 0.086 0.001 0 0

Ring 23.721 0.090 0 0 0 0 0.001

Table 4.1: Regression Equation Coefficients, significant at p < 0.01

4.4.2. Learning Curves

The operators at Alpha have skill assessments on a quarterly basis in the nine-month

duration of our study. Based on these assessments, learning curves for each operator’s skill

component are developed by fitting curves to the historical skill scores. At the time of this

analysis, each operator has been assessed three times. Each assessment round consists of

two expert evaluations which are done one to three weeks apart. Therefore, for each

individual, we can fit a curve using six points.

For the analytical skill, we expect everyone to learn and improve their technical skills as they spend more time working on the machines. A few operators have already achieved the highest level of expertise, as measured by our data collection tools. For these operators, the curve is replaced by the score of 100. We have chosen the power form for the learning curves as it is the most common one in the literature for groups (Argote et al., 1995).
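Fitting a power-form learning curve to six assessment points can be sketched as below. The text does not state which fitting procedure was used, so this example assumes an ordinary least-squares fit on the log-log scale; the function name and the six data points are hypothetical, and the reported R² is computed on the log scale.

```python
import math

def fit_power_curve(hours, scores):
    """Fit score = a * hours**b by linear least squares on logarithms.

    Returns (a, b, r_squared), where r_squared is measured on the log scale.
    """
    xs = [math.log(m) for m in hours]
    ys = [math.log(z) for z in scores]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    log_a = mean_y - b * mean_x
    ss_res = sum((y - (log_a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return math.exp(log_a), b, 1.0 - ss_res / ss_tot

# Hypothetical operator: three assessment rounds, two evaluations each.
hours = [40, 120, 520, 600, 1000, 1080]
scores = [31.0, 42.0, 60.0, 63.0, 72.0, 71.0]

a, b, r2 = fit_power_curve(hours, scores)
print(f"z = {a:.3f} * m^{b:.4f}, R^2 = {r2:.4f}")
```

A non-linear least-squares fit on the original scale would weight the points differently and can give slightly different parameters.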

In general, for an environment like that of Alpha, where all operators are between 40

and 60, it is best to consider the effect of aging. Learning may not occur as rapidly as it

might in an environment with a much younger workforce. There may also be quicker

forgetting effects, especially if an operator is on an extended leave for medical reasons.

However, in our analysis, we assume forgetting is negligible because the operators are

working on the machines every week. We also assume the trend in learning is continuous

along the path of a power curve. It may stabilize and plateau, but this is the case with

operators of all age groups.

As mentioned in the skill assessment process in Section 3.6.1, the twelve operators are

assessed twice within a month during each of the three previous quarterly rounds of skill

evaluation. The validity of the power form for the learning curves for analytical skill is

reinforced by fitting a curve to the average score of all operators in each week. Each point on the graph presented in Figure 4.4 represents the average of all operators who were assessed in that particular week. We also show the curve fitted to these points, as well as the resulting R².

Figure 4.4: Learning curve for aggregate analytical skill scores of all operators over all weeks

[Figure 4.4 plot: Analytical Skill Score (0 to 100) against Hours of production (0 to 1500), with fitted curve $z = 10.368\,m^{0.2803}$ and $R^2 = 0.7709$.]

The non-decreasing trend of the learning curves for the analytical skill does not hold for social interaction. For this skill component, some operators experience a

downward trend. It seems that an operator may get to a certain knowledge level where he

believes no fellow operator can be of assistance in his trouble-shooting efforts. In this case,

his social interaction score will actually be lower than when he had a lower analytical

and/or experience score. The literature is very sparse on works that look at the relationship

between work-related communication and operator skill. There are prior works, such as

Allen (1977), that describe the diminishing of work-related communication with increased

physical distance of the operators. The closest works we found in the literature on the relationship between work-related communication and operator skill are those by Woodman et al. (1993) and Singh and Fleming (2010). In their analysis, they describe the relationship

between team performance and time as an inverted U. Initially, the operators learn from

each other and the creativity and expertise of the whole group is increased as a result. After

a point though, the operators become clones of each other. A downward trend begins due

to a loss of diversity (same views) and the fact that group identity rejects outside views. In

our case, we state that as time goes on, the operators gain expertise. Therefore, the

relationship described in the two abovementioned works is indirectly applicable to us. Even

though this description of the effect of communication on performance is at the team level,

it may be transferable to the social abilities of the individual operator as well.

When the data for all operators is grouped, the pattern is that of a nearly-zero-sloped

straight line (Figure 4.5). As time on the job progresses, there are individuals who interact

more. But the social skills of these operators seem to be counteracted by other operators

whose social interaction score diminishes as they spend more time on the job. Unlike the

curves for the analytical skill, we do not have a common form we can use for the social

interaction scores. This makes intuitive sense as social interaction has its roots in the

personality of the operator and all operators are different. In addition, the quality of our curve fitting is not as high as for the analytical skill, with lower R² values for the curves.

[Figure 4.5 plot: Social Interaction Score (0% to 80%) against Weeks of Production (0 to 40).]

Figure 4.5: Learning curve for aggregate social skill scores of all operators over all weeks

The learning curves obtained for Alpha’s fourth quarter appear in Table 4.2. As the

experience level can be obtained directly from employee records of the operator’s duration

in Alpha, as well as his overall exposure to gear manufacturing processes, no curve is

associated with this component. The independent variable, m, is in terms of hours of

production since the start of the study.

Operator Experience Level

Learning Curve and R2 obtained

Social Interaction Analytical Skill

1 100.00 , 100

2 71.43

3 57.14

4 100.00 ,

5 71.43 ,

6 71.43 ,

7 100.00 ,

8 85.71 ,

9 57.14 ,

Table 4.2: Learning Curves of Operators’ Skill Components

4.4.3. Revenue Model Using Regression Equations and Learning Curves

To further clarify our method of incorporating learning curves and production forecasting

models into a revenue model, we consider operator 8 working on the Drive line and

determine his expected revenue over the fourth quarter.

▫ Per Table 4.1, the regression equation for the Drive line is as follows:

, where : experience level; : social

interaction; and : analytical skill.

▫ We are interested in the production hours over the fourth quarter. With five 8-hour shifts per

week over 12 weeks, the hours in each quarter are: 5 × 8 × 12 = 480 hours/quarter.

Therefore, by the end of the third quarter, we have gone through 3 × 480 = 1440 hours.

▫ Using the learning curves for operator 8 from Table 4.2, and the sale price, = $90,

we arrive at the following expected revenue output for the fourth quarter:

.
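
The steps above can be sketched numerically. The regression coefficients and learning-curve forms of Tables 4.1 and 4.2 are not reproduced here, so `hourly_output`, `log_curve`, and all of their parameters are hypothetical placeholders chosen only for illustration. The only values taken from the text are operator 8's current scores from Table 4.4, the $90 sale price, and the 1440-hour starting point.

```python
import math

HOURS_PER_QUARTER = 5 * 8 * 12  # five 8-hour shifts per week for 12 weeks = 480

def hourly_output(experience, social, analytical,
                  b0=2.0, b_exp=0.05, b_soc=0.03, b_ana=0.08):
    """Hypothetical linear regression: units produced per hour (assumed form)."""
    return b0 + b_exp * experience + b_soc * social + b_ana * analytical

def log_curve(start_score, gain, start_hour):
    """Hypothetical learning curve: score grows with log of hours, capped at 100."""
    def curve(m):
        return min(100.0, start_score + gain * math.log(m / start_hour))
    return curve

def quarterly_revenue(price, experience, social_curve, analytical_curve,
                      start_hour, hours=HOURS_PER_QUARTER):
    """Sum the forecast hourly output over the quarter and multiply by price."""
    units = sum(
        hourly_output(experience, social_curve(m), analytical_curve(m))
        for m in range(start_hour + 1, start_hour + hours + 1)
    )
    return price * units

# Operator 8 on the Drive line, fourth quarter (hours 1441 to 1920).
revenue = quarterly_revenue(
    price=90,
    experience=85.71,                               # Table 4.4
    social_curve=log_curve(57.29, 10.0, 1440),      # gain of 10.0 is assumed
    analytical_curve=log_curve(80.15, 10.0, 1440),  # gain of 10.0 is assumed
    start_hour=1440,
)
print(round(revenue))
```

Because the coefficients are placeholders, the printed figure is not comparable to Table 4.3; the sketch only shows how the learning curves feed the regression equation hour by hour.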

4.4.4. Optimization Model and Discussion

Each machine runs three eight-hour shifts per day and thus requires three operators.

There are nine operators to be assigned to the three machines. Using Eq. (2), the operators’

expected revenue on each machine forms the objective function coefficients of the mathematical programming model.

We have also categorized the operators into three categories, “High” (H), “Medium” (M), and “Low”

(L), based on the ranking of their total expected revenue across the three machines. These figures appear in Table 4.3,

along with the ranking and categorization of the operators.

Operator Driven Drive Ring Total Rank Category

1 2,300,460 1,961,640 1,710,470 5,972,570 2 High

2 1,932,940 1,758,290 1,599,590 5,290,820 7 Low

3 1,662,920 1,581,020 1,421,930 4,665,870 9 Low

4 2,305,860 1,965,710 1,714,320 5,985,890 1 High

5 2,248,420 1,854,290 1,577,260 5,679,960 5 Medium

6 1,900,750 1,687,530 1,432,420 5,020,700 8 Low

7 2,235,670 1,881,730 1,633,030 5,750,430 4 Medium

8 2,082,570 1,816,660 1,565,930 5,465,170 6 Medium

9 2,340,990 1,881,160 1,629,640 5,851,780 3 High

Table 4.3: Operators’ Expected Quarterly Revenues
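
As a rough illustration of the assignment step, the sketch below enumerates every way of splitting the nine operators into three groups of three and keeps the split with the highest total of the Table 4.3 figures. This exhaustive search is a stand-in for the LINGO model described in this chapter, not a reproduction of it; the function name `best_assignment` is hypothetical.

```python
from itertools import combinations

# Expected quarterly revenue per operator on (Driven, Drive, Ring), from Table 4.3.
REVENUE = {
    1: (2_300_460, 1_961_640, 1_710_470),
    2: (1_932_940, 1_758_290, 1_599_590),
    3: (1_662_920, 1_581_020, 1_421_930),
    4: (2_305_860, 1_965_710, 1_714_320),
    5: (2_248_420, 1_854_290, 1_577_260),
    6: (1_900_750, 1_687_530, 1_432_420),
    7: (2_235_670, 1_881_730, 1_633_030),
    8: (2_082_570, 1_816_660, 1_565_930),
    9: (2_340_990, 1_881_160, 1_629_640),
}

def best_assignment(revenue, per_machine=3):
    """Exhaustively search all splits of the operators over the three machines."""
    operators = sorted(revenue)
    best_total, best_split = -1, None
    for driven in combinations(operators, per_machine):
        rest = [o for o in operators if o not in driven]
        for drive in combinations(rest, per_machine):
            ring = tuple(o for o in rest if o not in drive)
            total = (sum(revenue[o][0] for o in driven)
                     + sum(revenue[o][1] for o in drive)
                     + sum(revenue[o][2] for o in ring))
            if total > best_total:
                best_total, best_split = total, (driven, drive, ring)
    return best_total, best_split

total, (driven, drive, ring) = best_assignment(REVENUE)
print(total, driven, drive, ring)
```

With only 1,680 candidate splits, enumeration is cheap here. On the Table 4.3 figures the search returns a total of 17,027,410, which agrees with the $17,027,400 reported for the optimal OA up to rounding.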

4.4.5. Optimal Solution Compared to Solutions of Other Methods

The hourly production output calculated from each machine’s regression equation is

directly proportional to the skill of the operators assigned to that particular machine. Three

operators are assigned to each line to cover the three shifts. The scores for the skill

components of each operator appear in Table 4.4.

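Line | Operator | Experience | Social | Analytical

Driven gear | 1 | 100 | 67.71 | 100

Driven gear | 2 | 71.43 | 70.83 | 63.64

Driven gear | 3 | 57.14 | 40.10 | 48.87

Drive gear | 4 | 100 | 67.71 | 97.78

Drive gear | 5 | 71.43 | 66.67 | 93.80

Drive gear | 6 | 71.43 | 41.67 | 60.27

Ring gear | 7 | 100 | 53.13 | 95.85

Ring gear | 8 | 85.71 | 57.29 | 80.15

Ring gear | 9 | 57.14 | 84.38 | 82.35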

Table 4.4: Operators’ Skill Components

To date, the department has not used any decision criteria for its Operator Assignment

(OA). As previously described in Section 4.4, positions have been filled on a random basis.

The department’s quarterly revenue, based on production on 5-day weeks in 12-week

quarters, is calculated using the unit sale prices of $110, $90, and $90 for the Driven, Drive,

and Ring gears, respectively.

The revenue amount resulting from the random assignment is compared to other

assignment policies, including the figure we obtain through our optimization procedure. In

one possible approach, the DM at Alpha can use the current skill levels of the operators to

assign the operators. Under this “simple” assignment policy based on current skill rankings,

if operator A’s skill scores are higher than operator B’s, the DM assigns operator A to

machine 1, which has a higher sale price than machine 2. Using the learning curves of the

operators is unique to our approach and would not play a role in the assignment policy by

the DM in this hypothetical scenario. It is, however, used in our calculation of quarterly

revenue. In obtaining the result of our approach, we use a Linear Programming (LP)

technique to optimize the OA for the goal of revenue maximization. We use the branch-and-bound

method, embedded in LINGO, an optimization software application. Table 4.5

displays the results of the various approaches and their comparison to our optimal OA. As

can be seen, our optimal OA yields the highest quarterly revenue, exceeding the random

assignment by 4.9% and the simple skill ranking by 1.7%.

Assignment Policy | Quarterly Revenue ($) | Revenue difference with optimal OA ($) | Revenue difference with optimal OA (%)

Random | 16,232,450 | 794,950 | 4.9

Simple skill ranking | 16,743,970 | 283,440 | 1.7

Optimal OA | 17,027,400 | - | -

Worst case | 16,077,900 | 949,500 | 5.9

Table 4.5: Comparing system revenue under various OA approaches

We re-run the assignment model, without considering the effect of learning. Skill scores

start at the assessed level at the start of the fourth quarter and remain flat throughout the

entire planning horizon. System revenue is calculated to be $16,542,060. When we

compare this figure to the revenue obtained from the optimal assignment scenario,

$17,027,400, we can see a difference of $485,340, or 2.9% less revenue. This difference in

revenue is relatively important and shows that we achieve better results when we

are able to include more information in the model. The additional information included in

the model in this case is the learning effects.
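
The comparisons in this section reduce to simple arithmetic, sketched below. The revenue figures are those quoted in Table 4.5 and in the no-learning run; the helper name `revenue_gap` is hypothetical. The sketch assumes, based on the reported values, that each percentage is taken relative to the alternative policy's revenue rather than to the optimal revenue.

```python
OPTIMAL = 17_027_400  # optimal OA, Table 4.5

ALTERNATIVES = {
    "Random": 16_232_450,
    "Simple skill ranking": 16_743_970,
    "Worst case": 16_077_900,
    "Optimal OA without learning": 16_542_060,
}

def revenue_gap(alternative, optimal=OPTIMAL):
    """Return (dollar gap, percentage gap relative to the alternative)."""
    diff = optimal - alternative
    return diff, round(100 * diff / alternative, 1)

gaps = {name: revenue_gap(rev) for name, rev in ALTERNATIVES.items()}
for name, (diff, pct) in gaps.items():
    print(f"{name}: ${diff:,} ({pct}%)")
```

The percentages come out as 4.9, 1.7, 5.9, and 2.9, in line with the text. For the simple skill ranking the dollar gap computes to 283,430, whereas Table 4.5 lists 283,440; the 10-dollar difference is presumably due to rounding of the underlying figures.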

4.4.6. Sensitivity Analysis

We perform a sensitivity analysis on our model to verify its performance. There are

certain changes we can apply to the various resources used by our model. The resources to

alter are product price, skill score, and available time. Before making each change, we can

have a certain expectation of how the model should behave. We change the resource level

and check the model performance against expectations.

Sensitivity Analysis: Product Price

In our analysis thus far, we have used the sale prices of $110 for the Driven gears, and

$90 for Drive and Ring gears. We perform a sensitivity analysis on our model by changing

the prices and observing the behavior of our model. A low and a high price are selected for

each gear; in dollars, they are (90, 120), (80, 100), and (70, 90) for the Driven, Drive, and Ring gears,

respectively. Table 4.6 shows the resulting OA for the eight (2 × 2 × 2) price scenarios.

There are only two assignment scenarios when the Ring machine is assigned better

operators than the Drive machine and that is when Ring’s selling price is higher. Looking at

the regression equations for these two lines, we can see very similar coefficients and,

therefore, similar sensitivity to skill level. But Table 4.3 shows us that Drive’s performance

in general is higher than that of Ring. In most cases, Ring price is lowest among the three

products and, as a result, this machine is assigned the three lowest skilled operators. The

only exceptions to this rule are the two cases where Ring price is higher than Drive price. In

those two cases, the operators commonly assigned to Drive and Ring get their assignment

reversed. The Ring sale price is never higher than the Driven and as such, it is never

assigned the operators assigned to the Driven machine. Therefore, in the case of Ring, the

model assignments are aligned with what we may expect.

Driven price | Drive price | Ring price | Driven assignment | Drive assignment | Ring assignment | System Revenue

90 | 80 | 70 | M-M-H | H-H-M | L-L-L | 14,154,100

90 | 80 | 90 | M-M-H | L-L-L | H-H-M | 15,178,500

90 | 100 | 70 | M-M-H | H-H-M | L-L-L | 14,721,400

90 | 100 | 90 | M-M-H | H-H-M | L-L-L | 15,718,500

120 | 80 | 70 | M-H-H | H-M-M | L-L-L | 16,017,400

120 | 80 | 90 | M-M-H | L-L-L | H-H-M | 17,041,600

120 | 100 | 70 | M-M-H | H-H-M | L-L-L | 16,582,800

120 | 100 | 90 | M-M-H | H-H-M | L-L-L | 17,579,900

Table 4.6: Operator Assignment for Different Gear Prices

Furthermore, one might expect the Driven line to get assigned the three “high”

operators when its selling price is at the high end of $120 and Drive and Ring are at their

low end of $80 and $70, respectively. If we proceed with this intuition and assign the three

best operators to the Driven machine, system revenue would be $15,838,467. But this

would not account for the sensitivity of each machine to the different skill components.

Once we apply our model and consider all aspects, we get the assignment presented in

Table 4.6. With our optimal assignment scenario, system revenue is $16,017,400. This is a

higher revenue amount and therefore, the model has properly maximized system revenue.
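
A price sensitivity run of this kind can be sketched as follows. The sketch assumes that revenue is sale price times forecast output, so that the Table 4.3 figures (computed at $110, $90, and $90) can be rescaled by the ratio of new to original price before re-optimizing. The exhaustive search and the function name `optimal_total` are illustrative stand-ins, not the LINGO model used in this chapter.

```python
from itertools import combinations, product

BASE_PRICES = (110, 90, 90)  # Driven, Drive, Ring prices behind Table 4.3

# Expected quarterly revenue per operator on (Driven, Drive, Ring), from Table 4.3.
REVENUE = {
    1: (2_300_460, 1_961_640, 1_710_470),
    2: (1_932_940, 1_758_290, 1_599_590),
    3: (1_662_920, 1_581_020, 1_421_930),
    4: (2_305_860, 1_965_710, 1_714_320),
    5: (2_248_420, 1_854_290, 1_577_260),
    6: (1_900_750, 1_687_530, 1_432_420),
    7: (2_235_670, 1_881_730, 1_633_030),
    8: (2_082_570, 1_816_660, 1_565_930),
    9: (2_340_990, 1_881_160, 1_629_640),
}

def optimal_total(prices):
    """Best system revenue after rescaling Table 4.3 to the given prices."""
    scale = [p / b for p, b in zip(prices, BASE_PRICES)]
    operators = sorted(REVENUE)
    best = 0.0
    for driven in combinations(operators, 3):
        rest = [o for o in operators if o not in driven]
        for drive in combinations(rest, 3):
            ring = [o for o in rest if o not in drive]
            total = (scale[0] * sum(REVENUE[o][0] for o in driven)
                     + scale[1] * sum(REVENUE[o][1] for o in drive)
                     + scale[2] * sum(REVENUE[o][2] for o in ring))
            best = max(best, total)
    return best

# The eight low/high price scenarios of the sensitivity analysis.
scenarios = list(product((90, 120), (80, 100), (70, 90)))
for prices in scenarios:
    print(prices, round(optimal_total(prices)))
```

Under the rescaling assumption, the (90, 80, 70) scenario should come out close to the 14,154,100 listed in Table 4.6; small deviations are to be expected because the Table 4.3 entries are rounded.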

Sensitivity Analysis: Skill Scores

The values appearing in Table 4.3 are calculated based on the current skill levels and

learning curves (shown in Tables 4.4 and 4.2, respectively). Using the original gear prices,

we change the social interaction score of the operators to observe the differences, if any,

on the optimal OA and system output. There are some operators who have already reached

the maximum score (based on the current criteria) on the other skill components of

experience level and analytical skill. Therefore, the social interaction component is a

suitable one to alter as it can go in either direction at the desired step size of 5% or 10% for

all operators.

The social skill of the three operators in the “high” category is reduced by 10% and that

of the three “medium” operators is raised by 5%. We recalculate the aggregate production

figures for the six operators using the updated Social skill values. Despite the change in the

individual values, the changes are not sufficiently large to modify the ranking of the six

operators, and as a result, the medium and high categorizations of the six operators remain

unchanged. As such, one may expect the operator assignment to also remain unchanged.

As scenario 1, we assume no change in categorization should result in no change in OA.

We recalculate the system revenue using the new values of social skill, but without

changing the operator assignments. But operator categorization is not the only factor that

can affect operator assignment. We have to be cognizant of the other factors, including

production output based on current and forecasted skill, product sale price, and the

sensitivity of the machine to the skill components. Therefore, as scenario 2, we run our

model and we arrive at a modified OA, resulting in higher revenue compared to scenario 1

(Table 4.7). Therefore, the model works well in providing us the assignment that results in

the highest possible revenue. The model has captured and considered all the factors that

should determine the optimal OA.

Original Scenario Scenario 1 Scenario 2

Driven OA 5-7-9 5-7-9 4-5-9

Drive OA 2-4-6 2-4-6 1-2-6

Ring OA 1-3-8 1-3-8 3-7-8

Total Revenue ($) 17,027,400 16,800,960 16,802,360

Table 4.7: System Revenue under various OA Scenarios

It is interesting to note that the particular sensitivity analysis described above can be

used in determining the value for various programs for the operators. Examples of such

programs can be machine-related technical training to enhance analytical skill, or

motivational programs and team-building exercises targeting improved social interaction

among the operators. By estimating the effect of any program on the skill component(s) of

the operator(s), the up-front cost of the program can be compared with the resulting increase in quarterly

revenue for a cost-benefit analysis.

Consider the example where Alpha enrolls its nine operators in a team-building exercise

that is estimated to improve social interaction skill of each person by 5%. The total cost of

the exercise is $50,000. Based on the figure of $17,027,400 presented in Table 4.5, if our

model calculates the departmental revenue to be greater than $17,077,400 using the new

social skill scores, the exercise as a project has a payback of less than three months.
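
The break-even reasoning in this example can be written out as a short calculation. The function names are hypothetical, and the sketch assumes the revenue gain recurs at the same level every quarter; the $17,127,400 figure in the usage line is an invented illustration, not a model output.

```python
def break_even_revenue(baseline_revenue, program_cost):
    """Quarterly revenue at which the program pays for itself in one quarter."""
    return baseline_revenue + program_cost

def payback_quarters(baseline_revenue, new_revenue, program_cost):
    """Quarters needed to recover the cost, or None if revenue does not rise."""
    gain = new_revenue - baseline_revenue
    if gain <= 0:
        return None
    return program_cost / gain

print(break_even_revenue(17_027_400, 50_000))            # threshold quoted in the text
print(payback_quarters(17_027_400, 17_127_400, 50_000))  # hypothetical new revenue
```

Any forecast revenue above the break-even threshold gives a payback shorter than one quarter, that is, less than three months.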

Sensitivity Analysis: Available Production Time

When calculating the results of the optimal OA, we used the same duration of five 8-hour

shifts per week, for 12 weeks for all three machines, resulting in a total of 480 hours over

the planning horizon. In a hypothetical scenario, this criterion is changed to 360, 480, and

600 hours for the Driven, Drive, and the Ring gear line, respectively. The Driven gear line is

the one with the highest product sale price. We allow it to run the least number of hours

and observe the effect on OA.

Our model provides a result that makes sense; assignment is updated to provide the

highest system revenue. Under this new scenario, the Ring line has the highest production

time available. Therefore, the worst operator previously assigned to it (from the “low”

category) is replaced by an operator from the “high” category. The operator swap is done

with the Drive line rather than the Driven line because the Driven gear is sold at a higher

price ($110), compared to the Drive line ($90).

This particular sensitivity analysis may be beneficial to the DM for situations when there

is a resource shortage. Examples of such resources may be production raw material or

human operators. In such cases, the DM may want to dedicate the scarce resources to two

machines and shut down the third machine. The decision as to which machine to shut down

can come from an analysis similar to what we have done in this sensitivity analysis.

Consider an example where there is a mandate to reduce manpower by one-third for

two weeks. The DM can reduce the production time of one machine by 80 hours, reducing

the total quarterly running time of one of the three machines from 480 to 400 hours. The

total production time for the other two machines remains at 480 hours over the quarter.

Our model provides the system revenue for each of the three scenarios of Driven, Drive, or

Ring running for 400 hours. The DM would then choose the best option accordingly. This is

shown in Table 4.8:

Scenario | System revenue ($)

Driven at 400 hours, Drive and Ring at 480 hours | 15,868,900

Drive at 400 hours, Driven and Ring at 480 hours | 16,112,300

Ring at 400 hours, Driven and Drive at 480 hours | 16,127,500

Table 4.8: System Revenue under different production times

As can be seen, in this case, it makes economic sense for the DM to shut down the Ring

machine for two weeks (80 hours) and keep the Driven and Drive machines running for the entire duration.
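
The DM's choice amounts to picking the scenario with the highest system revenue. A minimal sketch using the Table 4.8 figures follows; the variable names are hypothetical.

```python
# System revenue when the named line runs 400 hours and the others 480 (Table 4.8).
REVENUE_BY_REDUCED_LINE = {
    "Driven": 15_868_900,
    "Drive": 16_112_300,
    "Ring": 16_127_500,
}

# Line whose reduction leaves the highest system revenue.
line_to_reduce = max(REVENUE_BY_REDUCED_LINE, key=REVENUE_BY_REDUCED_LINE.get)

# Margin of the best option over the runner-up.
ranked = sorted(REVENUE_BY_REDUCED_LINE.values(), reverse=True)
margin = ranked[0] - ranked[1]

print(line_to_reduce, margin)
```

The margin between reducing Ring and reducing Drive is only $15,200, so the ranking of these two options could be sensitive to small changes in the inputs.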

4.5. Concluding Remarks and Future Work

In this chapter, we have discussed the personnel assignment problem and developed a new

methodology that solves it using a linear programming approach.

The result of the mathematical programming model is superior to the current

first-come-first-assigned scenario. In developing the mathematical model, we use output

forecasting models in the form of regression equations stemming from our case study’s

data set. In these equations, output is expressed as a function of operator skill. Skill is

categorized into various components and quantified by consulting system experts in

questionnaires and through observational studies. We initially considered another human-

related factor, the effect of shift-work on the production output, in the regression analysis.

However, this effect was not found to be significant in any of the three product lines.

In addition to forecasting the production output in terms of skill, we use operators’

learning curves for each of the skill components. The learning curves are projected over the

planning horizon and incorporated into the production forecasting model. This combination

is used in the objective function of our linear programming model. The result achieved by

our optimal OA is compared to the random assignment currently in practice and a simple

linear programming technique, not considering the forecasted skill levels. Our method is

found to result in higher revenues compared to the other two methods. A sensitivity

analysis is performed to validate our model. Three

resources, product selling price, skill score, and production time, are varied. In each case,

the model adjusts the OA to maximize system revenue. As we discuss the sensitivity

analysis, the main focus is on model validation. However, we also have some discussion on

ways our proposed sensitivity analyses can be used for planning and decision making.

Beyond the general extension of considering additional HR factors, one possible

direction for future work is to apply the approach of this chapter

to other case studies in different environments. These can be non-manufacturing

environments, such as project management, where tasks are non-repetitive. In addition,

one can consider manufacturing environments where there are different machine types.

This can lead to the additional complexity of the personnel having different learning curves

on different machines. The combination of non-repetitive tasks with the presence of

different task types can lead to our consideration of operator forgetting in addition to

operator learning.

In addition, the same methodology may be pursued with objectives other than revenue

maximization, such as minimizing total maintenance cost. Differing failure rates of machines, depending on

the human operator, would lead to differing maintenance costs, and this can be a part of the

objective function. As further future work, one may consider a social experiment where the

operators are assigned to the machines based on their preference rankings. Over the course

of the planning horizon, the results of the assignment based on the operator preferences can

then be compared to the optimal assignment obtained using the model

developed in this chapter.

Lastly, an important direction for future work is the determination of the length of the planning

horizon to consider. In the work presented in this chapter, we have assumed the DM can

determine the length of the planning horizon. In the empirical study, the duration is taken

to be three months so that it is aligned with the quarterly operator assessment cycles.

However, there can be further work to provide a systematic tool for choosing an optimal

length for the planning horizon. Factors related to machine performance or operator

learning can be used to determine this optimal length.

5. EFFECTS OF OPERATOR LEARNING ON PRODUCTION OUTPUT

In Chapter 3, we analyzed a system where HR factors can affect the failure risk of the

system, and developed a model for a DM to perform a cost-benefit analysis to choose the

best intervention method for mitigating failure risk. In this chapter, we consider the same

type of human-machine system where failures can occur as a result of operator error as

well as the machine’s physical components. The purpose of our work is to aid the DM with

planning activities by providing him/her with the expected production output of each

operator. Examples of such planning activities are operator assignment, calculating the

upper bounds on production materials required, estimating maintenance resources, or

establishing operator training programs.

For a human-machine system, as operators learn, there may be two benefits: 1)

improved production output rate, and 2) reduced human error rate. Both of these factors

result in higher performance of the system. The study of the effects of human learning on

the performance of the human-machine system has received much attention; studies such

as Yelle (1979) and Dutton and Thomas (1984) state that the time to produce a single unit

continuously decreases with the processing of additional units. Operator learning certainly

plays a role in manufacturing environments and learning effects have been proven to exist

by many empirical studies (e.g. Venezia, 1985; Webb, 1994). Considering the learning of the

individual operators and using it as a variable in performance optimization can have many

benefits for an organization. Onkham et al. (2012) discuss the benefits an organization may

realize by providing training to the employees, resulting in increased skill and knowledge as

well as reducing human error. This negative relationship between knowledge and failure

rate is also discussed by Fritzsche (2012).

We aim to forecast the production output of an operator over a planning horizon,

considering the effects of learning, among other factors. The operator gains in expertise

based on the amount of time he has spent working on the machine, while the machine is

operational. If the machine is down, he is not working and consequently not gaining

experience. Therefore, his learning curve is proportional to machine uptime. Furthermore,

as the operator gains in expertise, he is less likely to make mistakes; therefore, the

probability of success (or failure) at various stages along the planning horizon is not

constant. This probability can be calculated at each stage based on the appropriate value of

operator-related factors, including those affected by learning. The analysis is performed in

intervals and the probability of failure at each interval depends on the values of the MR-

and HR-factors at the previous interval. This is the reason we choose a Markov chain

approach. A Markov chain is a stochastic process that possesses the Markov property: when

we know the present state of the process, the future development is independent of

anything that has occurred in the past (Rausand and Hoyland, 2004). In our case, we have to

use a non-homogeneous Markov chain, as operator expertise, learning, and working

conditions are functions of time.
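
In symbols, the Markov property and its non-homogeneous form can be written as below. The notation is introduced here for illustration only: $X_n$ denotes the state at interval $n$ and $p_{jk}(n)$ the transition probability from state $j$ to state $k$.

```latex
\Pr\left(X_{n+1} = x_{n+1} \mid X_n = x_n, X_{n-1} = x_{n-1}, \dots, X_0 = x_0\right)
  = \Pr\left(X_{n+1} = x_{n+1} \mid X_n = x_n\right),
\qquad
p_{jk}(n) = \Pr\left(X_{n+1} = k \mid X_n = j\right).
```

In a homogeneous chain $p_{jk}(n)$ is the same for every $n$; here it is allowed to change with $n$ because the covariates that drive the failure probability evolve over the planning horizon.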

Our Markov chain model quantitatively captures the positive effects of learning both in

terms of raised skill, leading to increased output, as well as reduced human error, leading to

decreased machine downtime. In general, machine downtime can be caused by MR or HR

factors. There are many reliability and failure risk analysis models that deal with the

machinery. However, there are few that incorporate the role of human operators on

uptime and overall performance. To obtain the probability of machine failure due to both

MR- and HR-factors, we use the PHM, due to the merits and applications discussed in

Section 3.3.

There can be numerous applications for calculating the expected value of an operator’s

production output, considering the effects of learning. Similar to Chapter 4, we consider

operator assignment as one possible application of the discussions in this chapter. We can

calculate the expected production output for each operator on each machine and use the

results as input factors into the objective function of an operator assignment programming

model. By optimizing this model, we maximize the system performance, thus maximizing

system revenue.

Connection to Previous Chapters

This chapter uses elements from both Chapters 3 and 4. In order to calculate the

probability of machine failure at the various stages along the planning horizon, we use a

PHM which includes HR covariates. This follows from our work in Chapter 3. Over the length

of the planning horizon, we do not expect the HR factors such as skill to remain static; we

use learning curves to capture the effect. Lastly, our model provides us with the expected

number of operational time intervals the machine will experience. We then use regression

equations to forecast the level of output based on the number of operational intervals. The

learning curves and the regression equations follow from our work in Chapter 4.

Unlike Chapter 3, and similar to Chapter 4, the work presented in this chapter has a long-term

focus for decision-making. The empirical work presented at the end of the chapter has

a planning horizon of three months.

Main Contributions of this Chapter

1. We use a Markov chain approach to forecast production output, considering operator

learning.

2. The work we present can add value in the interface of operations management and

human resource management. This method can provide a return-on-investment

analysis for training cost vs. additional revenue. The additional revenue would be gained

as a result of more output produced by the operators whose skill is improved due to

training.

5.1. Literature Review

When one aims to analyze the effects of human-related factors on failure risk analysis of

human-machine systems, there are numerous studies such as those using human reliability

analysis techniques (Cacciabue, 2005; Chang and Wang, 2010) or failure mode and effects

analysis (Pillay and Wang, 2003; Seyed-Hosseini et al., 2006). However, when this analysis is

to be used for equipment uptime or system performance analysis rather than just risk

management, the literature is sparse. Horberry et al. (2010) discuss human factors and their

effects on operations and maintenance in a mining context but do not attempt failure

prediction. Similarly, Kolarik et al. (2004) develop a model to monitor and predict an

operator’s performance using a fuzzy logic-based assessment. But the purpose of their work

is solely to provide a human reliability assessment, without providing any methods for risk

reduction. Blanks (2007) discusses the need for improving reliability prediction, paying

special attention to human error causes and prevention, but does not mention any

predictive techniques for human reliability. Carr and Christer (2003) and Dhillon and Liu

(2006) focus on the maintenance workforce performing repair work at times when

machines are not being used for production purposes. Reer (1994) discusses human

reliability in emergency situations, not regular production. There are works in the literature,

such as those by Peng and Dong (2011) and Iakovou et al. (1999), that use a Markov chain

approach for uptime prediction. However, none use human-related factors in their failure

risk analysis and the calculation of transition probabilities. Therefore, the focus of all

aforementioned works differs from ours in that we aim to predict the uptime of production

equipment, based on the analysis of the risk of failure stemming from the human operator.

In addition to the scarcity of the previous works analyzing the performance of human-

machine systems from a human perspective, there are even fewer that do so while

considering human learning. Biskup (2008) performs a state-of-the-art review on the effects

of learning on production scheduling. Li and Cheng (1994) and Teyarachakul et al. (2011)

also study production scheduling and consider the effects of both learning and forgetting.

But neither study has a focus on failure risk analysis; nor do they focus on effects of learning

on decreased human error rate and improved system performance.

Malhotra et al. (1993) discuss the role of learning on system performance. But the

discussions are based on optimizing the cross-training of employees to strike a balance

between flexibility and throughput loss due to forgetting and training. Similar to the

aforementioned studies in the area of scheduling, this study does not have a detailed focus

on failure risk analysis or increased production output as a result of operator learning.

Similarly, there are works by Nembhard and Norman (2002), Leopairote (2003), and Vidic

(2008) that discuss the effects of work-sharing and job-rotation on operator learning and

forgetting. But these models aim to optimize operator learning and forgetting as a result of

work-sharing and job-rotation. Unlike our scope, the aforementioned works do not attempt

to predict equipment uptime using a failure risk analysis based on human-related factors.

Nor do they use learning curves for skill levels of individual operators. Adamides et al.

(2004) consider human learning, among other factors, to lead to increased productivity of

maintenance activities. However, they do not distinguish among the various individuals and

consider the same rate of productivity improvement for all individuals based on a certain

duration working on the particular product.

5.2. Markov Chain Approach

We aim to calculate the expected production output of an operator on a machine, over a

planning horizon. To achieve this, we need to know the possible states of the system at the

end of the planning horizon, along with the state probabilities and their corresponding

production output levels.

To turn our problem into a feasible one to solve, we take the simplifying yet realistic step

to discretize the planning horizon into individual time intervals; the intervals are then

analyzed using a non-homogeneous Markov chain. For each of the resulting N intervals, the

probability of failure/survival is calculated. We start from an initial condition with a certain

machine working age and an initial level of operator expertise. At the first interval, there

can only be one of two outcomes: failure or survival. There is a transition probability

associated with each outcome and this probability is calculated using a PHM. Once again,

the fact that the model accommodates the inclusion of HR covariates, in addition to MR

factors, makes the PHM a suitable model for our analysis.

If the machine survives one interval, it reaches the beginning of the next one and again

faces the two outcomes of survival or failure. However, if the machine fails over an interval,

it must remain in repair for a certain period of time, D. We make two assumptions about D.

The first is a simplifying assumption, stating that D is fixed, regardless of the type of failure.

A realistic example is to set D equal to the MTTR. The second assumption is that the

machine is brought back to zero-age following the repair. Therefore, in the general

framework of our approach, multiple states are possible at each time interval. Each state

can lead to at least one path, for repair, and at most two paths, for failure/survival. There is

a transition probability associated with the paths leading from each state.

Hence, the state space of the Markov chain is represented by a three-dimensional

vector: , where . Given the range of these

variables, as well as the fact that in this chapter, we are analyzing a specific planning

horizon, both parameter space and the state space are discrete and finite. For an interval

, the values of the three variables are considered at the instant in time

immediately before n. This is further explained in the next paragraph. The quantity

n , is a discrete indexing parameter and can be thought of as the

machine’s global age, calculated from the start of the planning horizon. There is a set of

initial conditions that apply at .

The first variable, i, expresses the cumulative number of time units that the machine has

been operational since the start at $n = 0$. This determines the appropriate PHM covariate

level, used to calculate the hazard rate. Only covariates whose values can be determined by

a degradation-type model can be considered in our model. In the case of operator

expertise, the values are forecasted using learning curve equations. It is this effect of

learning, reducing human error rate over time, which makes our process a non-homogeneous

Markov process. The second variable, a, provides the machine working age,

again a necessary factor in the calculation of the hazard rate. At each failure, a is reset to

zero and remains at zero during repair time. The last variable, d, states the remaining repair

time. A positive d indicates the continuation of the repair procedure; the machine will not

be operational over the next time unit. According to the convention we have chosen, the

range of values for d is $0 \le d \le D - 1$. If we experience a failure at $n - 1$, by the time we

reach the end of the interval at n, we have already had 1 time unit of repair completed. This

is due to our assumption that the failure occurs at the beginning of the interval $[n-1, n]$.

Therefore, $D - 1$ time units remain. This approach slightly underestimates production

because the production output up to the instant of failure within the time interval is

ignored. An alternative approach is possible where we may consider the failure to occur at

the end of the time interval. This approach, however, slightly overestimates the production

output. In this chapter, we choose the former approach and consider the effect to be

insignificant when the individual time interval chosen is short.


The possible transitions for the individual variables are $i \to \{i, i+1\}$, $a \to \{0, a+1\}$, and $d \to \{0, d-1, D-1\}$, but not all combinations of these individual transitions are possible at any n. Figure 5.1 displays a state space diagram for an example where $D = 3$ for a planning horizon of $N = 7$. As evident, each state can be sufficiently described by relying strictly on the previous state(s) leading to it.
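The transition rules described above can be illustrated with a short sketch. It is not taken from the thesis; the function name `successors` and the dictionary labels are hypothetical, and the rules encoded are only those stated in the text (fixed repair duration D, failure assumed at the start of an interval, working age reset to zero on failure).

```python
def successors(state, D):
    """Return the states reachable in one step from (i, a, d).

    i: cumulative operational time units, a: machine working age,
    d: repair time units remaining, D: fixed repair duration (D >= 1).
    """
    i, a, d = state
    if d > 0:
        # Repair continues: one more time unit of repair is consumed.
        return {"repair": (i, 0, d - 1)}
    # Machine is operational: it either survives or fails over the interval.
    return {
        "survival": (i + 1, a + 1, 0),
        "failure": (i, 0, D - 1),
    }


# Example with a repair duration of three time units.
print(successors((0, 0, 0), D=3))
print(successors((0, 0, 2), D=3))
```

An operational state therefore has exactly two successors and a repair state has exactly one, which is the "at least one path, at most two paths" property noted earlier.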

Figure 5.1. Typical state space for $D = 3$ and $N = 7$ (diagram omitted; each node is a state $(i, a, d)$ and each arc is a Survival, Failure, or Repair transition)

The state probability at $n$ is based on: 1) all the states at $n - 1$ that can lead to the state at $n$; and 2) the path from all these previous states to the state at $n$. At the initial moment, for simplicity, we assume an operational machine, with zero working age, and operators with an initial set of skills but no work experience. The more general case, where i and a are non-zero, is presented in Appendix F.

Let $p(i, a, d \mid n)$ be the probability of stage n having i operational intervals thus far, on a machine that has a current working age of a, with d time units remaining in the repair interval. The initial condition at $n = 0$ is expressed as follows:

\[
p(i, a, d \mid 0) =
\begin{cases}
1, & i = a = d = 0, \\
0, & \text{otherwise.}
\end{cases}
\]

We can progress recursively to calculate the state probability at each future state. The

recursive function expresses the state and the transition probability. The transition

probability of failure, q, is calculated using the PHM discussed in the previous Section. The

factors pertinent to the PHM are the number of operational time units, i, (affecting

operator skill), as well as working age, a. Global age, n, does not play a direct role in the

calculation of the hazard rate. As a result, $q(i, a, n)$, representing the probability of

failure while transitioning from a state at n, with conditions $i$ and $a$, to a state at n+1, is

expressed as $q(i, a)$. The recursive formula representing the transition probabilities of the

proposed Markov chain is as follows:

\[
p(i, a, d \mid n) =
\begin{cases}
p(i, 0, d+1 \mid n-1), & a = 0,\ 0 \le d < D-1, \\
\sum_{b=0}^{i} q(i, b)\, p(i, b, 0 \mid n-1), & a = 0,\ d = D-1, \\
\left[1 - q(i-1, a-1)\right] p(i-1, a-1, 0 \mid n-1), & a > 0,\ d = 0,
\end{cases}
\qquad 0 \le a \le i \le n \le N,
\]

where $q(i, a)$ is the probability of failure, calculated as follows, using the discussions in Section 3.4:

,
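As an illustration of how a per-interval failure probability can be obtained from a PHM, the sketch below assumes a Weibull baseline hazard and covariates held constant within each interval. Both are assumptions made here for the example, and the function and parameter names (`failure_probability`, `gamma`, `beta`, `eta`) are hypothetical; the exact expression used in Section 3.4 may differ.

```python
import math


def failure_probability(a, covariates, gamma, beta, eta, dt=1.0):
    """Conditional probability of failing in (a, a + dt], given survival to age a.

    Assumes a Weibull baseline hazard h0(t) = (beta / eta) * (t / eta) ** (beta - 1)
    and covariate values that stay constant over the interval.
    """
    # Multiplicative covariate effect exp(sum of gamma_k * z_k).
    risk = math.exp(sum(g * z for g, z in zip(gamma, covariates)))
    # Increment of the cumulative baseline hazard over the interval.
    dH = ((a + dt) / eta) ** beta - (a / eta) ** beta
    return 1.0 - math.exp(-risk * dH)


# Hypothetical numbers: a negative coefficient on a skill score lowers the risk.
q_low_skill = failure_probability(5, [40.0], [-0.02], beta=1.5, eta=100.0)
q_high_skill = failure_probability(5, [80.0], [-0.02], beta=1.5, eta=100.0)
print(q_low_skill, q_high_skill)
```

In the Markov chain, the covariate values passed in would be the skill levels forecast by the learning curves after i operational time units, which is why the resulting probability depends on both i and a.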

The first term of the recursive formula represents the ‘repair scenario’. In this scenario,

the working age, a, is reset to zero and remains at zero until the repair is completed and the

machine is operational again. During the repair period which lasts D time units, the value of

d is reduced by 1 time unit as the machine progresses over subsequent stages.


The second term of the recursive formula represents the ‘failure scenario’. This is where

the machine starts a new repair interval. As such, at stage n, working age, a, is set to zero

and repair time remaining is set to $D - 1$. This means the machine is operational just before

stage $n$ and can be in any state $(i, b, 0)$, with $0 \le b \le i$, with probability $p(i, b, 0 \mid n-1)$, and fails in

$[n-1, n]$ with probability $q(i, b)$.

The last term of the recursive function represents the ‘survival scenario’. The fact that

the machine is operational at $n$, with $a > 0$ and $d = 0$, means that in the previous stage,

$n - 1$, the machine was operational or it had just finished a repair interval. Therefore, the

state probability at $n - 1$ is $p(i-1, a-1, 0 \mid n-1)$ and the machine did not fail over

$[n-1, n]$ with probability $1 - q(i-1, a-1)$.

We continue to use the recursive equations over the entire planning horizon until we

calculate the probabilities of all possible states at stage N, the end of the planning horizon.

We have obtained some properties of the recursive formula that can reduce the

calculations required over the entire horizon. These properties are presented in Appendix E.

At the final stage, $n = N$, we are interested in the expected output value at each level of

i. This is calculated by multiplying the state probability and the output level at the particular

i value. The output level, given i operational time units, can be calculated by a forecasting

method, such as regression. In doing so, operator characteristics can be considered as

independent variables in the regression equations. The machine’s production output level,

, represents the dependent variable, given expertise gained over i previous operational


time units. represents the production output for a single time unit only. If there is no

learning involved, can be represented by a constant value, .

We introduce to represent the total production output over i operational time units

by . Then in the case of a constant value, , the total expected output

over the horizon, , would be the product of and the expected number of total

operational time units over the planning horizon, , that is .

But we do consider operator learning in our analysis. This means we may have cases

where , for any i. To calculate each , we use the learning curves to

forecast the expertise levels for the appropriate number of operational time units, i, that

has helped the operator gain knowledge.

At N, we are only interested in i. The probabilities of the possible states for each i are

summed, . Then we can calculate the expected number of

operational time units over planning horizon: . This probability

distribution, , along with the output level, can be used to calculate the expected

total output value for a particular operator on a certain machine, over the planning horizon:

. It should be noted that in addition to calculating the expected

value, a decision maker may be interested to calculate some other characteristics such as

the variance of each operator’s production over the horizon, . A

smaller variance is certainly desirable for planning purposes. All else being equal, an

operator with a lower production variance is more desirable due to the resulting stability.
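The end-of-horizon quantities described above (the distribution of i, the expected number of operational time units, the expected total output, and its variance) can be sketched as follows. The names `output_statistics` and `unit_output` are hypothetical, and indexing the per-unit output by the j-th operational time unit (j = 1, 2, ...) is an assumption made for this example.

```python
def output_statistics(final_probs, unit_output):
    """Expected uptime, expected total output, and variance of total output.

    final_probs: dict mapping (i, a, d) -> probability at the final stage.
    unit_output(j): output of the j-th operational time unit, j = 1, 2, ...
    """
    # Marginal distribution of i: sum over working age a and repair time d.
    p_i = {}
    for (i, _a, _d), p in final_probs.items():
        p_i[i] = p_i.get(i, 0.0) + p

    # Cumulative output after i operational units, built incrementally.
    cumulative = [0.0]
    for j in range(1, max(p_i) + 1):
        cumulative.append(cumulative[-1] + unit_output(j))

    expected_uptime = sum(i * p for i, p in p_i.items())
    expected_output = sum(cumulative[i] * p for i, p in p_i.items())
    second_moment = sum(cumulative[i] ** 2 * p for i, p in p_i.items())
    variance = second_moment - expected_output ** 2
    return expected_uptime, expected_output, variance


# Toy distribution after a single stage, with a constant output of 10 per unit.
toy = {(0, 0, 2): 0.1, (1, 1, 0): 0.9}
print(output_statistics(toy, lambda j: 10.0))
```

With learning, `unit_output` would instead evaluate the regression equation at the skill levels forecast by the learning curves, so later operational units contribute more than earlier ones.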


The calculation of may be done in an alternate way. The aforementioned

calculation method uses , which is a cumulative term. At times, it may be easier or

more intuitive, to use the individual . When we think of the discretized planning

horizon, we see take on the following form:

.

This can be expressed as follows:

, where

and represents running at

least j intervals. For example, the term represents the production of parts in

two time intervals, multiplied by the probability of running at least two time intervals. For

the case with non-zero initial conditions, please refer to Appendix F.
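The equivalence between the cumulative and the per-interval forms of the expected output can be written out as a short derivation. The symbols here are assumed for illustration, since the original notation is not recoverable from the text: $P_N(i)$ is the probability of i operational time units at the end of the horizon, $o(j)$ is the output of the j-th operational time unit, $C(i) = \sum_{j=1}^{i} o(j)$ is the cumulative output, and $I$ is the random number of operational time units.

```latex
\begin{align*}
\sum_{i=0}^{N} P_N(i)\, C(i)
  &= \sum_{i=0}^{N} P_N(i) \sum_{j=1}^{i} o(j) \\
  &= \sum_{j=1}^{N} o(j) \sum_{i=j}^{N} P_N(i) \\
  &= \sum_{j=1}^{N} o(j)\, \Pr(I \ge j).
\end{align*}
```

The second line follows from exchanging the order of summation: the output of the j-th operational unit is counted once for every outcome in which the machine runs at least j intervals.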

5.3. Empirical Study

Once again, we apply the discussions in this chapter to the case study of Alpha. There can be many applications for the model developed in this chapter, such as production output maximization and maintenance cost minimization. Similar to the discussions in Chapter 4, we consider the application of operator assignment optimization, with the aim of production output maximization. The Markov chain model, and the resulting operator assignment, is a more detailed, and thus more accurate, method to follow for operator assignment.


In Section 3.6.6 in Chapter 3, using the extended data set of Alpha, we obtained the

following PHM:

.

In addition to the PHM obtained earlier, we obtained regression equations in Chapter 4 to

forecast the production output, in terms of operator skill level. We express Eq. (1) in

Chapter 4 as follows, in order to incorporate the role of i, one of the dimensions of our

Markov chain states:

,

where

: Output per unit time for machine , given i operational time units

, : number of machines

: coefficients of the main effects, as applicable to machine ,

: coefficients of the interaction terms, as applicable to machine ,

and : Indices iterating through the various operator characteristics considered.

, , where l is the total number of operator characteristics

considered.

Represents the value of operator characteristic , after i operational periods


The actual regression equations remain unchanged, as expressed earlier in Table 4.1

and repeated here:

Driven 18.664 0 0.051 0.214 0 0

Drive 29.935 0 0 0.086 0.001 0

Ring 23.721 0.090 0 0 0 0.001

Table 5.1: Regression Equation Coefficients, significant at p < 0.01

Lastly, we will use the equations for the operator learning curves, as described earlier in

Table 4.2. These equations are displayed in Table 5.2; however, there is an additional

column here for the “Experience” component of operator skill. In Table 4.2, Experience was

expressed as a constant number, based on operator records of time in the department and

previous work in gear manufacturing. In Table 5.2, we express Experience using a linear

function, based on the operator experience level at the beginning and the end of the third

quarter.


Operator Experience Level (1) Learning Curve and R2 obtained (2)

Social Interaction Analytical Skill

1 100 , 100

2 35.73+0.0298m 69.79

3 21.45+0.0298m 41.11

4 100.00 ,

5 35.73+0.0298m ,

6 35.73+0.0298m ,

7 100.00 ,

8 50.02+0.0298m ,

9 21.45+0.0298m ,

Table 5.2: Learning curves of operators’ skill components

Note (1): at the start and the end of the planning horizon, the experience level of the operator is known, with

certainty, based on his hiring date and transfer date into Alpha. A linear equation is used to forecast values in

between the end points, necessary for the Markov chain model.

Note (2): each equation is estimated from six points. In the cases where L is represented by a number, either the

operator’s scores did not differ across the six points, or the operator has reached the maximum value of 100

and has been capped.

The planning horizon considered is the fourth quarter and the unit of time is hours.

Therefore, for each operator, based on 24-hour days, 5-day weeks, and 12-week

quarters. The model is developed using an MTTR of three hours: $D = 3$. Applying the

recursive function developed in Section 5.2, and using the PHM, regression equations, and

learning curves developed for Alpha, we calculate the expected production output of each

operator on each Kappa machine (Table 5.3).


Operator #   Expected Production Output
             Driven machine   Drive machine   Ring machine
1            20,926           21,821          19,030
2            17,223           19,636          16,919
3            14,888           17,616          15,109
4            20,768           21,752          18,973
5            20,125           20,711          17,747
6            17,929           20,478          17,623
7            20,211           20,890          18,146
8            18,621           20,256          17,622
9            20,266           20,782          17,957

Table 5.3: Expected production output for each operator on each machine

As an application of our model, we perform an operator assignment optimization. We

take a mathematical programming approach and define the objective function as the total

revenue over the period. Revenue is maximized over the planning horizon by optimally

assigning operators to machines. The binary decision variable equals one when a given operator is

assigned to a given machine (making the corresponding product) and is zero otherwise. This model is a simple OA

problem with each operator being assigned to one machine, and each machine having the

correct number of operators, 3, to run three shifts per day.
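For a problem of this size, the assignment model can also be explored by exhaustive search, since nine operators split into three crews of three give only 1,680 distinct assignments. The sketch below is an illustration of the objective and constraints described above, not the LINGO model used in the thesis; the name `best_assignment` and the toy data are hypothetical.

```python
from itertools import combinations


def best_assignment(output, prices, crew_size):
    """Exhaustively search operator-to-machine assignments.

    output[o][m]: expected output of operator o on machine m.
    prices[m]: sale price per unit of machine m's product.
    crew_size: number of operators required on each machine.
    Returns (best_revenue, assignment), where assignment[m] is a tuple of operators.
    """
    n_machines = len(prices)

    def search(machine, remaining):
        if machine == n_machines:
            return 0.0, []
        best = (float("-inf"), None)
        for crew in combinations(sorted(remaining), crew_size):
            revenue = prices[machine] * sum(output[o][machine] for o in crew)
            rest_revenue, rest = search(machine + 1, remaining - set(crew))
            if rest is not None and revenue + rest_revenue > best[0]:
                best = (revenue + rest_revenue, [crew] + rest)
        return best

    return search(0, frozenset(range(len(output))))


# Toy example: two operators, two machines, one operator per machine.
toy_output = [[10, 1], [2, 8]]
print(best_assignment(toy_output, prices=[1.0, 1.0], crew_size=1))
```

For the case study, `output` would hold the nine rows of Table 5.3, `crew_size` would be 3, and `prices` would hold the sale price of each gear type. An integer programming solver scales far better than enumeration and is the natural choice for larger departments.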

To date, the department has not used any decision criteria for its OA. Positions have

traditionally been filled on a random basis where operators are transferred to the

department and after the end of their training period, they are assigned to any machine not

fully staffed for all three shifts. Based on this random assignment scenario, we use the

expected operator output of our Markov chain approach, expressed in Table 5.3, to

calculate Alpha’s quarterly revenue. As an example, operators 1, 2, and 3 are assigned to

the Driven line. Therefore, in calculating the total revenue for the “Random” assignment,

the values used from Table 5.3 for operators 1, 2, and 3 are 20,926, 17,223, and 14,888. The

values calculated for these three operators for the Drive and the Ring machines are ignored.


We can also use the results of our Markov chain approach in an IP model for an optimal

OA that maximizes Alpha’s revenue. In addition to comparing our optimal IP approach to

the “random” model, we can compare it to a simple skill-ranking model where the DM

assigns the operators to the machines based on their current skill level, without accounting

for learning curves. Under this simple skill-ranking assignment policy, if operator A’s skill

scores are currently higher than operator B’s, operator A gets assigned to machine 1 whose

product has a higher sale price than machine 2. As an example, if operators 1 and 2 have

the skill set {71,66,93} and {71,61,90}, respectively, and given the sale price of $110 and $90

for the Driven and the Ring gear, respectively, operator 1 would get assigned to the Driven

machine and operator 2 to Ring. The learning curves of the two operators are not

considered. Consequently, this may be a sub-optimal decision if operator 2 has a steep

positively-sloped learning curve compared to operator 1’s negatively sloped learning curve

for Social and a flat curve for Analytical. In this case, it may be better to assign operator 2 to

Driven because he would produce more over the entire length of the quarter.

We use the expected production output from our Markov chain model (Table 5.3) to

calculate the quarterly revenue. In obtaining the result of our approach, we solve the model

using the LINGO software. The result of our model is compared to the other assignment

policies (Table 5.4) and the significant additional revenues are evident.

Assignment Policy      Quarterly Revenue ($)   Revenue difference with optimal OA
                                               ($)         (%)
Random                 16,333,880              565,370     3.5
Simple skill ranking   16,197,900              701,352     4.3
Optimal assignment     16,899,250              -           -
Worst case             16,090,980              808,270     5.0

Table 5.4: Comparing Alpha’s revenue under various OA approaches
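The revenue differences in Table 5.4 can be recomputed from the quarterly revenues with a few lines of arithmetic. In this sketch the percentage is taken relative to each policy's own revenue; that choice of base is an assumption, made because it appears consistent with the tabulated percentages. The function name `revenue_gap` is hypothetical.

```python
revenue = {
    "Random": 16_333_880,
    "Simple skill ranking": 16_197_900,
    "Worst case": 16_090_980,
}
optimal = 16_899_250


def revenue_gap(policy_revenue, optimal_revenue):
    """Return (absolute gap, gap as a percentage of the policy's revenue)."""
    gap = optimal_revenue - policy_revenue
    return gap, 100.0 * gap / policy_revenue


for policy, value in revenue.items():
    gap, pct = revenue_gap(value, optimal)
    print(f"{policy}: {gap:,} ({pct:.1f}%)")
```

Small differences from the tabulated dollar amounts can arise from rounding in the reported revenues.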


We re-run our Markov chain model, this time ignoring the effect of learning. Skill scores

are treated as a flat value across the quarter. The operator assignments are different than

when learning is considered. System revenue is calculated to be $16,532,620; this is

$366,630, or 2.2% less revenue over the quarter. This is a relatively large difference in

revenue and points to better model results when we are able to include more information

in the model. The additional information included in the model in this case is the learning

effects.

A further use of our model is to determine the effect of providing additional

training to an operator prior to allowing him to work independently on a machine. We

perform this sensitivity analysis by repeating the expected uptime calculation for the lowest

skilled operator, starting the planning horizon with an analytical skill score that is 10%

higher. This results in additional quarterly revenue of up to $54,158, or as low as $8,400,

depending on which machine the operator is assigned to. This information can be used as a

cost-benefit analysis tool on providing machine-specific training to the operators.

5.3.1. Model Validation

Among the many applications of our model in Chapter 5, the particular one we have

discussed in this case study is optimal operator assignment. This is the same goal of the

model discussed in Chapter 4 and since both models are applied to the same data set, it

would be interesting to compare the results of the two approaches. Table 5.5 presents the

results of the two approaches for various operator assignments. The actual operator

assignments for the Random, Simple, and Optimal policies are the same for the two


approaches. The assignments are different for the worst case scenario. But regardless of

the actual operator assignments, applying the two approaches to the same data set

provides very similar revenue forecasts and this is promising. This comparison of one

approach to the other serves to validate each approach.

Assignment Policy      Chapter 5: Quarterly Revenue ($)   Chapter 4: Quarterly Revenue ($)   Difference between results of two approaches
Random                 16,333,880                         16,232,450                         0.6%
Simple skill ranking   16,197,900                         16,743,970                         -3.4%
Optimal assignment     16,899,250                         17,027,400                         -0.8%
Worst case             16,090,980                         16,077,900                         0.1%

Table 5.5: Comparing obtained revenue under approaches of Chapters 4 and 5

Upon going through the empirical work presented in Chapters 4 and 5, one may wonder

about the usefulness of Chapter 4 when optimal operator assignment can be achieved with

Chapter 5’s Markov chain model as well. The approach discussed in Chapter 4 uses

deterministic models and as such, it is less complex. In comparison, the Markov chain model

in Chapter 5 is quite complex, with the calculation of the probability of failure at every

interval along the planning horizon. It takes a regular computer about 10 hours to produce

the result of the case study presented in Chapter 5. In the end, both approaches provide

the DM with the optimal operator assignment; but the usage of one approach over the

other is a trade-off between accuracy and simplicity.

For the purposes of validating our model in this chapter, we can also take on a “data-

split” approach. Previously, we had looked at the entire data set covering January 1 to

October 9. The PHM, the learning curve equations, and the regression equations were


obtained from the data covering this period. For the data split exercise, we consider the

period between January 1 and August 15. Using this shorter period, we obtain new PHM

and regression equations and use them to re-run the Markov chain model. We can then

compare the results with actual production results for the period between August 18 and

October 9, which is not used in model development.

Using the data for the period January 1 to August 15, the following new PHM and

regression equations are obtained. In order to keep the model simple, interaction terms are

not considered.

,

where : Social, : Analytical, : , : , : , : , and : .

,

,

,

where : experience, : social, and : analytical. As can be expected, the models based

on seven months of data are slightly different than the equations obtained using the entire

nine months of the data set (with no interaction terms considered), which are the following:

,


where : Social, : Analytical, : , : , : , : , and : .

,

,

,

where : experience, : social, and : analytical.

We do not randomize the data and consider all but the last eight weeks of the data,

equivalent to about 75% of the total data set. We still want to consider the effect of

learning and, as a result, we cannot randomize the data by date. Furthermore, since the skill

assessments occur in January, April and July, the learning curve equations do not need to be

re-established.
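The date-ordered split described above, as opposed to a randomized one, can be sketched as follows. The function name `chronological_split` and the record layout are hypothetical, and the year in the example is a placeholder because the text does not state one.

```python
from datetime import date


def chronological_split(records, cutoff):
    """Split (day, value) records into fitting and holdout sets by date.

    Records on or before the cutoff are used for fitting and later records
    are held out. Time order is preserved, so learning effects remain visible.
    """
    ordered = sorted(records, key=lambda r: r[0])
    fit = [r for r in ordered if r[0] <= cutoff]
    holdout = [r for r in ordered if r[0] > cutoff]
    return fit, holdout


# Hypothetical records; the year is a placeholder, since the source gives none.
records = [(date(2012, 8, 18), 5), (date(2012, 1, 1), 3), (date(2012, 8, 15), 4)]
fit, holdout = chronological_split(records, cutoff=date(2012, 8, 15))
print(len(fit), len(holdout))
```

Shuffling the records before splitting would mix early and late observations of the same operator, which would hide the learning trend that the model is meant to capture.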

Looking at the actual data records, we determine the number of hours worked by each

operator and use the equivalent number of hours obtained from the Markov chain model.

Based on these operator hours, the production output is presented in Table 5.6:

Operator Driven Drive Ring

Model Actual Model Actual Model Actual

1 11,479 11,848

2 9,288 10,094

3 8,623 9,056

4 11,013 11,851

5 11,025 12,440

6 10,302 9,899

7 8,603 8,498

8 9,036 8,911

9 9,037 8,925

Table 5.6: Comparing model results with actual production volumes


During the regular course of production, the operators were only assigned to one

machine and did not work on the others. As an example, an operator working on the Ring

machine worked on the Ring machine for the entire 9-month period. Therefore, when we

compare the Markov chain results with the actual ones, we are left with many blank cells.

But the nine comparison results we have obtained are quite promising. For example, in

comparing Operator 1’s forecasted versus actual production output, the difference is only

3.2%. When we compare the model forecasts with the actual values, we obtain a Pearson $R^2$ of

0.917, significant at the 0.01 level (two-tailed).
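The per-operator comparison can be reproduced from the nine model and actual pairs in Table 5.6. In the sketch below, the difference is expressed as a percentage of the forecast; that base is an assumption, chosen because it reproduces the 3.2% quoted for Operator 1. The function name `percent_difference` is hypothetical.

```python
# Model forecasts and actual outputs for operators 1 to 9, from Table 5.6.
model = [11_479, 9_288, 8_623, 11_013, 11_025, 10_302, 8_603, 9_036, 9_037]
actual = [11_848, 10_094, 9_056, 11_851, 12_440, 9_899, 8_498, 8_911, 8_925]


def percent_difference(forecast, observed):
    """Difference between observed and forecast, as a percentage of the forecast."""
    return 100.0 * (observed - forecast) / forecast


differences = [percent_difference(m, a) for m, a in zip(model, actual)]
for operator, diff in enumerate(differences, start=1):
    print(f"Operator {operator}: {diff:+.1f}%")
```

A positive value means the operator produced more than the model forecast, and a negative value means the model overestimated production.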

5.4. Concluding Remarks and Future Work

We have developed a Markov chain approach to forecast the production output of a

human-machine system, considering human-related factors as well as the learning of the

operators. A planning horizon is considered and discretized; each time interval can have

multiple states, for which a state space is defined. Through the variables defined to

represent the state space, we can uniquely identify each state. The probability of each state

can be calculated from the states immediately before it, regardless of what has occurred

previously. At the end of each time interval, the machine’s status may proceed in one of

two ways. If it is in repair following a failure, it will remain in repair for the duration of the

repair period. If the machine is not in repair, two outcomes are possible: failure and

survival; there is a transition probability associated with each.

We calculate the probability of failure using a proportional hazards model that calculates

the hazard rate based on the machine working age as well as operator-related covariates.


With time, an operator gains experience on the machine and his expertise is likely to

improve, leading to the possible reduction of human error on machine operation. Using the

learning curves for each operator, we calculate the appropriate hazard rate for each level of

cumulative operational periods.

Using a recursive formula and a Markov chain approach, we proceed from one time

interval to the next throughout the entire planning horizon. At the end of the horizon, all

the possible states, along with their probabilities, are used to calculate the expected uptime

over the entire horizon. This quantity, along with the production output for each state, is

used to calculate the expected production output over the entire planning horizon. The

production output for each state is calculated through regression equations, forecasting the

production output in terms of operator skills. Once again, the learning curves are

considered and the appropriate operator skill values are used.

Our work can have several applications. To demonstrate our model and one of its

possible applications, we have discussed the case study of operator assignment

optimization in a manufacturing organization. We calculate the expected output of each

operator on every machine and use these quantities as input in the objective function of a

linear programming model. We optimize this assignment problem to maximize system

revenue. This maximized revenue is compared to revenue obtained based on the current

random assignment practiced in the company as well as an assignment solely based on the

current level of skill, disregarding operator learning.


Other than the general future work to consider additional HR factors, a possible future

work stemming from the work in this chapter is to consider manufacturing environments

where there are different machine types. This can lead to an additional complexity of

operators having different learning curves on different machines. As the operators work on

different machines and learn new tasks, there may exist a “forgetting” effect for the tasks

they have learned on the machines they have worked on previously. Operator learning and

forgetting is a concept that has been studied in the literature and it will be interesting for us

to consider it within our context.

Another future work is to relax the assumption of a fixed duration for all repairs and

allow random repair durations, drawn from a repair-time distribution. In

doing so, we introduce greater complexity in the model but make the model more realistic.

Alternatively, we can stay with the fixed duration for a repair, but consider several repair

scenarios. We can select the top five most common failure modes, have an MTTR for each,

add one more dimension to the Markov chain state for the type of failure, and implement

the same type of analysis described in this chapter. Similar to the other aforementioned

future works, the consideration of several failure modes will yield a more realistic analysis

scenario, resulting in a more accurate estimate of production output.

Similar to Chapter 4, another important future work can be determination of the length

of the planning horizon to consider. Once again, in this chapter, we leave the DM to

determine the length of the planning horizon. In the empirical study, the duration is taken

to be three months, aligned with the quarterly operator assessment cycles. One can work


on providing a systematic tool for choosing an optimal length for the planning horizon

based on machine performance or operator learning.

An additional interesting future work would be to loosen the assumption that operator

learning only takes place while the machine is running. It would be interesting to consider

an environment where operator-driven reliability is practiced and that the operators are

involved in maintenance work. In such a scenario, the operators may gain expertise in

machine operation during the up periods and learn troubleshooting skills during the down

periods. This consideration results in operators gaining various kinds of expertise at all

times when assigned to a machine. The outcome of this scenario can be compared to the

scenario we have considered in this chapter where learning only occurs while the machine

is up.


6. CONCLUSION

In this concluding chapter, the central ideas and contributions of the dissertation are

brought together to present a single body. We then proceed to provide the reader with a

recap of the discussions.

6.1. Central Ideas and Contributions

In the introductory chapter, we presented the following figure as the general outline of

the discussions in this dissertation (Figure 1.1).

Our aim has been the analysis and improvement of the performance of human-machine

systems. The central idea of our work is to: 1) develop mathematical models to determine

what, if any, effect exists between various human-related factors and system performance;

and 2) use the models as a systematic tool to make the optimal decisions, given the

decision criteria, such as revenue or system availability.

Figure 1.1: General framework of topics in the dissertation (diagram omitted; its elements are: Human-related Factors; Quantification; Analysis of failure risk: short term intervention; Optimal operator assignment; Long term planning; Production forecasting: risk of failure and operator learning; Performance of human-machine systems)


Our first contribution is to present a novel method to quantify the effects of human-related factors on the risk of failure in manufacturing industries. The literature lacks performance measurement models that incorporate human-related risk assessments; we aim to fill this gap. We choose to work with the proportional

hazards model, a common and versatile tool in condition-based maintenance, and include

human-related factors as covariates. Using the PHM in this manner provides us with a

predictive technique for human reliability.

When failures can be caused by operators, the decision-maker must intervene to

mitigate operator-related risk. There can be numerous intervention methods possible; our

next contribution is to develop a revenue model that provides the decision-maker with a

systematic tool to perform a cost-benefit analysis, balancing the advantage of risk

reduction, against the direct cost of the intervention method. As a result, the DM can

choose the revenue-maximizing intervention method for reducing the probability of failure

stemming from the operators.

The cost-benefit analysis is based on a revenue model that uses the expected uptime and the probability of machine failure, given the human-related factors. The revenue model can be used to calculate the failure risk threshold above which positive revenue is not expected. This risk threshold therefore also serves as a profitability boundary. We work backwards from this boundary to determine the minimum levels necessary for the human-related factors. This is another important contribution of our work. One example of its managerial impact is its possible use in the certification process for novice operators before they are released from the training facility and assigned a machine to operate independently.
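The backward calculation from the profitability boundary to a minimum factor level can be sketched as follows. The sketch assumes a simplified revenue expression, (1 - p)R - pC, and a single skill covariate entering an exponential risk term; both are illustrative assumptions, and all function names and numbers are hypothetical.

```python
import math


def risk_threshold(revenue_if_up, loss_if_failed):
    """Failure probability p* at which (1 - p) * R - p * C equals zero."""
    return revenue_if_up / (revenue_if_up + loss_if_failed)


def failure_probability_at(skill, baseline_cum_hazard, gamma_skill):
    """p(skill) = 1 - exp(-H0 * exp(gamma * skill)) for one period."""
    return 1.0 - math.exp(-baseline_cum_hazard * math.exp(gamma_skill * skill))


def minimum_skill(p_star, baseline_cum_hazard, gamma_skill):
    """Skill level at which p(skill) equals p_star (requires gamma < 0)."""
    if gamma_skill >= 0:
        raise ValueError("gamma_skill must be negative: skill should reduce risk")
    if not 0.0 < p_star < 1.0 or baseline_cum_hazard <= 0:
        raise ValueError("need 0 < p_star < 1 and a positive baseline hazard")
    return math.log(-math.log(1.0 - p_star) / baseline_cum_hazard) / gamma_skill


# Hypothetical numbers: revenue 1000 if the machine stays up, loss 4000 on failure.
p_star = risk_threshold(1000.0, 4000.0)
z_min = minimum_skill(p_star, 0.5, -0.4)
```

Under these assumptions, any operator whose assessed skill is at or above `z_min` keeps the failure probability at or below the threshold, which is how such a value could serve as a certification level.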

We then expand our focus on system performance by analyzing machine uptime in

addition to the efforts thus far on downtime reduction. There exists a relationship between

operator-related characteristics and machine performance. As an example, a more skilled

operator is likely to have a higher output on the machine. We acknowledge this relationship

to be dynamic as a result of operator learning. We present a method to forecast the

production output, considering this relationship. This Markov chain-based method

incorporates the previously described contributions, and builds on them to achieve the aim

of forecasting the output of a stochastic system where uptime and downtime are both

dependent on operator factors and learning.
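A minimal sketch of this kind of forecast is given below. It uses a two-state (up/down) Markov chain whose failure probability and output rate may change from one interval to the next, for example as the operator learns. The two-state structure, the fixed repair probability, and all numbers are assumptions made for illustration; they are not the model specification of Chapter 5.

```python
def forecast_expected_output(fail_probs, repair_prob, output_rates, start_up=1.0):
    """Expected total output over the horizon for a two-state (up/down) chain.

    fail_probs[k]   : probability that an up machine fails during interval k
    repair_prob     : probability that a down machine is up again next interval
    output_rates[k] : output produced in interval k if the machine is up
    """
    p_up = start_up
    total = 0.0
    for p_fail, rate in zip(fail_probs, output_rates):
        total += p_up * rate  # output accrues only while the machine is up
        p_up = p_up * (1.0 - p_fail) + (1.0 - p_up) * repair_prob
    return total


# Hypothetical comparison: learning lowers the failure risk and raises the rate.
without_training = forecast_expected_output([0.10] * 5, 0.5, [200.0] * 5)
with_training = forecast_expected_output(
    [0.10, 0.08, 0.06, 0.05, 0.04], 0.5, [200.0, 215.0, 225.0, 232.0, 236.0]
)
```

The difference between the two forecasts is the quantity that would be weighed against the cost of training in a return-on-investment analysis.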

This work can have significant managerial impact because it enables a return-on-investment analysis of training cost versus additional revenue. The additional revenue would come from the greater output of operators whose skill has improved through training. This contribution can add value at the interface of operations management and human resource management.

The best way a company can use the research presented in this dissertation would be to

apply the framework of Chapter 4 first to optimally assign the operators. Once the

operators are optimally assigned to the machines, the company would apply the framework

of Chapter 3 to deal with the HR risk present in the system in the short term. The DM would

then apply the model developed in Chapter 5 to forecast each operator’s output for

planning purposes on an individual basis, such as a cost-benefit analysis of enrolling the operator in a training program, or at the system level, such as the ordering of production material. In whichever order the frameworks are applied, the work presented in this dissertation allows a decision maker to include the role of the human participants, often one of the most important resources of an organization, in decision making to improve system performance.

6.2. Concluding Remarks and Recap

When we are interested in improving the uptime of a system, whether this entails

increasing the output directly, or decreasing the downtime, all contributing aspects of the

system should be analyzed. Most systems function as a combination of human participants

and hardware. As such, it is necessary to include the role of the human participant in any

kind of analysis in our uptime improvement attempt.

The work presented in this thesis has focused on the risk of system failure stemming from the human participants. Human participants can have various roles within the system: they can be managers devising the strategy, schedulers in charge of production plans or material purchasing, or maintenance tradespeople. The only human participants considered in this dissertation are those with a direct role in system failure. Although the ideas in this dissertation apply to different industries, the emphasis has been on the manufacturing industry. This is especially evident in the case study discussed throughout. Accordingly, the typical human participant considered in this dissertation is a machine operator whose errors can result directly in a machine failure.

Despite the multidisciplinary nature of the work presented in this dissertation, the dominant field is still operations research. Therefore, statistical and mathematical models for performance optimization have a major role to play. To incorporate human-related factors into these models, the factors have to be quantified. The quantification process may be a challenge, and Chapter 2 discusses various methods to quantify factors that have traditionally been thought of as subjective, such as skill. A few well-known human reliability techniques, such as THERP and CREAM, are discussed as alternatives before we present our methodology, which is based on the framework of the Critical Incident Technique and uses system experts to assess the skills of the operators. According to the literature, evaluating performance based on expert judgement is common practice. Where experts are knowledgeable and available, the results are reliable.

In Chapter 3, we perform a failure risk analysis for the short term, such as one shift, and discuss intervention methods a decision maker can apply to reduce operator-initiated risk. The risk of failure is calculated using the proportional hazards model. On this basis we build a model that aids the decision maker in selecting the intervention method that is most beneficial to the system. Each intervention method reduces the risk by some degree and has a direct cost associated with it. The model we develop gives the decision maker a way to perform a cost-benefit analysis. The model can also be used to calculate the minimum level of various factors needed to ensure system profitability. In the context of our discussions, the decision maker may use the model to determine the level of skill at which an operator should be “certified” before he/she can start to work independently in the production environment.

Chapter 4 presents an approach for the optimal assignment of operators to machines. In

doing so, there are two main factors we consider. The first is the degree of sensitivity of

each machine to human-related factors. The second is operator learning. The approach is

deterministic and the probability of machine failure is not considered; operators are

assigned based on their current and forecasted characteristics. For each machine,

production output should be forecasted in terms of the factors affecting the operator

working on the machine. Of the various methods that may be available, we choose to use

regression analysis. To capture the effect of learning, we use historical data to build

learning curve equations for the operators. We can then incorporate the regression

equations and the learning curves into a unified model that can forecast the output of each

operator on each machine. The values from these outputs are used in a linear programming

model to solve the assignment problem. The framework we develop is applied to a case

study to show the savings that can be realized if our model is implemented.
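The pipeline described in this paragraph (learning curve, regression forecast, assignment) can be sketched in a few lines. Three simplifications are assumed here and are not taken from Chapter 4: the learning curve is an exponential approach to a maximum skill, the regression is a single-predictor linear equation per machine, and the assignment is solved by brute-force enumeration rather than by a linear programming model, which is only practical for a handful of operators. All names and numbers are hypothetical.

```python
import math
from itertools import permutations


def skill_at(t, s0, s_max, tau):
    """Assumed learning curve: skill rises from s0 toward s_max with time constant tau."""
    return s_max - (s_max - s0) * math.exp(-t / tau)


def regression_output(intercept, slope, skill):
    """Assumed per-machine regression: output = intercept + slope * skill."""
    return intercept + slope * skill


def best_assignment(output):
    """output[i][j] = forecast output of operator i on machine j (square matrix).

    Returns (machine index for each operator, total output), found by
    enumerating every one-to-one assignment.
    """
    n = len(output)
    best = max(
        permutations(range(n)),
        key=lambda perm: sum(output[i][perm[i]] for i in range(n)),
    )
    return list(best), sum(output[i][best[i]] for i in range(n))


# Hypothetical operators (s0, s_max, tau) and machines (intercept, slope).
operators = [(2.0, 5.0, 20.0), (3.5, 5.0, 30.0)]
machines = [(100.0, 10.0), (80.0, 25.0)]
horizon = 10.0
matrix = [
    [regression_output(a, b, skill_at(horizon, *op)) for (a, b) in machines]
    for op in operators
]
assignment, total = best_assignment(matrix)
```

A machine with a larger slope is more sensitive to operator skill, so the enumeration tends to place the more skilled operator there, mirroring the role that machine sensitivity plays in the approach.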

The work presented in Chapter 5 is used to forecast the production output over a planning horizon. The planning horizon considered is long term, and we divide it into small time intervals. The risk of machine failure is analyzed over each interval. As in Chapter 3, the proportional hazards model is used for this failure analysis. The main tool used to develop the framework of the chapter’s model is a Markov chain. The model provides the decision maker with a prediction of the production output of each operator on each machine. The decision maker can use this model directly for planning purposes, or he/she can perform sensitivity analysis to determine what benefits may be gained by making changes to the current state. An example of direct use is optimal operator assignment. An example of sensitivity analysis is forecasting the increase in production output that results from the skill improvement achieved by a training program.

REFERENCES

1. Adamides, E. D., Stamboulis, Y. A., Valeris, A. G. (2004). Model-based Assessment of

Military Aircraft Engine Maintenance Systems. Journal of the Operational Research

Society, 55: 957-967.

2. Argote, L., Insko, C. A., Yovetich, N., Romero, A. A. (1995). Group Learning Curves:

the Effects of Turnover and Task Complexity on Group Performance. Journal of

Applied Social Psychology, 25(6): 512-529.

3. Allen, T. J. (1977). Managing the Flow of Technology. MIT Press: Boston, MA, USA.

4. Ash, R.A., Levine, E.L., (1985). Job Applicant Training and Work Experience

Evaluation: An Empirical Comparison of Four Methods. Journal of Applied

Psychology, 70: 572-576.

5. Baines, T. S., Asch, R., Hadfield, L., Mason, J. P., Fletcher, S., and Kay, J. M. (2005).

Towards a Theoretical Framework for Human Performance Modeling Within

Manufacturing Systems Design. Simulation Modeling Practice and Theory, 13: 486–

504.

6. Barroso, M., and Wilson, J. (1999). HEDOMS – Human Error and Disturbance

Occurrence in Manufacturing Systems: Towards Development of an Analytical

Framework. Human Factors and Ergonomics in Manufacturing, 9 (1): 87–104.

7. Bendig, A.W., (1952a). A Statistical Report on a Revision of the Miami Instructor

Rating Sheet. Journal of Educational Psychology, 43: 423-429.

8. Bendig, A.W., (1952b). The Use of Student Rating Scales in the Evaluation of

Instructions in Introductory Psychology. Journal of Educational Psychology, 43: 167-

175.

9. Bendig, A.W., (1953). Reliability of Self-Ratings as a Function of the Amount of

Verbal Anchoring and of the Number of Categories on The Scale. Journal of Applied

Psychology, 37: 38-41.

10. Bendoly, E., Prietula, M., (2008). In “the zone”: The role of evolving skill and

workload on motivation and realized performance in operational tasks. Journal of

Operations & Production Management, 28 (11-12), 1130–1152.

11. Biskup, D., (2008). A State-of-the-art Review on Scheduling with Learning Effects.

European Journal of Operational Research, 188: 315-329.

12. Blanks, H., (2007). Quality and Reliability into the Next Century. Quality and Reliability

Engineering International, 10(3): 179-184.

13. Blau, F. D., Kahn, L. M. (1996). International Differences in Male Wage Inequality:

Institutions versus Market Forces. Journal of Political Economy, 104(4): 791-837.

14. Bluhm, K. (2001). Exporting or Abandoning the `German Model'?: Labour Policies of

German Manufacturing Firms in Central Europe. European Journal of Industrial

Relations, 7(2): 153-173.

15. Blumenfeld, D. E., and Inman, R. R. (2009). Impact of Absenteeism on Assembly Line

Quality and Throughput. Production and Operations Management, 18 (3): 333-343.

16. Blumenfeld, P. C., Marx, R. W., Soloway, E., Krajcik, J., (1996). Learning with Peers:

From Small Group Cooperation to Collaborative Communities. Educational

Researcher, 25(8): 37-40.

17. Borman, W. C., Dunnette, M. D., (1974). Behavior-based Versus Trait-Oriented

Performance Ratings: An Empirical Study. Journal of Applied Psychology, 60: 561-

565.

18. Bowerman, B. L., O’Connell, R. T., (1990). Linear Statistical Models: An Applied

Approach (2nd edition). Duxbury: Belmont, CA, USA.

19. Brown, A. L., Palincsar, A. S., (1989). Guided, cooperative learning and individual

knowledge acquisition. In L. B. Resnick (Ed.), Knowing, learning, and instruction:

Essays in honor of Robert Glaser. Erlbaum: Hillsdale, NJ, USA.

20. Bryson, A., Forth, J., (2007). Are There Day of the Week Productivity Effects? Centre

for Economic Performance, Manpower Human Resources Lab: Document number

MHRLdp004, London School of Economics.

21. Bubb, H., (2005). Human Reliability: A Key to Improve Quality in Manufacturing.

Human Factors and Ergonomics in Manufacturing, 15: 353-363.

22. Burkolter, D., Kluge, A., Sauer, J., Ritzmann, S., (2009). The Predictive Qualities of

Operator Characteristics for Process Control Performance: the Influence of

Personality and Cognitive Variables. Ergonomics, 52 (3): 302-311.

23. Burnham, K. P., Anderson, D. R., (2004). Multimodel Inference: Understanding AIC

and BIC in Model Selection. Sociological Methods and Research, 33: 261-304.

24. Butterfield, L. D., Borgen, W. A., Amundson, N. E., Maglio, A. T. (2005). Fifty Years of

the Critical Incident Technique: 1954-2004 and Beyond. Qualitative Research, 5: 475-

496.

25. Cacciabue, P. C. (2005). Human Error Risk Management Methodology for Safety

Audit of a Large Railway Organisation. Applied Ergonomics, 36(6): 709-718.

26. Cacciabue, P.C., (2000). Human Factors Impact on Risks Analysis of Complex

Systems. Journal of Hazardous Materials, 71: 101-116.

27. Carr, M. J., Christer, A. H. (2003). Incorporating the Potential for Human Error in

Maintenance Models. Journal of the Operational Research Society, 54(12): 1249-

1253.

28. Castanier, B., Berenguer, C., Grall, A., (2003). A Sequential Condition-based

Repair/Replacement Policy with Non-periodic Inspections for a System Subject to

Continuous Wear. Applied Stochastic Models in Business and Industry. 19 (4), 327-

347.

29. Centrone, D., Kiassat, C., Garetti, M., Banjevic, D., Jardine, A. K. S. (2010).

Proportional Hazards Model: A Valuable Methodology for Sustainable

Manufacturing. Proceedings of Maintenance for Sustainable Manufacturing (M4SM)

conference, Verona, Italy. 51-56.

30. Chang, Y. H., Wang, Y. C., (2010). Significant Human Risk Factors in Aviation

Maintenance Technicians. Safety Science, 48: 54-62.

31. Cohen, J. (1990). Things I have learned (so far). American Psychologist, 45 (12), 1304-

1312.

32. Cohen, J. (1994). The earth is round (p < 0.05). American Psychologist, 49 (12), 997-

1003.

33. Conway, J. M., Huffcutt, A. I. (1997). Psychometric Properties of Multisource

Performance Ratings: A Meta-Analysis of Subordinate, Supervisor, Peer, and Self-

Ratings. Human Performance, 10(4): 331-360.

34. Crowder, M., (2012). Multivariate Survival Analysis and Competing Risks. Taylor &

Francis Group, Boca Raton, FL, USA.

35. Davis, P. J. (2006). Critical Incident Technique: A Learning Intervention for

Organizational Problem Solving. Development and Learning in Organizations, 20(2):

13-16.

36. Davis, D. A., Mazmanian, P. E., Fordis, M., Van Harrison, R., Thorpe, K. E., Perrier, L.,

(2006). Accuracy of Physician Self-assessment Compared With Observed Measures

of Competence. Journal of American Medical Association, 296(9): 1094-1102.

37. Dhillon, B., Liu, Y. (2006). Human Error in Maintenance: A Review. Journal of Quality

in Maintenance Engineering, 12(1): 21-36.

38. Dutton, J. M., Thomson, A., (1984). Treating Progress Functions as a Managerial

Opportunity. The Academy of Management Review, 9(2): 235-247.

39. Elmaraghi, W. H., Nada, O. A., Elmaraghi, H. A., (2008). Quality Prediction for

Reconfigurable Manufacturing Systems via Human Error Modelling, 12(5): 584-598.

40. Embrey, D.E. (1986). SHERPA: A Systematic Human Error Reduction and Prediction

Approach. International Topical Meeting on Advances in Human Factors in Nuclear

Power Plant Systems, Knoxville, Tennessee.

41. Emrouznejad, A., Zerafat Angiz, L. M., Ho, W., (2012). An Alternative Formulation for

the Fuzzy Assignment Problem. Journal of the Operational Research Society. 63(1):

59-63.

42. Feiring, B.R., (1993). A Model Generation Approach to the Personnel Assignment

Problem. Journal of Operational Research Society, 44(5): 503-512.

43. Fiedler, F.E., (1970). Leadership Experience and Leader Performance: Another

Hypothesis Shot To Hell. Organizational Behavior and Human Performance, 5: 1-14.

44. Field, A., (2005). Discovering Statistics Using SPSS (2nd edition). Sage Publications,

London, UK.

45. Flanagan, J., (1954). The Critical Incident Technique. Psychological Bulletin, 51: 327-

358.

46. Fritzsche, R., (2012). Cost Adjustment for Single Item Pooling Models Using a

Dynamic Failure Rate: A Calculation for the Aircraft Industry. Transportation

Research Part E, 48: 1065-1079.

47. Gasmi, S., Love, C. E., Kahle, W., (2003). A General Repair, Proportional-Hazards,

Framework to Model Complex Repairable Systems. IEEE Transactions on Reliability.

52(1), 26-32.

48. Glick, W.H., Jenkins, G.D., Jr., Gupta, N., (1986). Method Versus Substance: How

Strong Are Underlying Relationships between Job Characteristics and Attitudinal

Outcomes? Academy of Management Journal, 29: 441-464.

49. Haas, C.T., Morton, D.P., Tucker, R.L., Gomar, J.E., Terrien, R.K., (2000). Assignment

and Allocation Optimization of Partially Multiskilled Workforce. Center for

Construction in Industry Studies, Report 13.

50. Hancock, P. A., (1986). Stress and adaptability. G.R.J. Hockey, A.W.K. Gaillard, M.G.H.

Coles, eds., Energetics and Human Information Processing. Martinus Nijhoff,

Dordrecht, The Netherlands, 243–251.

51. Hannaman, G. W., Spurgin, A. J., (1984). Systematic Human Action Reliability

Procedure (SHARP). EPRI NP-3583, Project 2170-3, Interim Report, NUS

Corporation, San Diego, CA, US.

52. Harvey, R.J., Billings, R.S., Nilan, K.J., (1985). Confirmatory Factor Analysis of the Job

Diagnostic Survey; Good News and Bad News. Journal of Applied Psychology, 70:

461-468.

53. Hollnagel, E., (1998). Cognitive Reliability Error Analysis Method (CREAM). Elsevier

Science: New York, USA.

54. Horberry, T. J., Burgess-Limerick, R., Steiner, L., (2010). Human Factors for the

Design, Operation, and Maintenance of Mining Equipment. Taylor and Francis

Group: Boca Raton, FL, USA.

55. Hsie, M., Hsiao, W., Cheng, T., and Chen, H., (2009). A Model Used in Creating a

Work-Rest Schedule for Laborers. Automation in Construction, 18(6): 762–769.

56. Hunter, J.E., Hunter, R.E., (1984). Validity and Utility of Alternative Predictors of Job

Performance. Psychological Bulletin, 96: 72-98.

57. Iakovou, E., Ip, C. M., Koulamas, C., (1999). Throughput-dependent Periodic

Maintenance Policies for General Production Units. Annals of Operations Research,

91: 41-47.

58. Jardine, A. K. S., Ralston, P., Reid, N., Stafford, J., (1989). Proportional Hazards

Analysis of Diesel Engine Failure Data. Quality and Reliability Engineering

International, 5(3): 207-216.

59. Jardine, A. K. S., Banjevic, D., (2005). Interpretation of Inspection Data Emanating

from Equipment Condition Monitoring Tools: Method and Software, in

Mathematical and Statistical Methods in Reliability, Armijo, Y. M, (Ed) World

Scientific Publishing Company: Singapore.

60. Jardine, A. K. S., Buzacott, J. A., (1985). Equipment Reliability and Maintenance.

European Journal of Operational Research, 19 (3): 285-296.

61. Jardine, A. K. S., Banjevic, D., Makis, V., (1997). Optimal Replacement Policy and

Structure of Software for Condition-Based Maintenance. Journal of Quality in

Maintenance Engineering, 3: 109-119.

62. Jenkins, G.D., Taber, T.A., (1977). A Monte Carlo Study of Factors Affecting Three

Indices of Composite Scale Reliability. Journal of Applied Psychology, 62: 392-398.

63. Kao, C., Lee, H.T., (1996). An Integration Model for Manpower Forecasting. Journal

of Forecasting, 15: 543-548.

64. Karaulova, T., Pribytkova, M., (2009). Reliability Prediction for Man-Machine

Production Lines. DAAAM International Scientific Book. DAAAM International

Vienna, 487-500.

65. Kariuki, S. G., Loewe, K., (2007). Integrating Human Factors into Process Hazard

Analysis. Reliability Engineering and System Safety, 92: 1764–1773.

66. Kiassat, C., Safaei, N., (2009). Integrating Human Reliability Analysis into a

Comprehensive Maintenance Optimization Strategy. Proceedings of the World

Congress on Engineering Asset Management, Athens, Greece. 561-566.

67. Kim, J. W., Jung, W., (2003). A Taxonomy of Performance Influencing Factors for

Human Reliability Analysis of Emergency Tasks. Journal of Loss Prevention in the

Process Industries, 16, 479-495.

68. Kim, M. C., Seong, P. H., Hollnagel, E., (2004). A Probabilistic Approach for

Determining the Control Mode in CREAM. Reliability Engineering and System Safety,

91, 191-199.

69. Knauth, P. (1996). Designing Better Shift Systems. Applied Ergonomics, 27(1): 39–44.

70. Kolarik, W. J., Woldstad, J. C., Lu, S., (2004). Human Performance Reliability: On-line

Assessment Using Fuzzy Logic. IIE Transactions, 36(5): 457-467.

71. Kostreva, M., McNelis, E., Clemens, E. (2002). Using a Circadian Rhythms Model to

Evaluate Shift Schedules. Ergonomics, 45(11), 739–763.

72. Lam, T. C. M., Klockars, A. J. (1982). Anchor Point Effects on the Equivalence of

Questionnaire Items. Journal of Educational Measurement, 19(4): 317-322.

73. Lamond, N., Dorian, J., Burgess, H. J., Holmes, A. L., Roach, G. D., McCulloch, K.,

Fletcher, A., Dawson, D., (2004). Adaptation of Performance During a Week of

Simulated Night Work. Ergonomics, 47(2): 154-165.

74. Landy, F.J., Trumbo, D.A., (1975). Psychology of Work Behavior. Dorsey Press:

Homewood, IL, USA.

75. Leopairote, K. (2003). Policies for Multi-Skilled Worker Selection, Assignment, and

Scheduling. Doctoral Dissertation, University of Wisconsin-Madison.

76. Levine, E.L., Ash, R.A., Bennett, N., (1980). Evaluation and Use of Four Job Analysis

Methods for Personnel Selection. Journal of Applied Psychology, 65: 524-535.

77. Levine, E. L., Ash, R. A., Hall, H., Sistrunk, F., (1983). Evaluation of Job Analysis

Methods by Experienced Job Analysts. Academy of Management Journal, 26(2): 339-

348.

78. Li, C-L., Cheng, T. C. E., (1994). An Economic Production Quantity Model with

Learning and Forgetting Considerations. Production and Operations Management

3(2): 118-132.

79. Li, N., Li, X. L., (2000). Modeling Staff Flexibility: a Case of China. European Journal of

Operational Research, 124(2): 255-266.

80. Lissitz, R.W., Green, S.B., (1975). Effect of the Number of Scale Points on Reliability:

A Monte Carlo Approach. Journal of Applied Psychology, 60: 10-13.

81. Lugtigheid, D., Banjevic, D., Jardine, A. K. S. (2004). Modelling Repairable System

Reliability with Explanatory Variables and Repair and Maintenance Actions, Journal

of Management Mathematics, 15(2): 89-110.

82. Malhotra, M. K., Fry, T. D., Kher, H. V., Donohue, J. M., (1993). The Impact of

Learning and Labor Attrition on Worker Flexibility in Dual Resource Constrained Job

Shops. Decision Sciences, 24(3): 641-663.

83. Nembhard, D. A., Norman, B. A., (2002). Worker Efficiency and Responsiveness in

Cross-Trained Teams. Technical Report 02-02, Department of Industrial Engineering,

University of Pittsburgh.

84. Onkham, W., Karwowski, W., Ahram, T. Z., (2012). Economics of Human

Performance and Systems Total Ownership Cost. Work, 41: 2781-2788.

85. Ornstein, M., (1998). Questionnaire Design. Current Sociology, 46(4): 7-47.

86. Parker, S. K., Wall, T. D., Cordery, J. L., (2001). Future Work Design Research and

Practice: Towards an Elaborated Model of Work Design. Journal of Occupational and

Organizational Psychology, 74: 413–440.

87. Peng, Y., Dong, M., (2011). A Prognosis Method Using Age-Dependent Hidden Semi-

Markov Model for Equipment Health Prediction. Mechanical Systems and Signal

Processing, 25(1): 237-252.

88. Philipose, S., (1993). R&D Manpower Forecasting For Chemical Industries in India.

IEEE Transactions on Engineering Management, 40: 187-191.

89. Pillay, A., Wang, J., (2003). Modified Failure Mode and Effects Analysis Using

Approximate Reasoning. Reliability Engineering and System Safety, 79: 69-85.

90. Podsakoff, P.M., Organ, D.W., (1986). Self-reports in Organizational Research:

Problems and Prospects. Journal of Management, 12(4): 531-544.

91. Quinones, M.A., Ford, J.K., Teachout, M.S., (1995). The Relationship between Work

Experience and Job Performance: a Conceptual and Meta-Analytic Review. Personnel

Psychology, 48: 887-910.

92. Rasmussen, J., (1982). Human Errors: a Taxonomy for Describing Human

Malfunction In Industrial Installations, Journal of Occupational Accidents, 4: 311-333.

93. Rausand, M., Hoyland, A., (2004). System Reliability Theory, Models, Statistical

Methods, and Applications (2nd Edition). Wiley: Hoboken, New Jersey, USA.

94. Reason, J.T., (1987). Generic Error Modelling System: A Cognitive Framework for

Locating Common Human Error Forms. In: J. Rasmussen et al. (Eds.), New

Technology and Human Error. Wiley: Chichester, UK.

95. Reer, B., (1994). A Probabilistic Method for Analyzing the Reliability Effect of Time

and Organizational Factors. European Journal of Operational Research, 75(3): 521-

539.

96. Seyed-Hosseini, S. M., Safaei, N., Asgharpour, M. J., (2006). Reprioritization of

Failures in a System Failure Mode and Effects Analysis by Decision Making Trial and

Evaluation Laboratory Technique. Reliability Engineering and System Safety, 91: 872-

881.

97. Singh, J., Fleming, L. (2010). Lone Inventors as Sources of Technological

Breakthroughs: Myth or Reality? Management Science, 56(1): 41-56.

98. Sisson, G. R., (2001). Hands-on Training: A Simple and Effective Method for On-the-

job Training. Berrett-Koehler Publishers: San Francisco, CA, USA.

99. Soller, A. L. (2001). Supporting Social Interaction in an Intelligent Collaborative

Learning System. International Journal of Artificial Intelligence in Education, 12 (1):

40-62.

100. Stanton, N. A., Salmon, P. M., Walker, G. H., Baber, C., Jenkins, D. P., (2005). Human

Factors Methods: A Practical Guide for Engineering and Design. Ashgate: Aldershot,

UK.

101. Stewart, D. M., Grout, J. R. (2001). The Human Side of Mistake-proofing. Production

and Operations Management, 10(4): 440-459.

102. Swain, A., Guttmann, H., (1983). Handbook on Human Reliability Analysis with

Emphasis on Nuclear Power Plant Application. NUREG/CR-1278, US Nuclear

Regulatory Commission.

103. Swanson, R. A., Sawzin, S. A., (1975). Industrial Training Research Project. Bowling

Green State University. Bowling Green, OH.

104. Teyarachakul, S., Chand, S., Ward, J., (2011). Effect of Learning and Forgetting on

Batch Sizes. Production and Operations Management, 20(1): 116-128.

105. Venezia, I., (1985). On the Statistical Origins of the Learning Curve. European Journal

of Operational Research, 19: 191-200.

106. Vidic, N., (2008). Developing Methods to Solve the Workforce Assignment Problem

Considering Worker Heterogeneity and Learning and Forgetting. Doctoral

Dissertation, University of Pittsburgh.

107. Vlok, P. J., Coetzee J. L., Banjevic, D., Jardine, A. K. S., Makis, V., (2002). Optimal

Component Replacement Decisions Using Vibration Monitoring and the PHM.

Journal of the Operational Research Society, 53: 193-202.

108. Vrignat, P., Avila, M., Duculty, F., Kratz, F., (2012). Maintenance Policy: Degradation

Laws versus Hidden Markov Model Availability Indicator. Proceedings of the

Institution of Mechanical Engineers, Part O: Journal of Risk and Reliability. 226 (2),

137-155.

109. Wagner, E.E., Hoover, T.O., (1974). The Influence of Technical Knowledge on Position

Error in Rankings. Journal of Applied Psychology, 59: 406-407.

110. Wang, J., (2005). A Review of Operations Research Applications in Workforce

Planning and Potential Modeling of Military Training. Australian Government

Department of Defense, Systems Sciences Laboratory, DSTO-TR-1688.

111. Webb, G. K., (1994). Integrated Circuit (IC) Pricing. High Technology Management

Research, 5: 247-260.

112. Wickens, C. D., Lee, J., Liu, Y. D., Gordon-Becker, S., (2004). An Introduction to

Human Factors Engineering (2nd edition). Prentice Hall: Upper Saddle River, NJ, USA.

113. Williams, R. (2004). An Introduction to the UK Time Use Survey from A Labour

Market Perspective. Labour Market Trends, February: 63-70.

114. Woodman, R. W., Sawyer, J. E., Griffin, R. W., (1993). Toward a Theory of

Organizational Creativity. Academy of Management Review, 18(2): 293-321.

115. Yang, T., Lee, R-S., Hsieh, C., (2003). Solving a Process Engineer’s Manpower-

planning Problem Using Analytic Hierarchy Process. Production Planning & Control,

14(3): 266-272.

116. Yelle, L. E., (1979). The Learning Curve: Historical Review and Comprehensive Survey.

Decision Sciences, 10(2): 302-328.

117. Zeng, X., Wong, W-K., Leung, S.Y-S., (2011). An Operator Allocation Optimization

Model for Balancing Control of the Hybrid Assembly Lines Using Pareto Utility

Discrete Differential Evolution Algorithm. Computers and Operations Research. 39:

1145-1159.

118. Zimolong, B., Trimpop, R., (1994). Managing Human Reliability in Advanced

Manufacturing Systems, in Design of Work and Development of Personnel in

Advanced Manufacturing, Salvendy, G., and Karwowski, W., (eds.) John Wiley &

Sons: New York, NY, USA.

119. Zuashkiani, A., Banjevic, D., Jardine, A. K. S., (2009). Estimating Parameters of

Proportional Hazards Model Based on Expert Knowledge and Statistical Data. Journal

of the Operational Research Society, 60(12): 1621-1636.

APPENDIX A1: Observational Study

An expert observes an operator during the performance of routine duties on a Kappa

machine. The expert fills out this questionnaire for the operator’s operating and

troubleshooting of the machine.

Subject observed:

Product line assignment of subject:

Date:

Shift:

Category 1: Tool Change

Procedure followed:

1. Incorrect 2. Needs help and asks for it 3. Acceptable 4. Almost perfect 5. Flawless

Time taken to perform work:

1. Very slow (over 2 hours) 2. Slow 3. Average 4. Quick 5. Very quick (less than 45 minutes)

Machine set-up after tool change:

1. Many scrap pieces made (over 10) 2. More than average number of scrap pieces made 3. Less than average number of scrap pieces made 4. Hardly any scrap pieces made (less than 3)

Severity of worst error committed:

1. Will cause a major issue or delay 2. Will cause a minor issue or delay 3. May affect machine or cause a delay 4. Will likely go unnoticed

Category 2: Product Measurement

Perform proper measurement procedure:

1. Incorrect 2. Needs help and asks for it 3. Acceptable 4. Almost perfect 5. Flawless

React by entering proper machine axis correction:

1. Incorrect 2. Needs help and asks for it 3. Acceptable 4. Almost perfect 5. Flawless

Recognize the need for out of process measurement:

1. Did not notice 2. Noticed late 3. Noticed right away

Severity of worst error committed:

0. Will cause a major issue or delay 1. Will cause a minor issue or delay 2. May affect machine or cause a delay 3. Will likely go unnoticed

Category 3: Troubleshooting

Dealing with machine software issues:

1. Could not resolve in a reasonable period, but did not ask for help 2. Could not resolve in a reasonable period and asked for help 3. Resolved with difficulty 4. Resolved right away

Dealing with machine hardware issues:

1. Could not resolve in a reasonable period, but did not ask for help 2. Could not resolve in a reasonable period and asked for help 3. Resolved with difficulty 4. Resolved right away


Dealing with product visual inspection issues:

1. Could not resolve in a reasonable period, but did not ask for help 2. Could not resolve in a reasonable period and asked for help 3. Resolved with difficulty 4. Resolved right away

Dealing with product measurement values jumping unexpectedly:

1. Could not resolve in a reasonable period, but did not ask for help 2. Could not resolve in a reasonable period and asked for help 3. Resolved with difficulty 4. Resolved right away


APPENDIX A2: System Experts Assessing Operators

An expert fills out this questionnaire for the main purpose of assessing the operation and

troubleshooting of an operator working on the Kappa machines. There are also some

questions on the expert’s assessment of the operator’s social interaction skills.

Operator name:

Product line assignment of operator:

Rate the operator on the following:

Section 1: Analytical Skills

1. Expected production per shift a) Very poor (<150 gears per shift) b) Poor c) Acceptable d) Good e) Very good (>300 gears per shift)

2. Expected downtime due to tool change a) Very poor (> 2 hours) b) Poor c) Acceptable d) Good e) Very good (< 45 minutes)

3. Expected scrap gears made during set-up a) Very poor (>10 gears) b) Poor c) Acceptable d) Good e) Very good (<3 gears)

4. Troubleshooting abilities a) Very poor (cannot deal with any non-routine situations) b) Poor c) Acceptable d) Good e) Very good (can troubleshoot all operator-related issues)


5. Confidence in catching quality problems

a) Very poor (many instances of catching problems after the fact) b) Poor c) Acceptable d) Good e) Very good (hardly any problems have ever been found after the fact)

6. Confidence in detecting anomalies with machine components

a) Very poor (hardly ever) b) Poor c) Acceptable d) Good e) Very good (almost always)

Section 2: Social Skills

7. There is a machine problem the operator cannot resolve. How likely is he to ask a colleague for the solution rather than restarting the machine to eliminate the problem?

0 (Not likely at all) … 1 … 2 … 3 … 4 (Almost certainly)

8. There is a machine problem the operator cannot resolve. How likely is he to ask a supervisor or an engineer for the solution rather than restarting the machine to eliminate the problem?

0 (Not likely at all) … 1 … 2 … 3 … 4 (Almost certainly)

9. How eager will the operator be in shadowing another operator during a complete tool change cycle, knowing he may learn a new technique?

0 (Not comfortable at all) … 1 … 2 … 3 … 4 (Completely eager)

10. How likely is the operator to share a newly learned technique with colleagues, if not directly asked by them?

0 (Not likely at all) … 1 … 2 … 3 … 4 (Almost certainly)


APPENDIX A3: Self-assessment Questionnaire

Operators fill out this questionnaire for the main purpose of assessing their technical

knowledge on the operation and troubleshooting of the Kappa machines. There are also

some questions on their general experience level and social interaction skills.

Operator name:

Product line assignment of operator:

Section 1: Gear manufacturing experience level

1. When you first joined the department, what was your level of experience with gear manufacturing? a) No machining experience. Assembly line experience only b) Machining experience, but not gears. Engines and engine components only c) Some gear machining experience, less than one year d) Some gear machining experience, more than one year

Section 2: Interaction

1. There is a machine problem you cannot resolve. How likely are you to ask a colleague for the solution rather than restarting the machine to eliminate the problem?

0 (Not likely at all) … 1 … 2 … 3 … 4 (Almost certainly)

2. There is a machine problem you cannot resolve. How likely are you to ask a supervisor/engineer for the solution rather than restarting the machine to eliminate the problem?

0 (Not likely at all) … 1 … 2 … 3 … 4 (Almost certainly)

3. How eager will you be in shadowing another operator during a complete tool change cycle, knowing you may learn a new technique?

0 (Not comfortable at all) … 1 … 2 … 3 … 4 (Completely eager)

4. How likely are you to share a newly learned technique with colleagues, if not directly asked by them?

0 (Not likely at all) … 1 … 2 … 3 … 4 (Almost certainly)


Section 3: Troubleshooting

1. Looking at the product measurement chart, you notice the minimum chamfer lines are too steep. What do you do? a) Re-dress the cutting tool b) Enter a stock-division correction c) Change the tool d) Enter a Lead angle correction

2. During your visual inspection of the last piece made, you notice a corner of the gear tooth face is not cleaned up. What do you do? a) Check for run-out as a first step to ensure good incoming parts. b) Enter a Lead angle correction c) Enter an Involute angle correction d) Enter a stock-division correction

3. The top few millimeters of the gear tooth has a different finish to it than the rest of the surface. What do you check? a) Lead angle in the machine is excessive b) Involute angle in the machine is excessive c) Z-axis adjustment is excessive d) Part has a Stock-division problem from a previous process

4. During the cutting cycle, the machine makes a loud humming sound. What is the likely cause? a) The Z-axis most likely needs to be replaced b) The W axis most likely needs to be replaced c) The parts may be on the large side of the tolerance specification d) The parts may be on the small side of the tolerance specification

Section 4: Tool Change

1. How do you get the count for cutting cycles of the tool? a) Look at the last count on the tool sheet and estimate the production run since b) Go to the “Run in warm-up” menu c) Go to the “Cutting cycle” menu d) Go to the “Axis adjustment” menu

2. What is the first step after physically installing the new Dressing wheel? a) Input the dresser specific information b) Dress the cutting wheel once c) Put away the old Dressing wheel d) Initiate a warm-up cycle

3. What is the maximum number of cycles on a cutting tool before it gets dedicated to either model of car or truck? a) 50 b) 150 c) 250 d) 500


4. During visual inspection of the cutting tool, what do you look for? a) Diamond coating off the tool surface on all teeth b) Diamond coating off the tool surface on at least 3 teeth in a row c) Diamond coating off the tool surface on at least 5 teeth in different parts of the tool d) Broken or chipped teeth on the tool

Section 5: Product Measurement

1. What test do you select on the measurement machine to measure the pulse signature left by the tool vibrations? a) Undulations b) CX Bias c) KX Bias d) Lead-Involute

2. The Crown value is approaching the upper limit of tolerance. What do you do? a) Change the tool b) Put in a direct correction c) Change the Lead angle to affect the Crown d) Change the Involute angle to affect the Crown

3. The last part checked has an asymmetric undercut. Where do you put in a correction? a) Involute angle b) Lead angle c) Stock division d) Crown

4. The Bias reading on the last part checked was in the last 10% of tolerance. What is your next step? a) Change the tool b) Enter an Involute angle correction c) Enter a Lead angle correction d) Enter an additional cutting time


APPENDIX B: Discussion on Logistic Regression as a Validation Tool

We provide further details on the use of another tool, logistic regression, to validate the PHM’s prediction of machine failure risk.

The risk analysis model discussed in Section 3.6.3 used the PHM as a method. The validity of this work can be confirmed by using another tool, logistic regression, to achieve the same goal of predicting the risk of machine failure. Logistic regression is multiple regression in which the dependent variable is a categorical dichotomy (Field, 2005). In the context of the analysis discussed thus far, we can think of this dichotomy as machine failure or no machine failure. The independent variables for the logistic regression can be continuous or categorical. In our context, an example of a continuous variable is the analytical skill score; an example of a categorical variable is the binary variable that represents working on a certain shift. Unlike multiple regression, where the value of the dependent variable Y is predicted from one or several independent variables, Xi, logistic regression predicts the probability of Y occurring given the independent variable(s) Xi. The general form of logistic regression is the following:

$$P(Y) = \frac{1}{1 + \exp[-(b_0 + b_1 X_1 + b_2 X_2 + \dots + b_n X_n + \varepsilon_i)]}$$
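As a minimal sketch of how the general form above turns a set of coefficients into a probability, the following Python function evaluates the logistic expression for one observation. The function name, the argument names, and the example coefficients are hypothetical and are not taken from the case study; the error term is omitted because it is not available when predicting.

```python
import math


def logistic_probability(intercept, coefficients, values):
    """Evaluate 1 / (1 + exp(-(b0 + b1*x1 + ... + bn*xn))) for one observation."""
    if len(coefficients) != len(values):
        raise ValueError("coefficients and values must have the same length")
    # Linear predictor b0 + sum of b_j * x_j (error term omitted).
    linear_term = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-linear_term))


# Hypothetical example: two predictors with made-up coefficients.
example_probability = logistic_probability(-2.0, [0.5, -0.1], [1.0, 3.0])
```

Whatever the linear predictor is, the result always lies strictly between 0 and 1, which is what allows it to be read as a probability of failure in a shift.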

The same data set analyzed by PHM can be analyzed using logistic regression. The dependent variable, Y, is a binary {0,1} variable that takes on the value of 1 when there is a failure in the shift and is zero otherwise. We use the same six independent variables as in the PHM, with the details provided in Table 3.4. Our aim is to compare the probability of failure calculated by two tools, PHM and logistic regression, on the same data set. Ideally, both tools should prompt the DM to take the same action, and this can act as a validation of either tool. For this tool comparison to be meaningful, we have to use the same independent variables. We also have to ensure the units of measure for the two methods are the same. For example, PHM may have 1 hour as its unit of calculation for the hazard rate, whereas the logistic regression calculates the probability of the event (failure) over one shift (consisting of eight hours). In such an example, the probabilities obtained from the PHM are multiplied by 8 for a fair comparison.

The dataset for the Ring gear machine is analyzed by logistic regression using the SPSS software. When we use the same independent variables as in the PHM obtained in Eq. (10) in Chapter 3, we obtain the following logistic regression equation, whose parameter estimates from SPSS are shown in Table B1:

| Variable | Represented | Parameter | Sig. |
|---|---|---|---|
| X1 | Night Shift | 8.811 | 0.032 |
| X2 | Afternoon Shift-Social | 0.007 | 0.139 |
| X3 | NS-Analytical (A) | 0.065 | 0.068 |
| X4 | NS-Social (S) | -0.113 | 0.039 |
| X5 | NS-Experience (E) | -0.080 | 0.041 |
| X6 | Experience-Analytical | 0.000 | 0.330 |
| Constant | - | -2.756 | 0.000 |

Table B1: Summary of Estimated Parameters


Nagelkerke’s R2 is a measure used in logistic regression to assess the correlation between the predicted and observed values of the outcome (Field, 2005); it is found to be 0.026. A very low model fit, as measured by the R2 value, is expected. The ratio of shifts containing a failure event to shifts without one is quite low (about 5%). In addition, we normally expect human-related factors to affect only a small percentage (about 10%) of failures. Furthermore, variable X2 is not found to be significant in the logistic regression model (p-value > 0.1) but is included because it was included in the PHM. For these reasons, a low value of R2 is expected.

The hazard rates have been placed in three categories based on the course of action a DM may take. Risk factors calculated from two situations may be drastically different but lead to the same decision. For example, consider a case where we analyze two scenarios yielding hazard rates of 0.0002 and 0.004; both would lead to the DM ignoring the risk, even though one is 20 times larger in magnitude. Therefore, if we think of three categories of “do nothing”, “monitor”, and “intervene”, the corresponding ranges have been arbitrarily selected to be “< 5%”, “5% ≤ risk < 10%”, and “≥ 10%”, respectively. The category ranges are the same for both models. We are interested in checking whether a low prediction of hazard by the PHM is confirmed by a low probability of failure by logistic regression. The comparison is promising, as a Kendall’s correlation coefficient of 0.615 is found to be significant (p-value < 0.01). The positive sign of this coefficient, as well as its significance, indicates that the two tools are making the same general predictions.
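The three decision categories can be sketched in a few lines of Python. The thresholds (5% and 10%) and the category names come from the text; the function names are hypothetical, and the simple agreement rate shown here is only an illustrative stand-in, not the Kendall’s correlation coefficient used in the analysis.

```python
def risk_category(risk):
    """Map a failure risk (as a fraction, e.g. 0.07 for 7%) to a course of action."""
    if risk < 0.05:
        return "do nothing"
    if risk < 0.10:
        return "monitor"
    return "intervene"


def agreement_rate(phm_risks, logit_risks):
    """Share of cases where both tools fall in the same category (illustrative only)."""
    pairs = list(zip(phm_risks, logit_risks))
    if not pairs:
        raise ValueError("at least one pair of risks is required")
    matches = sum(risk_category(a) == risk_category(b) for a, b in pairs)
    return matches / len(pairs)
```

For instance, the two hazard rates quoted above, 0.0002 and 0.004, both map to “do nothing” even though they differ by a factor of 20.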


In Section 3.6.6.3, we obtained another PHM with the expanded data set, considering all three machines and an additional factor of day-of-the-week. Once again, we compare the failure prediction of this PHM with logistic regression as a secondary tool. We use the same 7 variables and apply the logistic regression model to the same data set. The results of the two models are compared and we obtain a Pearson correlation coefficient of 0.724, which is significant (two-tailed, p < 0.001). Details of the logistic regression model appear in Table B2:

| Variable | Represented | Parameter | Sig. |
|---|---|---|---|
| | Social | -0.015 | 0.051 |
| | Analytical | -0.010 | 0.015 |
| X1 | Afternoon Shift or not | 0.589 | 0.028 |
| X2 | Night Shift or not | 0.791 | 0.003 |
| Y1 | Driven machine or not | -0.462 | 0.053 |
| Y2 | Drive machine or not | -1.031 | 0.000 |
| V1 | 1st day of the week or not | 0.573 | 0.009 |
| Constant | - | -1.906 | 0.001 |

Table B2: Summary of Estimated Parameters

The correlation coefficient implies that about 50% of the variance of one model can be

explained by the other. Field (2005) states that Cohen (1988, 1992) has made “some widely

accepted suggestions about what constitutes a large or small effect” and proceeds to state

r = 0.5 to be a large effect. Therefore, in our case, where r = 0.724, we have a large effect,


indicating a strong correlation between the two approaches. This in turn serves us in our

original purpose of using one approach to validate the other.
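The "about 50%" figure quoted above follows from squaring the correlation coefficient, which gives the share of variance in one set of predictions that is accounted for by the other:

```latex
r^2 = 0.724^2 \approx 0.524
```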


APPENDIX C: Discussion on Obtaining the PHM

We provide further details on the model building process discussed in Section 3.6.7.2. We divide all variables into four groups: (1) Experience and all pairwise interactions involving it, (2) Social and all pairwise interactions involving it, (3) Analytical and all pairwise interactions involving it, and (4) all binary variables, representing shifts, parts of the week, and machines, along with the meaningful interaction terms.

Group 1, the Experience group:

We start out with all 7 variables in the model. The result is shown in Table C1:

| Parameter | Estimate | p-Value |
|---|---|---|
| Scale | 9.947 | - |
| Shape | 1.045 | 0.5277 |
| Experience | -0.0039 | 0.8519 |
| ExpSoc | -0.0012 | 0 |
| ExpAna | 0.00006 | 0.6735 |
| ExpX1 | 0.0333 | 0 |
| ExpX2 | 0.0340 | 0 |
| ExpV1 | 0.0143 | 0.0002 |
| ExpV2 | -0.0092 | 0.1230 |

Table C1: PHM parameter estimation, using all variables related to operator experience level

There are three variables with p-values higher than 0.1. We shall consider the paths leading

from eliminating each one. We start with eliminating Experience, the variable with the

highest p-value. The result is shown in Table C2:


| Parameter | Estimate | p-Value |
|---|---|---|
| Scale | 10.72 | - |
| Shape | 1.045 | 0.5293 |
| ExpSoc | -0.0012 | 0 |
| ExpAna | 0.00004 | 0.703 |
| ExpX1 | 0.0331 | 0 |
| ExpX2 | 0.0336 | 0 |
| ExpV1 | 0.0143 | 0.0002 |
| ExpV2 | -0.0092 | 0.1212 |

Table C2: PHM parameter estimation continued, variable “Experience” is eliminated

We have two variables with p-values larger than 0.1. We shall consider both paths. We first

eliminate the Experience-Analytical interaction variable. The following is the result (Table

C3):

| Parameter | Estimate | p-Value |
|---|---|---|
| Scale | 12.04 | - |
| Shape | 1.044 | 0.5417 |
| ExpSoc | -0.0012 | 0 |
| ExpX1 | 0.0335 | 0 |
| ExpX2 | 0.0340 | 0 |
| ExpV1 | 0.0144 | 0.0001 |
| ExpV2 | -0.0093 | 0.1187 |

Table C3: PHM parameter estimation continued, variable “ExpAna” is eliminated

We are left with only one p-value larger than 0.1 and eliminating it results in Table C4. This

model now consists entirely of variables with p-values significant at the 5% level.


| Parameter | Estimate | p-Value |
|---|---|---|
| Scale | 12.02 | - |
| Shape | 1.034 | 0.6371 |
| ExpSoc | -0.0012 | 0 |
| ExpX1 | 0.0327 | 0 |
| ExpX2 | 0.0333 | 0 |
| ExpV1 | 0.0154 | 0 |

Table C4: PHM parameter estimation, final model for the “Experience” group

We now go back to the model with all 7 variables (Table C1) and, this time, eliminate the

interaction term Experience-Analytical. This variable does not have the highest p-value, but

it is still larger than 0.1 and eliminating it first may result in a different end-model. The

result is shown in Table C5:

| Parameter | Estimate | p-Value |
|---|---|---|
| Scale | 12.15 | - |
| Shape | 1.044 | 0.5411 |
| ExpSoc | -0.0012 | 0 |
| ExpX1 | 0.0334 | 0 |
| ExpX2 | 0.0340 | 0 |
| ExpV1 | 0.0144 | 0.0001 |
| ExpV2 | -0.0093 | 0.1181 |
| Experience | 0.0009 | 0.9598 |

Table C5: PHM parameter estimation continued from Table C1, variable “ExpAna” is eliminated

Next, we can eliminate Experience, but that would result in Table C3, which we have

already obtained. Therefore, we eliminate the interaction term of Experience and V2, the

variable representing the last day of the week. This results in Table C6:


| Parameter | Estimate | p-Value |
|---|---|---|
| Scale | 12.07 | - |
| Shape | 1.034 | 0.6372 |
| ExpSoc | -0.0012 | 0 |
| ExpX1 | 0.0327 | 0 |
| ExpX2 | 0.0333 | 0 |
| ExpV1 | 0.0154 | 0 |
| Experience | 0.0004 | 0.9817 |

Table C6: PHM parameter estimation continued, variable “ExpV2” is eliminated

The only p-value larger than 0.1 is Experience and eliminating it gets us back to the model

achieved above in Table C4.

Once again, we go back to the model represented in Table C1 and this time, the first

variable we eliminate is Experience-V2. The result is shown in Table C7:

| Parameter | Estimate | p-Value |
|---|---|---|
| Scale | 9.509 | - |
| Shape | 1.036 | 0.6179 |
| ExpSoc | -0.0012 | 0 |
| ExpX1 | 0.0326 | 0 |
| ExpX2 | 0.0330 | 0 |
| ExpV1 | 0.0153 | 0 |
| Experience | -0.0053 | 0.8005 |
| ExpAna | 0.00007 | 0.6172 |

Table C7: PHM parameter estimation continued from Table C1, variable “ExpV2” is eliminated

We have two choices of variables to eliminate. We first eliminate Experience and Table C8

shows the result:


| Parameter | Estimate | p-Value |
|---|---|---|
| Scale | 10.52 | - |
| Shape | 1.036 | 0.6206 |
| ExpSoc | -0.0012 | 0 |
| ExpX1 | 0.0324 | 0 |
| ExpX2 | 0.0329 | 0 |
| ExpAna | 0.00004 | 0.6655 |

Table C8: PHM parameter estimation continued, variable “Experience” is eliminated

We can see that Experience-Analytical is the only variable with a p-value larger than 0.1.

Eliminating it results in Table C4 which we have already obtained.

We go back to Table C7 and, this time, we eliminate Experience-Analytical first. This, however, would result in a table identical to Table C6, eventually leading back to the model obtained and shown in Table C4.
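The path-by-path elimination walked through for Group 1 can be sketched as a small search over candidate models. This is only an illustration under stated assumptions: `explore_elimination_paths` and `toy_fit` are hypothetical names, and `toy_fit` simply returns the Table C1 p-values unchanged, whereas the actual analysis refits the PHM (in EXAKT) after every elimination, so the p-values shift from table to table.

```python
def explore_elimination_paths(variables, fit, threshold=0.1):
    """Return every end-model reachable by dropping one non-significant variable at a time."""
    final_models = set()
    visited = set()
    pending = [frozenset(variables)]
    while pending:
        model = pending.pop()
        if model in visited:
            continue  # this intermediate model was already reached via another path
        visited.add(model)
        p_values = fit(model)
        removable = [v for v in model if p_values[v] > threshold]
        if not removable:
            final_models.add(model)  # every remaining variable is significant
            continue
        for variable in removable:
            pending.append(model - {variable})
    return final_models


# Toy stand-in for the real fitting step: p-values copied from Table C1
# and kept fixed; a real refit would update them for each reduced model.
TABLE_C1_P_VALUES = {
    "Experience": 0.8519, "ExpSoc": 0.0, "ExpAna": 0.6735, "ExpX1": 0.0,
    "ExpX2": 0.0, "ExpV1": 0.0002, "ExpV2": 0.1230,
}


def toy_fit(model):
    return {v: TABLE_C1_P_VALUES[v] for v in model}


end_models = explore_elimination_paths(TABLE_C1_P_VALUES, toy_fit)
```

With the fixed toy p-values, every path ends at the same model containing ExpSoc, ExpX1, ExpX2, and ExpV1, which matches the variables of Table C4. Tracking visited models avoids refitting an intermediate model that several paths share.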

Group 2, the Social group:

We followed the same procedure as described for Group 1, to come up with all variables

related to the Social Interaction score of the operator. The result is shown in Table C9,

where all p-values are significant.

| Parameter | Estimate | p-Value |
|---|---|---|
| Scale | 3.496 | - |
| Shape | 1.038 | 0.5909 |
| Social | -0.0422 | 0.0059 |
| SocAna | 0.0003 | 0.0305 |
| SocX1 | 0.0340 | 0 |
| SocX2 | 0.0315 | 0 |
| SocV1 | 0.0124 | 0.0022 |
| ExpSoc | -0.0011 | 0 |

Table C9: PHM parameter estimation, final model for the “Social” group


Group 3, the Analytical group:

This is similar to our description for group 2. All variables related to the Analytical skill are

grouped together and the final model we obtain is shown in Table C10:

| Parameter | Estimate | p-Value |
|---|---|---|
| Scale | 96.69 | - |
| Shape | 1.008 | 0.9157 |
| Analytical | 0.0586 | 0.0242 |
| ExpAna | -0.0008 | 0 |
| AnaX1 | 0.0361 | 0 |
| AnaX2 | 0.0349 | 0 |
| AnaV1 | 0.0141 | 0.0011 |
| SocAna | -0.0007 | 0.0138 |

Table C10: PHM parameter estimation, final model for the “Analytical” group

Group 4, the group with all binary terms:

In group 4, there are 6 main effects considered: two variables, X1 and X2, representing the three shifts; two variables, V1 and V2, representing the three parts of the week; and two variables, Y1 and Y2, representing the three machines under analysis. The pairwise interactions between shift variables and parts of the week are considered, since the effect of the night shift on the first day of the week may differ from that of the first day shift of the week. Interactions with machines, however, do not make intuitive sense and are not considered. A procedure similar to that described for Group 1 is followed, and the final model we obtain is shown in Table C11:


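| Parameter | Estimate | p-Value |
|---|---|---|
| Scale | 343.6 | - |
| Shape | 1.126 | 0.1249 |
| X1 | 1.906 | 0.0004 |
| Y1 | -0.2874 | 0.1669 |
| Y2 | -1.076 | 0 |
| V1 | -3.986 | 0 |
| V2 | 2.374 | 0.0028 |
| X1V1 | 6.145 | 0 |
| X2V1 | 9.127 | 0 |
| X1V2 | -6.429 | 0 |
| X2V2 | -4.06 | 0.0041 |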

Table C11: PHM parameter estimation, final model for the “binary terms” group

Final model:

The variables included in the final models from the four groups, represented by Tables C4, C9, C10, and C11, are all put together to start the model building for a final model. A procedure similar to that described for Group 1 is followed, where all possible paths leading from variables with p-values larger than 0.1 are considered. The final model achieved is shown in Table C12:


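| Parameter | Estimate | p-Value |
|---|---|---|
| Scale | 129.2 | - |
| Shape | 1.107 | 0.1839 |
| X1 | 3.291 | 0 |
| Y1 | -0.7177 | 0.0068 |
| Y2 | -1.375 | 0 |
| V2 | 3.328 | 0.0001 |
| X1V1 | 5.397 | 0 |
| X2V1 | 8.257 | 0 |
| X1V2 | -7.543 | 0 |
| X2V2 | -4.876 | 0.0008 |
| ExpSoc | -0.0052 | 0 |
| SocAna | 0.0044 | 0 |
| SocV1 | -0.0555 | 0 |
| Analytical | -0.2805 | 0 |
| Experience | 0.2954 | 0 |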

Table C12: PHM parameter estimation, final model, with all four groups combined

This model has 13 variables and EXAKT calculates a maximum likelihood estimator of 618.85

for it. Based on these figures, the AIC is calculated as follows:

$$AIC = 2k - 2\ln(L),$$

where

$L$: maximum likelihood estimator for the model

$k$: number of variables in the model

In our case, . This value can be compared to AIC

values of models we obtain from other approaches, in order for us to choose a final model for

our analysis.
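A minimal sketch of such a comparison is given below, assuming the standard definition AIC = 2k - 2 ln(L), where k is the number of variables and ln(L) is the maximized log-likelihood. The function name, the model names, and the log-likelihood values are hypothetical; they are not the figures reported by EXAKT, and the text does not state whether the reported 618.85 is a likelihood or a log-likelihood.

```python
def aic(num_variables, log_likelihood):
    """Standard Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * num_variables - 2 * log_likelihood


# Hypothetical candidates: (number of variables, maximized log-likelihood).
candidate_aic = {
    "model_a": aic(13, -620.0),
    "model_b": aic(10, -625.0),
}

# The model with the smallest AIC is preferred.
preferred_model = min(candidate_aic, key=candidate_aic.get)
```

The penalty term 2k means that adding variables only pays off when the improvement in log-likelihood outweighs the extra parameters.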


APPENDIX D: Goodness of Fit of Regression Models for Predicting Output

In Section 4.4.1, we present a set of regression equations for our case study. This appendix

discusses our statistical analysis for validating the model fit of the regression equations.

To check the validity of the regression models, we perform two steps. First, we look at the R2 value as an indicator to assess the model fit. Values for the Driven, Drive, and Ring lines are 0.740, 0.449, and 0.564, respectively. Factors such as linearity of residuals, homoscedasticity, independence of errors, and the influence of outliers are also assessed for the goodness of model fit. We analyze linearity and homoscedasticity visually using graphs that plot standardized residuals against standardized predicted values. A sample of such a plot for the Driven line is presented in Figure D1. This plot shows what we expect from a good model fit: the points are centered around zero (on the y-axis); there is no particular non-linear relationship between the outcome and the predictor; and there is no change in variance, or spread of the points in the vertical direction, along the x-axis. None of the graphs for any of the three product lines indicates non-linearity or heteroscedasticity.

For each product line, we also look at normal P-P plots of the standardized residuals. A sample of such a plot for the Driven line appears in Figure D2. This graph is in compliance with normality assumptions, as all data points are aligned with the straight line joining (0,0) and (1,1). In terms of indicators of multicollinearity, the Variance Inflation Factor (VIF) values appearing in Table D1 indicate that there are no problems. The largest VIF is smaller than 10, and the average VIF is not substantially greater than 1 (Bowerman and O’Connell, 1990). We use the Durbin-Watson indicator to test for the assumption of independent errors. The values for the Driven, Drive, and Ring models are all close to 2, which, given our sample size and the number of predictors, indicates compliance with the assumption. Finally, we use Cook’s distance to measure the overall influence of individual cases on the model. The maximum Cook’s distances for the three models are all significantly less than 1, indicating there are no influential observations. All the aforementioned checks indicate a good model fit of our regression models.
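For readers unfamiliar with the Durbin-Watson indicator, the sketch below shows its standard definition: the sum of squared differences between consecutive residuals divided by the sum of squared residuals. The function name is hypothetical, and the values in Table D1 come from the statistical software, not from this code.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for an ordered sequence of regression residuals."""
    if len(residuals) < 2:
        raise ValueError("at least two residuals are required")
    # Squared differences between each residual and the one before it.
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator
```

The statistic ranges from 0 to 4. Values near 2 suggest uncorrelated adjacent residuals, values near 0 suggest positive autocorrelation, and values near 4 suggest negative autocorrelation.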

The second step taken is to use data-splitting to cross-validate the model. We randomly select 80% of the data set and develop another regression model which includes the same predictor variables but with variable coefficients estimated from the reduced data set. This second model is then applied to the remaining 20% of the data and the production volumes are predicted. This predicted set is compared to the actual production volumes of the remaining 20% of the data set. The two sets are compared to calculate a Pearson’s correlation coefficient; for the Driven, Drive, and Ring lines, the correlation coefficients are 0.879, 0.680, and 0.773, respectively. We go one step further in the data splitting for additional validation. For the Driven line, we proceed to do a 70-30 and a 60-40 data split. The Pearson’s correlation coefficient is still high, at 0.874 and 0.850 for the 70-30 and 60-40 splits, respectively. The data set is randomized; the parts of the randomized data used for the 70-30 and the 60-40 data splits are different. For the 70-30 split, we take the first 651 rows of the original 930 rows to build our model. We then apply this model to the remaining 279 rows, and compare the set of the predicted values to the actual output values. For the 60-40 split, we take the last 558 rows of the original 930 rows to build our model and apply our model to the remaining 372 rows.
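The data-splitting check can be sketched as follows. This is a simplified illustration: it uses a single predictor and synthetic data, whereas the models in this appendix use several predictors and the actual production records. All function names and the synthetic data are hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def fit_line(xs, ys):
    """Least-squares slope and intercept for a single predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


def holdout_correlation(xs, ys, train_fraction, seed=0):
    """Fit on a random share of the rows, predict the rest, correlate with actuals."""
    order = list(range(len(xs)))
    random.Random(seed).shuffle(order)  # randomize the rows before splitting
    cut = round(train_fraction * len(order))
    train, test = order[:cut], order[cut:]
    slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
    predicted = [slope * xs[i] + intercept for i in test]
    actual = [ys[i] for i in test]
    return pearson(predicted, actual), len(train), len(test)


# Synthetic data with 930 rows, matching only the row count of the Driven line.
noise = random.Random(1)
xs = [float(i) for i in range(930)]
ys = [2.0 * x + 1.0 + noise.gauss(0.0, 50.0) for x in xs]
r_70_30, n_train, n_test = holdout_correlation(xs, ys, 0.7)
```

With 930 rows, a 70-30 split yields 651 training rows and 279 held-out rows, the same counts quoted in the text.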


Figure D1: Scatter Plot of Standardized Residuals for checking assumption of random errors and homoscedasticity

Note on Figure 2: The reason for the “discrete” look of the residuals is that skill scores are for three individuals, assessed three times in the nine-month duration of the study. Each individual’s skill score remains constant for the quarter.

Figure D2: Normal P-P Plot of Standardized Residuals for checking assumption of normality

Y-axis: standardized differences between observed data and values predicted by the model

X-axis: standardized forms of the values predicted by the model


| Line | Largest VIF | Average VIF | Durbin-Watson | Cook’s Distance |
|---|---|---|---|---|
| Driven gear | 1.546 | 1.55 | 2.018 | 0.011 |
| Drive gear | 3.51 | 3.51 | 2.064 | 0.032 |
| Ring gear | 1.359 | 1.359 | 2.008 | 0.036 |

Table D1: Indicators of Multicollinearity, Independence of Errors, and Influential Cases


APPENDIX E: Model Properties

We present some properties of the recursive formula developed in Section 5.2 that can

reduce the calculations required over the planning horizon.

Property 1: At each stage n, where n = i + mD, the probability of any state with a = i and d = 0 can be calculated as follows:

$$p(i, i, 0 \mid n = i + mD) = q_0^{m}\, p(i, i, 0 \mid n = i), \quad m = 1, 2, \ldots$$

In stage n = i + mD, a = i indicates that the machine must have been operational in the last i stages and non-operational in all previous ones. For this scenario, the machine must fail immediately after starting at n = 0, with an initial failure probability q_0, go through the repair period, and fail immediately again. The failures are repeated m times in total before the machine becomes operational for i stages (Figure E1). Failure cannot occur during the last i intervals; otherwise, a cannot equal i. The probability that the machine is operational in the last i stages is p(i, i, 0 | n = i).

1a: stage n = i + mD, with m initial failures and repair periods, followed by i operational intervals

1b: stage n = i, with i operational intervals, no failures

Figure E1: Relationship between p(i, i, 0 | n = i + mD) and p(i, i, 0 | n = i)

Property 2: At any stage , the probability of any state with and can be derived

as follows:

.



Since , the machine is in a repair period at stage . This indicates that in the

previous stage, , , since one less time interval had elapsed on repairing the

machine. This relationship between “ ” and “ ” will go on over all previous stages

until . In this case, , leading to .

Once , there are multiple paths possible from the previous stage. In the

previous stage, the machine may be operational and fail just before , or it may fail

immediately after the completion of a repair period. Therefore, the recursive formula has to

be used for .

Property 3: At any stage , probability of any state with is calculated as:

.

The machine has been running for the last a stages; however, since , there must be

at least one previous failure. Given the total number of intervals n, the number of

operational intervals must be i-a at the point before the occurrence of the last failure

(Figure E2). The value of d must also be zero whether the failure occurred after an

operational period or the completion of a repair. The value of a, however, is bounded

between zero and since . The probability of failure in this scenario is affected

by the expertise gained by the operator, having worked intervals.

Figure E2: Depiction of a possible history of state where


APPENDIX F: Model and Property Generalization with Initial Conditions

Similar to Appendix E, in this Appendix we present some properties of the recursive formula

developed in Section 5.2. The work presented here, however, assumes a set of initial

conditions for the machine.

If we consider initial conditions at to be and , then all probabilities will be

conditional on these initial conditions. As a result, the recursive formula expressed in

Section 5.2 becomes:

\[
p(i,a,d \mid n,i_0,a_0)=
\begin{cases}
p(i,0,d-1 \mid n-1,i_0,a_0), & a=0,\ 0<d\le D-1\\[4pt]
\sum_{b=0}^{i} q(i_0+i,\,b)\,p(i,b,0 \mid n-1,i_0,a_0), & a=d=0,\ \ldots\\[4pt]
q(i_0+n-1,\,a_0+n-1)\,p(n-1,n-1,0 \mid n-1,i_0,a_0), & a=d=0,\ \ldots\\[4pt]
0, & a=d=0,\ \ldots\\[4pt]
\bigl[1-q(i_0+i-1,\,a-1)\bigr]\,p(i-1,a-1,0 \mid n-1,i_0,a_0), & 0<a<n,\ d=0\\[4pt]
\bigl[1-q(i_0+i-1,\,a_0+a-1)\bigr]\,p(i-1,a-1,0 \mid n-1,i_0,a_0), & a=n,\ d=0
\end{cases}
\]

The first line represents the repair scenario. The next three lines represent the special cases

for the failure scenario. The last two lines represent the survival scenario.

Similar to Appendix E, we present properties, conditional on the initial conditions

and , that can aid us in calculating the state probabilities in future stages without the

need to use the one-step recursive function.

Property 1:

.

Based on , we can conclude that the machine has been operational in every time

interval since the start. If the planning horizon is started with operator expertise at and a

machine age of , then at each operational interval, one time unit is added to these two

parameters. The probability of survival for the first interval is , for the second is

, and so on, until the last probability of survival

.

Property 2:

At each stage , , the probability of any state with and can be

calculated as follows:

\[
p(i,i,0 \mid n=i+mD,\ i_0,a_0) = q(i_0,a_0)\,q(i_0,0)^{m-1}\,p(i,i,0 \mid n=i,\ i_0,0), \qquad m=1,2,\ldots
\]

We consider a stage where . In this stage, indicates the machine must be

operational in the last i stages, and have been non-operational at all other previous ones.

For this scenario, the machine fails immediately after starting at , with an initial

failure probability , goes through the repair period, and fails immediately again,

with probability . The subsequent failures are repeated m-1 times before the

machine becomes operational for i stages (Figure F1). Failure cannot occur during the last i

intervals, otherwise, a cannot equal i. The probability that the machine is operational in the

last i stages is because at the start of this interval, expertise still remains

at but working age is reset to zero.
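
The reasoning behind Properties 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration and not part of the original analysis: the failure probability q(i, a) is passed in as a function, and the names survival_run, fail_m_times_then_run, and toy_q, as well as the coefficients inside toy_q, are hypothetical. The sketch assumes, as described above, that expertise and working age each advance by one unit per operational interval and that a repair resets the working age but not the expertise.

```python
from typing import Callable

# q(i, a): probability of failure in the next interval, given operator
# expertise i and machine working age a. Its form is supplied by the caller.
FailureProb = Callable[[int, int], float]


def survival_run(q: FailureProb, i0: int, a0: int, n: int) -> float:
    """Probability of n consecutive operational intervals (Property 1).

    Expertise and working age both advance by one unit per interval.
    """
    prob = 1.0
    for k in range(n):
        prob *= 1.0 - q(i0 + k, a0 + k)
    return prob


def fail_m_times_then_run(q: FailureProb, i0: int, a0: int, m: int, i: int) -> float:
    """Probability of m immediate failures followed by i operational intervals.

    Mirrors Property 2: the first failure uses q(i0, a0); each later failure
    uses q(i0, 0), because repair resets the working age but not the expertise.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    return q(i0, a0) * q(i0, 0) ** (m - 1) * survival_run(q, i0, 0, i)


def toy_q(i: int, a: int) -> float:
    # Hypothetical hazard for illustration only: rises with age, falls with expertise.
    return min(1.0, max(0.0, 0.10 + 0.01 * a - 0.005 * i))
```

With a constant failure probability of 0.1, for example, survival_run over three intervals evaluates 0.9 raised to the third power, and two immediate failures followed by three operational intervals multiply that value by 0.1 twice.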

1a: stage , with m initial failures and repair periods, followed by i operational intervals

1b: stage , with i operational intervals, no failures

Figure F1: Relationship between and

Property 3: At any stage , the probability of any state with can be simply

derived as follows:

.

This property follows directly from the repair scenario of the recursive function. Once

, the state can no longer be traced back to a single state in the previous stage. In the

previous stage, the machine may be operational and fail just before , or it may fail

immediately after the completion of a repair period. Therefore, the recursive formula has to

be used for .

Property 4: At any stage , the probability of any state with can be

calculated as follows:

The machine has been running for the last a stages; however, since , there must be

at least one previous failure. Given the total number of intervals n, the number of

operational intervals must be i-a at the point before the occurrence of the last failure

(Figure F2). The value of d must also be zero whether the failure occurred after an

operational period or the completion of a repair. The value of a, however, is bounded

between zero and since . The probability of failure in this scenario is affected

by the expertise gained by the operator, having worked intervals. The second line

covers the case where exactly one failure occurred at the end of the initial intervals.

All other scenarios are covered by the first line.

Figure F2: A possible depiction of history of state where

Effect of Initial Conditions on Expected Output

In Section 5.2, the expected output is expressed using the assumption that and

, when . In the presence of some previous level of operator learning,

regardless of machine working age, the output formula can be expressed as follows:

, where

For expertise level i, we have:

. If we define , then the expected

total output for the entire horizon is as follows:

.

APPENDIX G: Data Set Used for Empirical Studies

We present a sample of the data set that we have used to perform the case study analyses

in Chapters 3, 4, and 5. The data set has been analyzed using multiple tools, approaches,

and software programs.

We describe below the analyses performed in the different chapters of this dissertation:

There is production data on the four machines: Driven, Drive, Pinion, and Ring. We also

have the personnel record of which operators were on shift at each time and date. We have

the expert assessment reports of the operators’ skills. The production data for the Pinion

machine is incomplete and is therefore ignored. The three remaining machines have a

combined total of 3,049 records, two records per machine for the 9-month duration. The

records include operational periods as well as failures, and there are 119 failures and 11

suspensions, making a total “event” sample size of 130. The assessment data on the 12

operators are complete for the nine months of the assessment period, which was repeated

quarterly.

The production data set described above is used in various ways throughout this

dissertation. A summary of these uses is provided in Table G1. Some of the analyses cover

certain parts only; others consider the entire set. In addition, we use different tools and

approaches to analyze the data. We start the analysis in Chapter 3 by considering the Ring

machine only and the three operators working on it. We use a PHM as a tool to calculate

the hazard rate, considering machine age and a few HR factors. We then seek to validate

the PHM by using another tool, logistic regression, on the same segment of the data set.

The data specific to the Ring machine, as well as the three operators assigned to it, is

analyzed by logistic regression to estimate the probability of machine failure. In the latter

part of Chapter 3, we analyze the entire data set using a PHM again. The three machines are

combined, along with the nine operators who are assigned to these three machines on the

three shifts. Indicator variables are used to distinguish amongst the machines and an

additional variable, day-of-the-week, is considered. This variable was not originally a part of

the data set but could easily be added afterwards based on the date field that was a part of

the production data. The PH modeling is done using the software EXAKT; we set up the data

set accordingly. An example of this data set analyzed by EXAKT is shown in Table G2; the full

data set is presented on a CD in the back cover of the dissertation document.
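
As an illustration of how a day-of-the-week factor can be derived from a date field, the following Python sketch classifies a date as falling at the start, middle, or end of the week. The function name week_part and the grouping of days (Monday and Tuesday as start, Wednesday and Thursday as middle, the remaining days as end) are assumptions made for this example; the grouping used in the empirical study is not restated here.

```python
from datetime import date


def week_part(d: date) -> str:
    """Classify a date as 'start', 'middle', or 'end' of the week.

    The day grouping is a hypothetical choice for illustration.
    """
    wd = d.weekday()  # Monday == 0 ... Sunday == 6
    if wd <= 1:
        return "start"
    if wd <= 3:
        return "middle"
    return "end"
```

A three-level factor of this kind can then be represented by two indicator variables, in the manner of V1 and V2 in the data set.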

Similar to the early part of Chapter 3’s empirical study analysis, the data set is once again

analyzed on a per-machine basis in Chapter 4. The entire nine months for the Driven, Drive,

and Ring gear machines are used. Chapter 3 focuses on the HR factors leading to machine

failures, whereas Chapter 4 focuses on the HR factors that determine the quantity of production

output. The perspective we take in Chapter 3 is that a lower level of skill is likely to lead to a

higher hazard rate; the perspective in Chapter 4 is that a lower level of skill results in a

lower production output. We do not consider the machine failures the operators cause;

only how many units they produce per unit time when the machine is operational.

The tool we use is multiple regression where we predict the production output based on

HR factors. We set up the data set slightly differently and use the SPSS software to perform

the multiple regression analysis; an example is shown in Table G3. Since we do not assume

the skill levels to be static, we use learning curves to represent the dynamic nature of the

skill components over the planning horizon. The learning curve equations are obtained from

the multiple expert assessments for each operator.
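
To make the regression step concrete, the sketch below fits an ordinary least-squares model of shift production on the three skill scores, using the first ten rows of the sample shown in Table G3. It is an illustration only: the analysis in Chapter 4 was carried out in SPSS on the full data set and includes the shift indicators, so the coefficients produced here are not the dissertation's results. The function name ols_fit and the choice to solve the normal equations by Gauss-Jordan elimination are assumptions made to keep the example self-contained.

```python
from typing import List, Sequence


def ols_fit(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    aug = [
        [sum(row[p] * row[c] for row in X) for c in range(k)]
        + [sum(row[p] * target for row, target in zip(X, y))]
        for p in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        if abs(aug[col][col]) < 1e-9:
            raise ValueError("predictors are collinear")
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [aug[r][k] / aug[r][r] for r in range(k)]


# First ten rows of the Table G3 sample:
# (shift production, experience, social, analytical)
SAMPLE = [
    (311, 100.00, 77.08, 78.67),
    (185, 42.86, 70.83, 27.12),
    (286, 100.00, 77.08, 78.67),
    (175, 42.86, 41.15, 23.05),
    (240, 57.14, 42.19, 35.22),
    (307, 100.00, 75.00, 92.33),
    (215, 57.14, 42.19, 35.22),
    (242, 57.14, 67.71, 45.94),
    (258, 57.14, 67.71, 45.94),
    (307, 100.00, 75.00, 92.33),
]

# Design matrix with an intercept column followed by the three skill scores
X = [[1.0, e, s, a] for _, e, s, a in SAMPLE]
y = [float(p) for p, _, _, _ in SAMPLE]

coef = ols_fit(X, y)
fitted = [sum(b * x for b, x in zip(coef, row)) for row in X]
residuals = [obs - fit for obs, fit in zip(y, fitted)]
```

Because the model includes an intercept, the residuals sum to zero up to rounding, which serves as a quick check on the fit.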

To validate the regression equations we obtain, we split the randomized data in various

ways and compare model forecasts of the production output with actual output produced

by the system. This is described in detail in Appendix D. In addition, we use the data set to

obtain a learning curve for each skill component for every operator.

The analysis of the data set in Chapter 5 is similar to the latter part of Chapter 3 where

the data set is considered in its entirety and the additional factor of day-of-the-week is

added. The final PHM obtained in Chapter 3, using the entire data set for all three

machines, is used to calculate the probability of failure at each time interval. The learning

curves obtained in Chapter 4 are used to supply the PHM with the appropriate factor values.

The regression equations in Chapter 4 are used to forecast the production output once we

calculate the expected number of operational intervals. The overall framework of

Chapter 5, however, is a Markov chain analysis. We discretize the planning horizon and use the PHM

and learning curves at each time interval. At the end of the planning horizon, we use the

regression equations to calculate the expected production output over the entire planning

horizon.
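
The flow of this framework can be illustrated with a simplified forward pass over a discretized horizon. The sketch below is not the recursive formula of Section 5.2: it tracks the remaining downtime r instead of the elapsed repair time d, assumes that a failure costs exactly D intervals in total, and takes the failure probability q(i, a) as a caller-supplied function standing in for the PHM and learning curves. The name forward_pass and the state layout are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Tuple

# State: (expertise i, working age a, remaining downtime r)
State = Tuple[int, int, int]


def forward_pass(
    q: Callable[[int, int], float], N: int, D: int, i0: int = 0, a0: int = 0
) -> Tuple[float, Dict[State, float]]:
    """Propagate state probabilities over N intervals.

    Returns the expected number of operational intervals and the final
    state distribution. Assumes D >= 1 intervals are lost per failure.
    """
    if D < 1:
        raise ValueError("D must be at least 1")
    dist: Dict[State, float] = {(i0, a0, 0): 1.0}
    expected_up = 0.0
    for _ in range(N):
        nxt: Dict[State, float] = defaultdict(float)
        for (i, a, r), p in dist.items():
            if r > 0:
                # Still under repair: one more interval of downtime elapses.
                nxt[(i, 0, r - 1)] += p
                continue
            fail = q(i, a)
            # Failure: the current interval is lost, D - 1 more remain.
            nxt[(i, 0, D - 1)] += p * fail
            # Survival: expertise and working age each gain one unit.
            nxt[(i + 1, a + 1, 0)] += p * (1.0 - fail)
            expected_up += p * (1.0 - fail)
        dist = dict(nxt)
    return expected_up, dist
```

The expected number of operational intervals returned by such a pass is the quantity to which the regression equations are then applied to forecast production output.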

In addition to using the data set for developing the various PH models presented in

Chapter 3, we also use it for model validation. As described in Section 5.3.1, we develop

a PHM based on the data set for the period January 1 to August 15, use the obtained model

to forecast the risk for the period August 18 to October 9, and compare the forecast to

actual results from the data set.
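
The date-based split used for this validation can be sketched as follows. The record layout (a dictionary with a "date" key), the function name split_by_date, and the year passed in are assumptions for illustration; only the cut-off dates come from the description above.

```python
from datetime import date
from typing import Dict, List, Tuple


def split_by_date(records: List[Dict], year: int) -> Tuple[List[Dict], List[Dict]]:
    """Split records into a fitting set and a forecasting set by date.

    Fitting set:     January 1 to August 15 (inclusive).
    Forecasting set: August 18 to October 9 (inclusive).
    The year is a parameter because it is not restated in this appendix.
    """
    fit_start, fit_end = date(year, 1, 1), date(year, 8, 15)
    test_start, test_end = date(year, 8, 18), date(year, 10, 9)
    fit = [r for r in records if fit_start <= r["date"] <= fit_end]
    test = [r for r in records if test_start <= r["date"] <= test_end]
    return fit, test
```

Records dated between the two periods, or after the forecasting period, fall into neither set.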

Our usage of the data set as described above is summarized in Table G1:

| Chapter/Section | Specific usage | Statistical Model | Approach/Framework |
|---|---|---|---|
| 3.6.3 – 3.6.6 | Per-machine basis (Ring machine only) | PHM; Logistic regression (Appendix) | Analyzing risk of failure |
| 3.6.7 | Entire data set | PHM; Logistic regression (Appendix) | Analyzing risk of failure |
| 4 | Per-machine basis | Multiple regression; Learning curves | Production output forecasting |
| 5 | Entire data set | PHM; Multiple regression; Learning curves | Markov chain |

Table G1: Usage of data set in the empirical works presented throughout the dissertation

Data Set sample: EXAKT usage for analysis in Chapter 3:

Machine Working Age Experience Social Analytical X1 X2 Y1 Y2 V1 V2

Driven 0 42.86 70.83 27.12 0 1 1 0 1 0

Driven 8 100.00 77.08 78.67 0 0 1 0 1 0

Driven 16 42.86 41.15 23.05 1 0 1 0 1 0

Driven 24 42.86 70.83 27.12 0 1 1 0 0 0

Driven 32 100.00 77.08 78.67 0 0 1 0 0 0

Driven 40 42.86 41.15 23.05 1 0 1 0 0 0

Driven 48 42.86 70.83 27.12 0 1 1 0 0 0

Driven 56 100.00 77.08 78.67 0 0 1 0 0 0

Driven 64 42.86 41.15 23.05 1 0 1 0 0 0

Driven 72 100.00 77.08 78.67 0 1 1 0 1 0

Driven 80 42.86 41.15 23.05 0 0 1 0 1 0

Driven 88 42.86 70.83 27.12 1 0 1 0 1 0

Driven 96 100.00 77.08 78.67 0 1 1 0 0 0

Driven 104 42.86 41.15 23.05 0 1 1 0 0 0

Driven 112 42.86 70.83 27.12 1 0 1 0 0 0

Driven 120 100.00 77.08 78.67 0 1 1 0 0 0

Driven 128 42.86 41.15 23.05 0 0 1 0 0 0

Driven 136 42.86 70.83 27.12 1 0 1 0 0 0

Driven 144 100.00 77.08 78.67 0 1 1 0 0 0

Driven 152 42.86 41.15 23.05 0 0 1 0 0 0

Driven 160 42.86 70.83 27.12 1 0 1 0 0 0

Driven 168 100.00 77.08 78.67 0 1 1 0 0 1

Driven 176 42.86 41.15 23.05 0 0 1 0 0 1

Driven 184 42.86 70.83 27.12 1 0 1 0 0 1

Driven 192 42.86 41.15 23.05 0 1 1 0 1 0

Driven 7.5 42.86 70.83 27.12 0 0 1 0 1 0

Driven 15.5 100.00 77.08 78.67 1 0 1 0 1 0

Driven 23.5 42.86 41.15 23.05 0 1 1 0 0 0

Driven 31.5 42.86 70.83 27.12 0 0 1 0 0 0

Driven 39.5 100.00 77.08 78.67 1 0 1 0 0 0

Driven 47.5 42.86 41.15 23.05 0 1 1 0 0 0

Driven 55.5 42.86 70.83 27.12 0 0 1 0 0 0

Driven 63.5 100.00 77.08 78.67 1 0 1 0 0 0

Driven 71.5 42.86 41.15 23.05 0 1 1 0 0 0

Table G2: Data set sample for Chapter 3 analysis, using the EXAKT software

Description of indicator variables: X1, X2: representing three shifts; Y1, Y2: representing three machines; V1,

V2: representing start, middle and end of the week.
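
The indicator coding described above, in which two 0/1 variables represent a factor with three levels, can be sketched as follows. Which level serves as the baseline and which level maps to which indicator are not stated in this appendix, so the ordering used here, along with the function name two_indicators, is an assumption for illustration.

```python
from typing import Sequence, Tuple


def two_indicators(level: str, levels: Sequence[str]) -> Tuple[int, int]:
    """Encode a three-level factor as two 0/1 indicator variables.

    levels[0] is treated as the baseline and is coded (0, 0);
    levels[1] is coded (1, 0) and levels[2] is coded (0, 1).
    """
    if len(levels) != 3:
        raise ValueError("exactly three levels are required")
    if level not in levels:
        raise ValueError(f"unknown level: {level!r}")
    return (int(level == levels[1]), int(level == levels[2]))
```

Under this coding the two indicators are never both equal to one, which matches the pattern of the X1, X2 and V1, V2 columns in the sample.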

Data Set sample: SPSS usage for analysis in Chapter 4:

Machine Shift Production Experience Social Analytical X1 X2

DRVN 311 100.00 77.08 78.67 0 1

DRVN 185 42.86 70.83 27.12 1 0

DRVN 286 100.00 77.08 78.67 0 1

DRVN 175 42.86 41.15 23.05 0 0

DRVN 240 57.14 42.19 35.22 0 0

DRVN 307 100.00 75.00 92.33 0 1

DRVN 215 57.14 42.19 35.22 0 0

DRVN 242 57.14 67.71 45.94 1 0

DRVN 258 57.14 67.71 45.94 1 0

DRVN 307 100.00 75.00 92.33 0 1

DRVN 389 100.00 67.71 100.00 0 1

DRVN 278 57.14 40.10 48.87 0 0

DRVN 327 71.43 70.83 63.64 1 0

DRVN 347 71.43 70.83 63.64 0 0

DRVN 319 100.00 67.71 100.00 0 0

DRVN 261 57.14 40.10 48.87 1 0

DRVN 326 100.00 67.71 100.00 0 1

DRVN 265 57.14 40.10 48.87 1 0

DRVN 297 71.43 70.83 63.64 0 1

DRVN 248 42.86 41.15 23.05 1 0

DRVN 282 100.00 77.08 78.67 0 0

DRVN 312 100.00 77.08 78.67 0 0

DRVN 167 42.86 41.15 23.05 1 0

DRVN 229 42.86 70.83 27.12 0 1

DRVN 191 42.86 70.83 27.12 0 1

DRVN 318 100.00 75.00 92.33 0 1

DRVN 216 57.14 42.19 35.22 0 0

DRVN 231 57.14 67.71 45.94 1 0

DRVN 233 57.14 67.71 45.94 1 0

DRVN 204 57.14 42.19 35.22 0 0

DRVN 295 100.00 75.00 92.33 0 1

DRVN 213 57.14 42.19 35.22 0 0

DRVN 243 57.14 42.19 35.22 0 0

DRVN 340 100.00 67.71 100.00 0 0

DRVN 243 57.14 40.10 48.87 0 1

DRVN 296 71.43 70.83 63.64 0 0

DRVN 357 100.00 67.71 100.00 1 0

Table G3: Data set sample for Chapter 4 analysis, using the SPSS software