
The Pennsylvania State University

The Graduate School

Department of Industrial and Manufacturing Engineering

SELECTION OF A PREFERENCE AGGREGATION METHOD FOR

EMERGENCY ROOM NURSE TRIAGE DECISIONS

A Thesis in

Industrial Engineering and Operations Research

by

Erica B. Fields

2009 Erica B. Fields

Submitted in Partial Fulfillment

of the Requirements

for the Degree of

Master of Science

May 2009


The thesis of Erica B. Fields was reviewed and approved* by the following:

Gül E. Kremer

Associate Professor of Engineering Design and Industrial Engineering

Thesis Advisor

Russell Barton

Professor of Supply Chain and Information Systems

Richard J. Koubek

Professor of Industrial and Manufacturing Engineering

Peter and Angela Dal Pezzo Department Head Chair

*Signatures are on file in the Graduate School


ABSTRACT

Emergency Departments (EDs) in major urban areas are often overcrowded. To identify who is most in need of care, most EDs therefore use a triage system that sorts patients by severity of illness or injury. Triage is a decision-making process in which patients are prioritized according to their medical condition and chance of survival upon arrival at the ED. The Emergency Severity Index (ESI) is a five-level triage system used by a majority of hospitals to assist in this prioritization. However, considerable subjective judgment remains in the process, which leads to discrepancies among nurses. In this work, we seek to determine a suitable aggregation method that best represents individual nurse prioritizations and benefits an ED nurse staff by simplifying their decision-making. We chose five methods from the literature to explore and, through the application of utility theory combined with expert opinion, determined the best method. The similarities and differences between the methods are also discussed. To this end, this study first analyzes the discrepancies in prioritizing patients among nurses at different hospitals and within the same hospital. Nurses were given the same exercise of assigning an ESI number to several fictionalized patients and ranking them in order of importance. Using Spearman's rank correlation, the results show that there are differences in how nurses rank the patients, both across hospitals and even within the same hospital.

With this basis as justification for our work, utility theory is applied to select the best aggregation method for the situation. Next, four rank aggregation methods are applied to the prioritization data, and an expert then evaluates the results and judges them on practicality and acceptability. The recommended preference aggregation method is the estimation of utility intervals, which differs from the utility theory recommendation. Expert opinion is highly valued in a decision-making environment such as this, where experience and intuition are key to successful job performance and outcomes.


TABLE OF CONTENTS

LIST OF FIGURES ................................................................................................................. vii

LIST OF TABLES ................................................................................................................... viii

ACKNOWLEDGEMENTS ..................................................................................................... x

1 INTRODUCTION ........................................................................................................... 1

1.1 Problem .................................................................................................................... 1
1.2 Motivation ................................................................................................................ 2
1.3 Importance ................................................................................................................ 3

2 LITERATURE REVIEW ................................................................................................ 4

2.1 Triage in the Emergency Department ...................................................................... 4
2.2 Emergency Severity Index (ESI) .............................................................................. 7
2.3 Decision-Making in Triage ...................................................................................... 8
2.4 Aggregation Methods ............................................................................................... 10

2.4.1 Aggregation through the Estimation of Utility Intervals ................................. 13
2.4.2 Parallel Rank Aggregation .............................................................................. 14
2.4.3 Three Mathematical Programming Models ..................................................... 14
2.4.4 Aggregation using Ordered Weighted Averaging (OWA) Operator Weights . 15

3 DATA COLLECTION AND METHODOLOGY ........................................................... 17

3.1 Data Collection ......................................................................................................... 18
3.2 Methods .................................................................................................................... 19

3.2.1 Exploration of Discrepancies in Decision-Making ......................................... 19
3.2.2 Utility Theory .................................................................................................. 20
3.2.3 Application of Aggregation Methods to Data ................................................. 31
3.2.4 Expert Judgment on Applied Methods ............................................................ 31

4 RESULTS AND DISCUSSION ...................................................................................... 33

4.1 Discrepancy Analysis ............................................................................................... 34
4.2 Utility Theory-Based Assessment of Methods ......................................................... 36
4.3 Results of the Rank Aggregation Methods ............................................................... 38

4.3.1 Results of the Method utilizing the Estimation of Utility Intervals ................ 38
4.3.2 Results of the Method utilizing OWA Operator Weights, including the Borda-Kendall Method ................ 39
4.3.3 Results of the Method of Three Mathematical Programs ................................ 41

4.4 Expert Opinion ......................................................................................................... 42
4.5 Preference Aggregation Method Recommendation ................................................. 43

5 SUMMARY AND CONCLUSIONS .............................................................................. 45


REFERENCES ........................................................................................................................ 48

Appendix A: Interview Protocol .............................................................................................. 53

Appendix B: Assessment of Alternatives ................................................................................ 57

Appendix C: Remaining Footrule and Correlation Data ......................................................... 60


LIST OF FIGURES

Figure 1: ESI Triage Algorithm (Gilboy et al., 2005, pp.16) ................................................... 8

Figure 2: Research Methodology ............................................................................................. 17

Figure 3: ESI and Ranking Exercise ........................................................................................ 19

Figure 4: Hierarchical Model of Objectives, with attributes in parentheses. ........................... 22

Figure 5: SAU Function for Accuracy ..................................................................................... 26

Figure 6: SAU Function for Improvement ............................................................................... 26

Figure 7: SAU Function for Flexibility ................................................................................... 27

Figure 8: SAU Function for Time/Effort ................................................................................. 27

Figure 9: SAU Function for Complexity ................................................................................. 28

Figure 10: SAU Function for Implementation ......................................................................... 28

Figure 11: Determining k1 for Accuracy .................................................................................. 29

Figure 12: Determining k2 for Improvement ........................................................................... 30

Figure 13: Remaining k Values. ............................................................................................... 30

Figure 14: Boxplots, Means, and Standard Deviations of Footrule Calculations .................... 34


LIST OF TABLES

Table 1: Three-Level Acuity System (Grossman, 1999) ......................................................... 5

Table 2: Five-Level Acuity System (Grossman, 1999) ........................................................... 6

Table 3: Scale for Attributes .................................................................................................... 24

Table 4: Performance matrix ................................................................................................... 24

Table 5: Lottery and SAU Evaluation for Accuracy ................................................................ 25

Table 6: Lottery and SAU Evaluation for Improvement ......................................................... 26

Table 7: Lottery and SAU Evaluation for Flexibility .............................................................. 26

Table 8: Lottery and SAU Evaluation for Time/Effort ............................................................ 27

Table 9: Lottery and SAU Evaluation for Complexity ............................................................ 27

Table 10: Lottery and SAU Evaluation for Implementation .................................................... 28

Table 11: Prioritization Data Set 1 - Rankings from Susquehanna Health Williamsport Hospital ............ 33

Table 12: Prioritization Data Set 2: Rankings from Mount Nittany Medical Center .............. 33

Table 13: Prioritization Data Set 3: Rankings from Hershey Medical Center ......................... 33

Table 14: Footrule Data Set 1 - Pairwise Comparisons among all SHWH Nurses ................. 34

Table 15: T-test probabilities ................................................................................................... 35

Table 16: A Sample Chart of Correlation Pairing - SHWH vs. MNMC ................................. 35

Table 17: Spearman Rank Correlation Results ........................................................................ 35

Table 18: Overall Utility Results ............................................................................................. 37

Table 19: Estimation of Utility Intervals Method: Results ...................................................... 38

Table 20: Weights Determined by OWA Operator Weights ................................................... 39

Table 21: OWA Operator Method and BK Method: Results ................................................... 40

Table 22: Three Mathematical Programs: Results ................................................................... 41

Table 23: Footrule Data Set 2 - Pairwise Comparisons among all MNMC Nurses................. 60


Table 24: Footrule Data Set 3 - Pairwise Comparisons among all HMC Nurses .................... 60

Table 25: Footrule Data Set 4 - Pairwise Comparisons between SHWH and MNMC Nurses ............ 61

Table 26: Footrule Data Set 5 - Pairwise Comparisons between SHWH and HMC Nurses ... 61

Table 27: Footrule Data Set 6 - Pairwise Comparisons between MNMC and HMC Nurses .. 62

Table 28: Correlation Data Set 2 - Pairwise Comparisons among all SHWH Nurses ............. 62

Table 29: Correlation Data Set 3 - Pairwise Comparisons among all MNMC Nurses ............ 62

Table 30: Correlation Data Set 4 - Pairwise Comparisons among all HMC Nurses ............... 63

Table 31: Correlation Data Set 5 - Pairwise Comparisons between SHWH and MNMC Nurses ............ 63

Table 32: Correlation Data Set 6 - Pairwise Comparisons between MNMC and HMC Nurses ............ 64


ACKNOWLEDGEMENTS

First, I would like to thank my Lord and Savior, Jesus Christ, for loving me and blessing

me with the gifts and talents that I have and for orchestrating my Penn State experience the way

He has. I would also like to thank my advisor, Dr. Gül Kremer, for the much-needed guidance in

preparation for this work and her other advisees for their support and help, especially David

Claudio. Thanks also to Dr. Russell Barton for his support, and Dr. Carol Smith and the nursing

staff at Susquehanna Health Williamsport Hospital, Mt. Nittany Medical Center, and Hershey

Medical Center for their valuable participation in this thesis work.

My gratitude is also extended to that special group of Penn State graduate students for their care and concern and the abundant help they've provided to ensure that I'd thrive here as a person and a student. Finally, I would like to thank my family and friends for their love, faith, encouragement, and all the fun times we had that gave me such refreshing and needed breaks from my school work.

CHAPTER ONE

1 INTRODUCTION

1.1 Problem

The United States spends more of its gross domestic product (GDP) on its health system than any other country in the world, yet ranks 37th out of 191 countries in overall performance (WHO, 2005). It is difficult to identify the exact cause of such poor performance on the world stage, but with Americans living longer today, the need for a healthcare system that provides excellent patient care in the most efficient manner is not going to decrease. Many methods and tools that have been used in other industries to improve productivity and to increase quality and customer service may also be applied in the healthcare setting. Although there are readily applicable engineering solutions to aspects of healthcare, the setting is highly complex, largely because health conditions vary from patient to patient. Thus, the amount of care and the length of time required for healthcare delivery also vary, because both depend greatly on the patient's health condition (Reid, 2005).

Another issue present in the health system is overcrowding: there are more patients waiting for service or treatment than available resources or staff to attend to them. This is especially prevalent in Emergency Departments (EDs). Furthermore, some patients present at the ED with non-urgent conditions (Patel et al., 2007; Buesching et al., 1985). To maintain order in the ED and provide care first to the patients who most need it, most EDs utilize a triage system to sort patients by the severity of their condition (Andersson et al., 2006; Beveridge, 1998). While the most critical patients are seen immediately, others are initially assessed and then sent back to the waiting room until they can be seen. Thus, some patients end up waiting for long periods of time.

As patients wait, their condition may worsen or improve. This is the inherent dynamic nature of healthcare in the ED, and it must be considered by nurses and other staff when deciding who will receive treatment next. Frequent reassessment of patients is necessary to ensure that no patient reaches a life-threatening state without receiving the proper care. In particular, vital signs such as temperature, blood pressure, and heart rate may be changing, and there is uncertainty in the nurse decision-making process as to which signs or combinations of signs (and their specific levels) are more important to base a decision on. The development of decision support aids that make decision-making easier or more expedient for nurses could help here.

1.2 Motivation

The motivation for this research is the improvement of the US healthcare system by applying decision-making tools to assist ED nurses in their decision-making process. The working environment of an ED nurse can be very hectic and stressful, as it involves dealing with many patients in the waiting room, triaging and reassessing patients accordingly, and caring for patients who are receiving treatment in the ED. These conditions are further amplified because staff is often limited. Many judgments must be made as nurses work, probably the most important being the triage category assigned to each patient, as this influences the priority the patient will have for receiving treatment. A decision aid can reduce the stress and strain on nurses and also help reduce their workload.

This thesis analyzes discrepancies found in decisions made across nurses, and explores the aggregation of preferences in triaging and prioritizing patients according to the Emergency Severity Index (ESI). The ESI outlines five categories with clinically meaningful differences in projected resource needs and, therefore, associated operational needs (Gilboy et al., 2005; Zimmermann, 2001). The preferences need to be aggregated because studies have shown that much of triage decision-making is based on a nurse's experience, knowledge, and intuition (Patel et al., 2007; Andersson et al., 2006; Cone and Murray, 2002). Since these aspects differ from nurse to nurse, their determination of an ESI category, and the subsequent prioritization of a patient, may differ, especially for patients who are not the most urgent.

1.3 Importance

The ramifications of this work are far-reaching and can have great positive impact. If incorporated into decision support tools, an aggregation will provide an accurate overall representation of nurse decisions and aid in determining which patient should be seen next when patients share the same ESI level. Optimally, this will help increase efficiency and productivity in the ED.

CHAPTER TWO

2 LITERATURE REVIEW

This chapter presents a literature review of previous studies and information relevant to the topic of this work. It begins by addressing triage in the Emergency Department, followed by the Emergency Severity Index. Then, prior studies on decision-making in triage are presented. Finally, an overview of types of preference aggregation methods is given, and the individual methods examined in this study are discussed.

2.1 Triage in the Emergency Department

In order to effectively distribute limited resources in the Emergency Department, most hospitals implement a triage system to prioritize patients based on their presenting medical condition and their chance of survival on arrival at the ED (Andersson et al., 2006). Triage is a dynamic decision-making process, and the determination of who needs the most immediate care must be reassessed as contextual factors change and additional patient information becomes available (Patel et al., 2007). For example, if a patient has been waiting for some time, the levels of their vital signs may have changed. If so, their condition may now be more critical than at the initial triage, so they may need to be seen sooner than initially projected, if not immediately. During the initial triage assessment, the nurse checks the vital signs and physical condition of the patient (e.g., blood pressure, temperature, oxygen saturation level, respiration rate, and pulse), in addition to inquiring about their medical history, current problem and symptoms, and overall state of health (Andersson et al., 2006). The triage decision is then based on this initial assessment.


To provide a degree of standardization for triage across hospitals, three-level and five-level systems were developed and are used by EDs in the United States. Five-level systems have gained acceptance over three-level systems in recent years, as researchers have demonstrated the greater effectiveness and reliability of the former (Beveridge et al., 1999; Travers et al., 2002; Tanabe et al., 2004). Examples of a three-level system and a five-level system are provided in Table 1 and Table 2.

Table 1: Three-Level Acuity System (Grossman, 1999)

Level 1 (Emergent). Treatment and reassessment time: immediately.
Sample conditions: Cardiac arrest; severe respiratory distress; seizure; cardiac chest pain; anaphylaxis; uncontrolled hemorrhage; coma; severe head trauma; multiple trauma; open chest/abdominal wound; profound shock; poisoning with neurological changes; major burn; overdose of a rapidly acting drug or tricyclic antidepressant; active labor patient.

Level 2 (Urgent). Treatment and reassessment time: 15-120 minutes.
Sample conditions: Alcohol intoxication; abdominal pain; drug ingestion; noncardiac chest pain; urinary retention; severe emotional distress; renal calculi; minor chest pain; laceration; eye injury with vision intact; closed fracture; bleeding with stable vital signs.

Level 3 (Nonurgent). Treatment and reassessment time: 2-4 hours.
Sample conditions: Rash; strain and sprain; sore throat; earache.

Table 2: Five-Level Acuity System (Grossman, 1999)

Level 1 (Critical Condition). Treatment and reassessment time: immediately.
Sample conditions: Cardiac arrest; severe respiratory distress; seizure; cardiac chest pain; anaphylaxis; uncontrolled hemorrhage; coma; severe head trauma; multiple trauma; open chest/abdominal wound; profound shock; poisoning with neurological changes.

Level 2 (Unstable Condition). Treatment and reassessment time: 5-15 minutes.
Sample conditions: Major fracture; attempted suicide; severe headache; sexual assault survivor; acute asthma attack; active labor patient; aggressive patient; eye injury with loss of vision; pregnant with active bleeding; overdose of a rapidly acting drug or tricyclic antidepressant; major burn.

Level 3 (Potentially Unstable). Treatment and reassessment time: 30-60 minutes.
Sample conditions: Alcohol intoxication; abdominal pain; drug ingestion; noncardiac chest pain; urinary retention; severe emotional distress; renal calculi; minor chest pain; laceration; eye injury with vision intact; closed fracture; bleeding with stable vital signs.

Level 4 (Stable). Treatment and reassessment time: 1-2 hours.
Sample conditions: Cystitis; minor bites; male STD; vaginal discharge; sore throat; constipation; abscess; strain and sprain; minor burn; earache.

Level 5 (Routine). Treatment and reassessment time: 4 hours.
Sample conditions: Routine physical; suture removal with no complications; bruise; prescription refill.

2.2 Emergency Severity Index (ESI)

The Emergency Severity Index (ESI) is a five-level triage system that defines categories according to the patient's physical condition as well as the expected amount of resources the patient will need. Resources include doctors, nurses, and other staff, as well as technological devices such as X-ray machines, Magnetic Resonance Imaging (MRI) scanners, and electrocardiographs. Categories are defined by the clinically significant difference in resources needed from one level to the next (Gilboy et al., 2005; Zimmermann, 2001). The most acutely ill patients are categorized at the highest level, level 1, while lower-acuity patients are classified into level 3, 4, or 5 according to their anticipated resource needs (Tanabe et al., 2004). This helps determine whether the patient will undergo additional assessment, begin receiving treatment, or remain in the waiting room until resources become available.

The fourth revision of the ESI algorithm was presented in Gilboy et al. (2005) and appears in Figure 1. Anyone requiring immediate, life-saving attention must be assigned to level 1. High-risk situations and patients in great distress should be at level 2. Note that vital signs play a large role in the semi-urgent and non-urgent levels: adverse changes in vital signs over time, or dangerously abnormal levels, may result in a shift in designation.
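The decision flow described above can be sketched as a small function. This is illustrative only, not a clinical tool: the input flags and the resource threshold are our own shorthand for the algorithm's decision points, not the published clinical criteria.

```python
# Highly simplified sketch of the ESI decision flow described above.
# Illustrative only -- the boolean inputs and resource threshold are
# invented placeholders, not the published algorithm's criteria.

def esi_level(needs_lifesaving, high_risk_or_distress,
              expected_resources, vitals_in_danger_zone):
    if needs_lifesaving:
        return 1  # immediate, life-saving attention
    if high_risk_or_distress:
        return 2  # high-risk situation or great distress
    if expected_resources >= 2:
        # Many resources expected: level 3, unless dangerous vital
        # signs push the patient up to level 2.
        return 2 if vitals_in_danger_zone else 3
    return 4 if expected_resources == 1 else 5

print(esi_level(False, False, expected_resources=1,
                vitals_in_danger_zone=False))  # 4
```

The sketch mirrors the text's point that vital signs matter chiefly at the semi-urgent and non-urgent levels: they only enter the decision once the life-saving and high-risk checks have passed.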


Figure 1: ESI Triage Algorithm (Gilboy et al., 2005, pp.16)

2.3 Decision-Making in Triage

Much research has been done concerning the triage decision-making process, and studies have identified the main factors nurses consider when triaging patients. Andersson et al. (2006) posit that nurses weigh internal and external factors in assigning a patient's priority. Internal factors refer to the nurse's skills and personal capacity: skills comprise knowledge, intuition, and experience, while confidence, courage, and rationality define personal capacity. A nurse's workload and practical arrangement are examples of external factors. The workload consists of the number of patients waiting for the nurse and the other work required of the nurse. Practical arrangement, however, is more person-specific. It encompasses the personal priorities a nurse brings to the job in addition to, and hopefully not in opposition to, the priorities established by clinical rules. For example, a nurse may have a penchant for seeing children treated as soon as possible, even though an adult patient should rightfully be seen ahead of the child, either due to a more urgent condition or because that person has been waiting longer with a semi- or non-urgent condition. In dealing with uncertainty in particular, it has been found that nurses rely most on the representativeness heuristic when determining a patient's triage category (Cioffi, 1998): they recall a similar situation experienced in the past and base the current triage judgment on it.

Further, Considine et al. (2007) examined the independent roles that factual knowledge and experience play in triage decisions. They define knowledge as factual, procedural, and conceptual; proper integration of the three types yields knowledge applicable to a range of clinical situations, not only triage. Experience, in turn, is defined in terms of three criteria: passage of time, gaining skills and knowledge, and exposure to an event. The paper finds that although knowledge and experience are linked, factual knowledge appears to play the more important role in triage decisions. Additionally, no significant relationship has been found between experience and improved decision-making, nor between personal characteristics of nurses (e.g., experience and triage education) and their ability to triage or the accuracy of their triage decisions (Göransson et al., 2005).

Gurney (2004) discusses how, when several patients have the same acuity level, the nurse is forced to use additional knowledge to discern the order in which they are seen. For the sake of example, assume the patients being decided upon all arrived at the ED at the same time and were triaged into the same category, and that some time has passed before it is appropriate to call back the next patient. For dynamic situations such as these, "the decision maker identifies relevant environmental cues, and matches them to plausible goals/tasks combinations, and identifies familiar situations" (Constanze et al., 2005, pp. 166). Thus, Constanze et al. suggest that decision makers (DMs) look for expected results according to familiar situations, as well as cues that let them compare the current situation with a similar one from the past. If they find a successful match, the decision maker proceeds to act much as he or she did the previous time. Nurses would thus remember outcomes from past similar situations and select the next patient to be seen accordingly. In unfamiliar situations, deliberations are necessary, which means that decision makers "…either identify a familiar situation in an unfamiliar setting or create a new solution" (Constanze et al., 2005, pp. 166). These processes are called simulation and story building.

There is no existing literature on the quantitative analysis of nurse triage decisions, or on the application of preference aggregation to a multi-person decision-making problem such as this one. With this work, we attempt to fill that void.

2.4 Aggregation Methods

How to aggregate individual preferences into one overall preference representing the group, or a consensus, has been studied extensively (Yang, 2005). The history of aggregation methods to date can be categorized into four areas: early efforts using weighted sums, studies of simple group consensus, the use and incorporation of distance measures, and alternative frameworks.

The earliest effort to study the rank aggregation problem was made by Borda; Kendall later studied it from a statistical viewpoint and arrived at the same conclusion (Borda, 1784; Black, 1958; Kendall, 1962). Borda explored an election problem and proposed determining the rank of candidates according to the sum of the ranks given to them by voters. If there were m candidates and each voter ranked every candidate with no ties, the highest rank received a weight of m, the next highest rank received a weight of m-1, and so on, so that the lowest rank received a weight of 1 (Wang et al., 2007a). The final rankings are determined by a weighted sum, where the alternative with the highest sum is most preferred, followed by the other alternatives in descending order of their sums. Because this method determines weights to be used in a weighted sum, it is called a weight-determining method. With its simple calculations, the Borda-Kendall (BK) method, as it is commonly known, is the most widely used technique for rank aggregation, and many other weight-determining methods have been developed from it.
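To make the weighted-sum mechanics concrete, here is a minimal sketch of BK-style aggregation in Python; the function name and the three hypothetical voter rankings are ours, for illustration only.

```python
# Minimal sketch of Borda-Kendall aggregation: with m candidates and no
# ties, a candidate ranked first by a voter earns weight m, second earns
# m - 1, and so on; candidates are ordered by total weighted sum.

def borda_kendall(rankings):
    """Aggregate rankings (each a list of candidates, best first)."""
    m = len(rankings[0])
    scores = {}
    for ranking in rankings:
        for position, candidate in enumerate(ranking):
            # position 0 (best) earns m; position m-1 (worst) earns 1
            scores[candidate] = scores.get(candidate, 0) + (m - position)
    # Highest total is most preferred; sort descending by score.
    return sorted(scores, key=scores.get, reverse=True)

# Three hypothetical voters ranking candidates A, B, C (best first):
print(borda_kendall([["A", "B", "C"],
                     ["B", "A", "C"],
                     ["A", "C", "B"]]))
# A: 3+2+3 = 8, B: 2+3+1 = 6, C: 1+1+2 = 4, so the consensus is A, B, C
```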

The simplest, and perhaps most frequently used, way to draw a group consensus is the majority rule, which dictates that the alternative receiving the most votes is declared the winner. Arrow (1951) argued that any aggregation or consensus drawn from individual preferences must satisfy certain social welfare axioms. Inada (1964) and others since have studied Arrow's axioms further and developed methods satisfying them, including one based on the majority-rule idea.

Kemeny and Snell (1962) first studied the use of distance measures in rank aggregation and proposed their own set of axioms, similar to Arrow's. A distance measure quantifies how close two vectors are to each other. To illustrate, we describe the ℓ1-metric, also known as the Manhattan or city-block metric. Assuming that x and y are two n-dimensional vectors,

Equation 1: ℓ1-metric

ℓ1(x, y) = Σ_{i=1}^{n} |x_i − y_i|

Similar to this is the Spearman footrule distance, which is used for ordinal data (ranks) instead of quantitative data. If A orders his preference among n items in vector x and B orders his preference on the same n items in vector y, the footrule distance of preference between A and B is:

Equation 2: Spearman footrule distance

d_AB = Σ_{i=1}^{n} |x_Ai − y_Bi|
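As a minimal sketch of this distance (the example rank vectors are hypothetical):

```python
def footrule(x, y):
    """Spearman footrule distance between two rank vectors x and y:
    the sum of absolute rank differences, item by item."""
    assert len(x) == len(y)
    return sum(abs(xi - yi) for xi, yi in zip(x, y))

print(footrule([1, 2, 3, 4], [1, 2, 3, 4]))  # 0: identical rankings
print(footrule([1, 2, 3, 4], [4, 3, 2, 1]))  # 8: fully reversed rankings
```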

Bogart (1973, 1975) studied distance measures applied to partial orderings. For a set of n

items, a partial ordering is a ranking where only a subset of the n items are ranked. Also, Cook

and Seiford (1978) limited their study to rankings where no ties are allowed, called complete

ordinal rankings, and developed axioms similar to Kemeny and Snell‘s. About two decades later,

a general model for drawing a distance-based consensus was introduced (Cook et al., 1996).

The last area identified concerning previous research on rank aggregation methods is the

development of different frameworks. Researchers have developed heuristics and used methods

such as data envelopment analysis (DEA) and extreme-point approaches to arrive at a consensus.

Methods like these most often have mathematical programming as an essential part of the

solution process. As an example of this kind of work, Cook and Kress (1990) proposed a DEA

model for aggregating preference rankings, and found it to be equivalent to the BK method under

certain circumstances.

For this project, we focus on five aggregation methods from the literature. Four methods

are outlined in the following subsections, and the fifth method is the BK method, discussed

earlier. These techniques have been selected out of all methods studied because they are not

overly specific; they are designed for broad applications. For example, the chosen methods

basically require numerical ranking data, which are lists of permutations of 1, 2, 3, …, n, if there are n items to be ranked. On the other hand, a more specific method may require fuzzy

preference relations to be defined, or bipolar preferences (Peneva and Popchev, 2007; Öztürk and

Tsoukiàs, 2008). Our current problem would not fit very well into the more specific methods

found in the literature. Also, the chosen methods are adaptable and flexible. For example, some of them can handle a set of rankings where ties are present. In others, the decision maker (DM) can shape the result by specifying a parameter or, in the case of a weight-determining method, an additional preference for the weights to satisfy. Finally, we chose different types

of methods: some weight-determining, some DEA-based, and some utilizing distance measures.

Different types were chosen in order to obtain results according to different decision rules. This

way, if similar results are achieved, it will not be due to unintentional, repeated evaluations under

the same decision rule.

2.4.1 Aggregation through the Estimation of Utility Intervals

In this method, constructed by Wang et al. (2005), individual preference rankings are

viewed as constraints on utilities and linear programming (LP) models are used to estimate the

utility intervals. Then, a weighted average sum is used to aggregate the intervals for each

alternative. Finally, a simple, yet practical interval comparison method is used to determine the

overall ranking. The interval comparison method also provides information on the degree to

which one interval is preferred to another, and in the final ranking, gives a percentage of how

much a higher-ranked alternative is preferred to a lower-ranked alternative. This method is

suggested for group decision-making, social choice, and committee elections. It has been

previously used in voting systems (Tamiz and Foroughi, 2007).
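Wang et al. (2005) use their own interval comparison method; as an illustration of the general idea only (the degree-of-preference formula below is a common one from the interval-comparison literature, not necessarily theirs), the degree to which one utility interval is preferred to another can be computed as:

```python
def preference_degree(a, b):
    """Degree to which interval a = (aL, aU) is preferred to b = (bL, bU),
    a value in [0, 1]; 0.5 means the two intervals are indistinguishable.
    This is a standard possibility-degree style formula, used here purely
    as an illustration of interval comparison."""
    aL, aU = a
    bL, bU = b
    width = (aU - aL) + (bU - bL)
    if width == 0:                       # two point values, no uncertainty
        return 0.5 if aL == bL else float(aL > bL)
    return (max(0.0, aU - bL) - max(0.0, aL - bU)) / width

print(preference_degree((0.6, 0.8), (0.1, 0.3)))  # ~1.0: a fully dominates b
print(preference_degree((0.0, 1.0), (0.0, 1.0)))  # 0.5: identical intervals
```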


2.4.2 Parallel Rank Aggregation

Beg (2004) introduces a method for rank aggregation by optimizing the Spearman

footrule distance between the aggregated ranking and the rankings from the DMs, which is called

footrule optimal aggregation (FOA). Spearman footrule distance is a measure of the difference

between two rankings. It is applied to meta-searching on the World Wide Web, where the decision maker's preferences are ranked lists from different search engines. Since these lists may be very lengthy, partial lists are used to determine the aggregation. Because FOA on partial lists is NP-hard, Beg proposes a genetic algorithm approach as a better solution method, despite its potentially longer computation time. Beg also shows that the genetic algorithm technique performs better than the conventional Borda-Kendall method (Beg, 2004).
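For small n, footrule optimal aggregation can be solved exactly by enumerating all n! full rankings (a toy sketch with hypothetical inputs; Beg's genetic algorithm targets the long partial lists for which this enumeration is hopeless):

```python
from itertools import permutations

def footrule(x, y):
    """Spearman footrule distance between two rank vectors."""
    return sum(abs(a - b) for a, b in zip(x, y))

def foa_brute_force(rankings):
    """Return the full ranking minimizing the total footrule distance to all
    input rankings. Exponential in n -- feasible only for small n."""
    n = len(rankings[0])
    best = min(permutations(range(1, n + 1)),
               key=lambda cand: sum(footrule(cand, r) for r in rankings))
    return list(best)

# Two DMs agree and one fully disagrees; the consensus follows the majority.
print(foa_brute_force([[1, 2, 3], [1, 2, 3], [3, 2, 1]]))  # [1, 2, 3]
```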

2.4.3 Three Mathematical Programming Models

Wang et al. (2007a) suggests another weight-determining model for rank aggregation.

Even though three models are proposed, two LP models and one nonlinear programming (NLP)

model, we consider them as a set from which the DM should choose only one. The models are

straightforward and simple to use, providing objective weights, as well as final rankings, and not

requiring any parameter to be specified by the DM. Further, the models put more emphasis on the

1st ranking place, using the strong ordering constraint w1 ≥ 2w2 ≥ ⋯ ≥ m·wm, where wj is the weight of the jth ranking place. Although no parameters are specified, which usually play a role in determining the 2nd through mth places, their results show that the models produce strong, stable final rankings. LP-1 and LP-2 maximize the minimum total scores of all n items. The differences

in the models are that the LPs generate the same set of weights for all alternatives and the NLP

determines the most favorable weights for each alternative. Also, LP-1 requires that the weights

sum to 1, while LP-2 does not, and LP-2 requires that each alternative's score be less than or

equal to 1, while LP-1 does not. All three are adapted from a data envelopment analysis (DEA)

model by Cook and Kress (1990). They have previously been used in preferential voting and

election systems.
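To make the strong ordering constraint concrete, here is a small check (a sketch with hypothetical weight vectors; it illustrates the constraint structure, not the LP solutions themselves). Notably, the normalized Borda weights violate strong ordering, one way to see that these models emphasize the 1st place more heavily than BK does:

```python
def satisfies_strong_ordering(w, tol=1e-12):
    """Check w1 >= 2*w2 >= 3*w3 >= ... >= m*wm,
    i.e. the sequence j*wj is non-increasing in j."""
    scaled = [j * wj for j, wj in enumerate(w, start=1)]
    return all(scaled[j] + tol >= scaled[j + 1] for j in range(len(scaled) - 1))

m = 4
harmonic = sum(1 / j for j in range(1, m + 1))
# Weights proportional to 1/j satisfy the constraint with equality (j*wj constant):
w_inverse = [1 / (j * harmonic) for j in range(1, m + 1)]
# Normalized Borda weights m, m-1, ..., 1:
w_borda = [(m - j + 1) / (m * (m + 1) / 2) for j in range(1, m + 1)]

print(satisfies_strong_ordering(w_inverse))  # True
print(satisfies_strong_ordering(w_borda))    # False: here w1 = 0.4 < 2*w2 = 0.6
```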

2.4.4 Aggregation using Ordered Weighted Averaging (OWA) Operator Weights

For this method, the basic premise of a traditional rank aggregation method holds, where

different ranking places are assigned weights representative of their importance to the overall

solution, and the overall aggregation is achieved through a simple weighted sum. The weights are

also normalized. "OWA operators … provide a unified framework for decision making under uncertainty, where different decision criteria … are characterized by different OWA operator weights" (Wang et al., 2007b, p. 3357). Thus, the authors propose using OWA operator weights

because they suggest a weighted average sum is similar enough to an OWA operator to warrant

using its weights. Orness, a measure associated with the weight vector, is a value in the interval

[0, 1] and assesses the degree to which the DM emphasizes the higher ranking places. It is termed

the optimism level, α, of the DM. For example, an optimism level of 1 means that all weight is

placed on the 1st ranking place and an optimism level of 0.5 ensures that all ranking places are

equally considered. This method can be used with varying optimism levels toward the same

problem, giving the DM the opportunity to choose an appropriate solution by optimism level and

view the stability of the solution in terms of the results given by other optimism levels. Also, the

paper proves that the Borda-Kendall method corresponds to α = 2/3 (Wang et al., 2007b). This

method provides more choices and flexibility for DMs than BK, and has previously been used in

preferential voting and election systems, in parameterized estimation of fuzzy random variables,

and in querying systems of a hospital's database (Liu, 2009; Wang et al., 2006). Further, OWA


operators have been used in aggregating criteria functions in multicriteria decision-making

(Yager, 1988).
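The α = 2/3 correspondence is easy to verify numerically (a small check using the standard orness definition for an m-place weight vector):

```python
def orness(w):
    """Orness of an OWA weight vector w (Yager, 1988):
    (1/(m-1)) * sum_{j=1..m} (m - j) * w_j, a value in [0, 1]."""
    m = len(w)
    return sum((m - j) * wj for j, wj in enumerate(w, start=1)) / (m - 1)

# Normalized Borda-Kendall weights m, m-1, ..., 1 for m = 8 ranking places:
m = 8
bk = [(m - j + 1) / (m * (m + 1) / 2) for j in range(1, m + 1)]

print(abs(orness(bk) - 2 / 3) < 1e-12)  # True: BK corresponds to orness 2/3
```

The same check passes for any m, consistent with the equivalence proved by Wang et al. (2007b).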

The literature review provides the background for the work presented in this thesis and for the general subject area, triage decision-making. The methods chosen, in addition to BK, are the most suitable for our data, and the most fitting preference aggregation method will be selected among them.

CHAPTER THREE

3 DATA COLLECTION AND METHODOLOGY

The study is organized into connected stages that explain why this work is necessary and describe the research methodology (see Figure 2). After data collection, the need for aggregation is

justified through illuminating the discrepancies that exist in the data. Then, utility theory is

applied to determine the method with the highest utility, according to the DM's preferences. The

application of the methods to the data follows. An expert examines the results gained from

applying the methods and suggests the method that performs the best in practice. The decisions

from the expert and utility theory are both considered in the recommendation of the best method

for preference aggregation of ED nurse triage prioritizations.

Figure 2: Research Methodology. (Flow: Data Collection → Discrepancy Analysis → Aggregation Study, comprising Application of Methods to Ranking Data and Preference Selection by way of Utility Theory → Expert Assessment of Results → Recommendation.)

3.1 Data Collection

Data was collected in three clinical settings: Susquehanna Health Williamsport Hospital

(SHWH), Mount Nittany Medical Center (MNMC), and Hershey Medical Center (HMC). In

order to not impact clinical activities adversely, our team visited the EDs at these locations during their off-peak operational times (e.g., 4:00 – 6:00 am). Each interview took approximately 30 minutes. As the structured interview protocol, a set of questions was prepared (provided in Appendix A: Interview Protocol) that aimed at understanding the background and

training level of the triage nurses as well as their professional opinion on clinical issues (e.g., the

relative importance of vital signs, etc). Finally, the interview ended with a 3-minute exercise,

shown in Figure 3, where interviewees were asked to provide the ESI level and priorities for 8

patients, for which we only provided the vital signs, age, and gender data.

The vital signs included were temperature (°F), heart rate (beats/minute), respiration rate

(breaths/minute), systolic blood pressure (mm Hg), and diastolic blood pressure (mm Hg). All the

hospitals used ESI and as nurses completed the exercise, they could use the ESI algorithm, if

desired. The 8 patient scenarios of the exercise were constructed so that all patients would be

viewed as semi-urgent or non-urgent, which most likely suggests an ESI level categorization of 3,

4, or 5. There are fewer obvious distinguishing factors among these patients as opposed to severely

acute patients, especially since symptoms and conditions are not provided. As mentioned earlier,

overcrowded ED waiting areas are mainly composed of patients who would fall in these

categories. So, results from a study that highlights this group would be most beneficial to

possibly alleviate problems in the ED, which is the reason a data set limited to this group of

patients was created. At the conclusion of data collection, a total of 36 nurses had been interviewed: 14 at SHWH, 12 at MNMC, and 10 at HMC. The Pennsylvania State University's Institutional Review Board (IRB) granted approval to this study as IRB# 29351, "Triage Decision-Making." The preliminary findings on these ESI level assignments and prioritizations are reported in Fields et al. (2009).

Patient # | Gender | Age | Temperature (°F) | Pulse (beats/min) | Respiration Rate (breaths/min) | Systolic BP (mm Hg) | Diastolic BP (mm Hg)
1 | M | 18 | 101.8 | 97 | 26 | 125 | 79
2 | M | 40 | 98.0 | 110 | 20 | 150 | 92
3 | F | 25 | 99.1 | 94 | 28 | 120 | 80
4 | F | 7 | 98.3 | 115 | 18 | 145 | 90
5 | F | 33 | 101.2 | 75 | 23 | 130 | 85
6 | M | 24 | 97.8 | 92 | 29 | 120 | 81
7 | F | 3 | 100.7 | 80 | 25 | 128 | 83
8 | M | 55 | 97.8 | 80 | 18 | 125 | 96

Figure 3: ESI and Ranking Exercise (the ESI and Rank columns, left blank here, were to be completed by each nurse)

3.2 Methods

3.2.1 Exploration of Discrepancies in Decision-Making

To show that there is a need for preference aggregation of nurse prioritization data, the

discrepancies that exist are illuminated. After collection and tabulation of the data points,

Spearman's footrule is used to measure the ranking (prioritization) differences. Microsoft Excel

and Minitab are used to perform the calculations and achieve descriptive statistics. To get an

accurate picture of the relationships, we calculated the footrule distances for all possible pairwise

combinations of nurses in each hospital, and across each pair of hospitals. The rankings and


footrule distance data appear in Section 4.1 and in Appendix C: Remaining Footrule and

Correlation Data. Then, mean and variance values were calculated and boxplots were constructed

to examine the shapes and central tendencies of the data. We have also computed Student's t-test (two-tailed) probabilities for each location pair to discern whether variation in the footrule values across hospitals was significantly different. Then, for each hospital, we found the Spearman rank

correlation coefficient for all possible pairs of prioritizations. This non-parametric rank statistic

was proposed by Spearman in 1904 as a measure of the strength of association between two

variables (Lehmann and D'Abrera, 1998). For n items, if there are no ties, the rank correlation

coefficient, ρ, is given by:

Equation 3: Spearman Rank Correlation Coefficient

ρ = 1 − (6 Σ_{i=1}^{n} d_i²) / (n(n² − 1))

where d_i = x_i − y_i = the difference between the ranks of corresponding values X_i and Y_i. After that,

we found the correlations across hospitals, pairwise comparing the rankings from SHWH with

those of MNMC, SHWH with HMC, and MNMC with HMC.
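Equation 3 translates directly to code (a minimal implementation for tie-free rankings):

```python
def spearman_rho(x, y):
    """Spearman rank correlation for two tie-free rank vectors:
    rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), with d_i = x_i - y_i."""
    n = len(x)
    d2 = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Eight items, matching the 8-patient exercise:
print(spearman_rho([1, 2, 3, 4, 5, 6, 7, 8], [1, 2, 3, 4, 5, 6, 7, 8]))  # 1.0
print(spearman_rho([1, 2, 3, 4, 5, 6, 7, 8], [8, 7, 6, 5, 4, 3, 2, 1]))  # -1.0
```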

3.2.2 Utility Theory

Utility theory is essential here because the DM preferences for the various methods

considered are dependent on the present situation, and we will be considering trade-offs among

the alternatives under uncertainty. For this portion of the work, the researcher also serves as the

DM. In particular, the aspects of the present situation framing this problem are the required level

of expertise of the decision maker and time schedule for completion of the larger project. Under

different circumstances, the problem could be considered a value theory problem, where preferences and trade-offs are considered under certainty, or an optimization problem.

3.2.2.1 Decision Context and Objectives

For the decision context of selecting a preference aggregation method, there are two main

objectives. One is to maximize the representation of all individual nurse opinions in the

aggregation. This means that a method is desired that produces an aggregation reflecting the true wishes of the nurses; we refer to this quality as accuracy. The method should not be sensitive to

data structures that have symmetries. The reason is that if the effects of these symmetries are not

recognized and cancelled, there will be some bias in the solution. Also, it would be helpful for the

decision maker to know the accuracy level of the solution, or to be able to specify an accuracy level if desired. The other objective is to maximize the ease of use of the

method. Having a method that is easy to use will provide solutions quickly and with little effort.

If the project is extended and the method used in an actual Emergency Department, the decision

makers will be able to benefit from the solution without much cost (e.g., time and energy not

spent on patients).

3.2.2.2 Hierarchical Model of Objectives

The two fundamental objectives, to maximize representation of all nurses' opinions and

maximize ease of use, can be further divided into more specific objectives. These more specific objectives are means to achieving the fundamental objectives, and are termed means objectives. They give us a way to discern the qualities most indicative of an alternative's success. From these, we determine the attributes. Our hierarchical model, a means-end network,

is Figure 4 and will be described next.


Figure 4: Hierarchical Model of Objectives, with attributes in parentheses.

Objectives and attributes relate closely to one another; therefore, in addressing the means

objectives, we simultaneously explain the basis of their attributes as well. It was especially

difficult to separate the two, in considering our first objective. The attributes will be further

addressed in the next subsection. There are three means objectives for the first major objective.

Previously, the term accuracy represented the first major objective. From this point on, accuracy

will be a means objective that refers to a quality of the method under consideration. We desire to

maximize accuracy, and a method that possesses a good way of doing so is sought. For this, it is also an asset for a method to have a feature that presents the accuracy of the solution to the DM.

[Figure 4 diagram: Maximize usefulness of an aggregation method for emergency room nurse triage decisions, divided into (1) maximize representation of all nurses' opinions in a triage decision, with means objectives accuracy in satisfying all or most preferences (1-7 scale), improvements over other methods (1-7 scale), and flexibility to suit a variety of data types (1-7 scale); and (2) maximize ease of use, with means objectives time to achieve solution/effort required (1-7 scale), complexity of process (number of steps), and implementation (1-7 scale).]

Another means is to maximize improvement over other methods. The improvements one method

provides over its predecessors speak to its capabilities of producing a stronger result. Finally, we

prefer to maximize flexibility in handling different types of data, such as incomplete ranking data

or data that may include ties from one or more DMs.

As it relates to maximizing ease of use, three means objectives are identified. One is to

minimize the time required to achieve a solution and the effort required to use the method. Effort

is defined as the energy one puts forth in understanding the concepts and applying them in the use

of the method. Another is to minimize the complexity of the method. Finally, an aggregation

method that is simple to implement is preferred. The method should be easily implemented in widely used computer programs or existing software.

3.2.2.3 Specifying Attributes

Because many of the means objectives are relative to the DM's ability and perspective and have not been experimented with, we develop constructed attributes to measure the important aspects of the alternatives. The DM's ability is best characterized as intermediate-to-advanced knowledge and skill in widely known and available mathematical programming software, advanced expertise with Microsoft Excel, and beginner-to-intermediate knowledge and proficiency in the C++ programming language. Alongside this is the DM's perspective, which is

not that of a triage nurse, but of a potential patient considering hypothetically what would be most

helpful to the nurse assigned for her care. The DM's perspective is also influenced by discussions with nurses participating in this study. The discussions helped determine which objectives

would be most helpful to achieve and how the methods should be assessed, taking into account

the perceived learning curve and ability level of the average triage nurse. For each objective


besides complexity of process, we used a scale from 1 to 7. The meaning of the numbers of the

scale is in Table 3. For complexity of process, the attribute is number of steps in the method and a

smaller number of steps is preferred. Even though it is not a scale, this is still considered a

constructed attribute because the authors did not provide lists of steps in the literature, which led

the DM to interpret steps from the procedures described for each method. In addition, the DM

will not have practiced the methods, so the construction of the measurements is inherently

subjective. This creates uncertainty in the problem and supports utility theory as an appropriate way to examine it. The performance matrix of the five alternatives appears in Table 4.

Again, all means objectives are to be maximized, except complexity. The detailed, full assessment

of alternatives versus the objectives appears in Appendix B: Assessment of Alternatives.

Table 3: Scale for Attributes

Number Meaning

7 Excellent

6 Very Good

5 Good

4 Neutral

3 Fair

2 Poor

1 Unacceptable

Table 4: Performance matrix

Attributes: 1. Accuracy | 2. Improvement | 3. Flexibility | 4. Time/Effort | 5. Complexity | 6. Implementation
(The first three attributes fall under maximize representation of all opinions; the last three under maximize ease of use.)

1. Utility intervals | 6 | 7 | 7 | 5 | 6 | 7
2. Genetic Algorithm | 6 | 5 | 3 | 2 | 3 | 2
3. OWA operator | 7 | 7 | 4 | 6 | 6 | 7
4. 3 MP models | 2 | 5 | 4 | 6 | 3 | 7
5. BK Method | 3 | 3 | 2 | 5 | 2 | 7

3.2.2.4 Single Attribute Utility (SAU) Functions

To determine the individual preferences of the attributes, the DM employed lottery

questions, as suggested by Raiffa and Keeney (1976) and Hazelrigg (1996). In these, the DM

searches for a point where she would be indifferent between a lottery of the best and worst outcomes and accepting a guaranteed value. This guaranteed value is called the certainty equivalent (CE), and the search identifies the probability p at which the DM is indifferent between the certain result and the lottery, where there is a probability p of obtaining the best value and a probability 1-p of obtaining the worst value. For every attribute measured by the 1 to 7 scale, the best payoff in their

respective lottery is 7 and the worst payoff is 1. Thus, the utility of 7 is 1, u(7) = 1, and the utility

of 1 is 0, u(1) = 0. For the complexity attribute, the best payoff is 2 steps and the worst payoff is

10 steps. Various certainties and certainty equivalents were determined and the DM preferences

for all attributes, except complexity, reflect a risk-averse attitude. The complexity attribute

follows a risk-prone attitude. The exponential function Ui(xi) = A − B·e^(−xi/RT) is used to represent the

utility function of attribute i, where RT is the risk tolerance and A and B are parameters

guaranteeing that the function results in values between 0 and 1. In order to determine the correct

function to fit the lotteries with the associated certainties and certainty equivalents, we employed

the Goal Seek function for the utility of the CE in Microsoft Excel, varying the RT value. See

Figure 5 - Figure 10 and Table 5 - Table 10 for the risk curves and the utility functions.
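The Goal Seek step can be reproduced with a simple bisection (a sketch under our reading of the procedure; using the accuracy lottery u(1) = 0, u(7) = 1, u(5) = 0.85, it recovers RT ≈ 3.04, matching the fitted curve U1 in Figure 5):

```python
import math

def fitted_sau(rt, worst=1.0, best=7.0):
    """Exponential SAU U(x) = A - B*exp(-x/RT), with A and B chosen so that
    U(worst) = 0 and U(best) = 1 (risk-averse shape for RT > 0)."""
    b = 1.0 / (math.exp(-worst / rt) - math.exp(-best / rt))
    a = b * math.exp(-worst / rt)
    return lambda x: a - b * math.exp(-x / rt)

def solve_rt(ce, target, lo=0.5, hi=50.0, iters=100):
    """Bisect on RT so that the utility of the certainty equivalent matches
    the target expected utility; U(ce) decreases in RT on this bracket."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if fitted_sau(mid)(ce) > target:
            lo = mid          # utility too high -> need a larger RT
        else:
            hi = mid
    return (lo + hi) / 2

rt = solve_rt(ce=5.0, target=0.85)
print(rt)   # close to 3.04, consistent with U1(x) = 1.161 - 1.614*exp(-x/3.04)
```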

Table 5: Lottery and SAU Evaluation for Accuracy

Lottery: Best = 7 (expected utility 1, U1 = 1); Certainty = 5 (expected utility 0.85, U1 = 0.849832); Worst = 1 (expected utility 0, U1 = 0)
Evaluation: 1. Utility intervals: consequence 6, U1 = 0.937157 | 2. Genetic Algorithm: 6, 0.937157 | 3. OWA operator: 7, 1 | 4. 3 MP models: 2, 0.325575 | 5. BK Method: 3, 0.559875

Figure 5: SAU Function for Accuracy

Table 6: Lottery and SAU Evaluation for Improvement

Lottery: Best = 7 (expected utility 1, U2 = 1); Certainty = 5 (0.8, U2 = 0.800234); Worst = 1 (0, U2 = 0)
Evaluation: 1. Utility intervals: consequence 7, U2 = 1 | 2. Genetic Algorithm: 5, 0.800234 | 3. OWA operator: 7, 1 | 4. 3 MP models: 5, 0.800234 | 5. BK Method: 3, 0.488005

Figure 6: SAU Function for Improvement

Table 7: Lottery and SAU Evaluation for Flexibility

Lottery: Best = 7 (expected utility 1, U3 = 1); Certainty = 4 (0.9, U3 = 0.899979); Worst = 1 (0, U3 = 0)
Evaluation: 1. Utility intervals: consequence 7, U3 = 1 | 2. Genetic Algorithm: 3, 0.778458 | 3. OWA operator: 4, 0.899979 | 4. 3 MP models: 4, 0.899979 | 5. BK Method: 2, 0.525706

[Figure 5 plot: U1(x1) = 1.161 − 1.614·e^(−x1/3.04).]
[Figure 6 plot: U2(x2) = 1.355 − 1.694·e^(−x2/4.478).]

Figure 7: SAU Function for Flexibility

Table 8: Lottery and SAU Evaluation for Time/Effort

Lottery: Best = 7 (expected utility 1, U4 = 1); Certainty = 5 (0.6, U4 = 0.625); Worst = 1 (0, U4 = 0)
Evaluation: 1. Utility intervals: consequence 5, U4 = 0.625 | 2. Genetic Algorithm: 2, 0 | 3. OWA operator: 6, 0.8125 | 4. 3 MP models: 6, 0.8125 | 5. BK Method: 5, 0.625

Figure 8: SAU Function for Time/Effort

Table 9: Lottery and SAU Evaluation for Complexity

Lottery: Best = 2 steps (expected utility 1, U5 = 1); Certainty = 5 (0.9, U5 = 0.899881); Worst = 10 (0, U5 = 0)
Evaluation: 1. Utility intervals: consequence 6, U5 = 0.829881 | 2. Genetic Algorithm: 3, 0.978675 | 3. OWA operator: 6, 0.829881 | 4. 3 MP models: 3, 0.978675 | 5. BK Method: 2, 1

[Figure 7 plot: U3(x3) = 1.013 − 2.106·e^(−x3/1.366).]
[Figure 8 plot: U4(x4) = 5.629×10^14 − 5.629×10^14·e^(−x4/3.363×10^15).]

Figure 9: SAU Function for Complexity

Table 10: Lottery and SAU Evaluation for Implementation

Lottery: Best = 7 (expected utility 1, U6 = 1); Certainty = 5 (0.65, U6 = 0.666667); Worst = 1 (0, U6 = 0)
Evaluation: 1. Utility intervals: consequence 7, U6 = 1 | 2. Genetic Algorithm: 2, 0.166667 | 3. OWA operator: 7, 1 | 4. 3 MP models: 7, 1 | 5. BK Method: 7, 1

Figure 10: SAU Function for Implementation

3.2.2.5 Multiattribute Utility (MAU) Function

In order to aggregate the SAUs so that the alternatives may be evaluated, we employ a

typical multiattribute utility function (Raiffa and Keeney, 1976):

[Figure 9 plot: U5(x5) = 1.044 − 0.02·e^(x5/2.524).]
[Figure 10 plot: U6(x6) = 2929630 − 2929630.2·e^(−x6/17577777).]

Equation 4: Multiplicative Multiattribute Utility Function

U(x) = (1/K) · [ ∏_{i=1}^{n} (K·k_i·U_i(x_i) + 1) − 1 ]

where 1 + K = ∏_{i=1}^{n} (1 + K·k_i), U_i(x_i) = SAU for attribute i, k_i = scaling constant for attribute i, and K = scaling factor.

The scaling constants ki, for each attribute i, are determined for use with the multiattribute

utility function. Once the scaling constants are found, K, the scaling factor, can be determined and

the overall utility for the five alternatives may be evaluated. We used the method suggested by

Keeney and Raiffa (1993) and first determined accuracy to be the most important attribute. Then

the indifference point was determined to be 0.6 (= p = k1) for the lottery presented in Figure 11.

Figure 11 depicts the question: for what probability p is the decision maker indifferent between a

certainty and a lottery where there is a p chance of achieving an alternative that has the highest

utility for each attribute and a 1-p chance of achieving an alternative that garners the lowest utility

for each attribute? The certainty is an alternative having the highest utility for accuracy but the

lowest utility for all other attributes.

Figure 11: Determining k1 for Accuracy

[Figure 11 diagram: the lottery versus the certainty described above.]

The k values for the remaining attributes were found relative to k1 through the use of

another question, which is now described. Using the improvement attribute as an example, we

chose a level of the improvement attribute for which we are indifferent between a consequence

yielding the best accuracy level and worst improvement level, and a consequence yielding the

worst accuracy level and the chosen improvement level. The chosen attribute level is not necessarily

reflected in the alternative set, but the worst attribute level must be one that is reflected in the

alternative set. Then the SAU of the chosen improvement level is multiplied by k1 to determine k2.

Figure 12 illustrates this and Figure 13 displays the remaining k values below.

Figure 12: Determining k2 for Improvement. (7, 3) ~ (2, 5); k2·u1(7) = k1·u2(5); k2·1 = 0.6·u2(5); k2 = 0.4801 ≈ 0.480.

Figure 13: Remaining k values: k2 = 0.480, k3 = 0.57, k4 = 0.375, k5 = 0.587, k6 = 0.3.

Next, the k values are used to determine K via Equation 4. We found K to be -0.982. If K = 0, the multiattribute function is equivalent to its additive form, which assumes that all attributes are mutually utility independent. Since K ≠ 0, there is some interaction in attribute preferences and the multiplicative form is most appropriate to use.
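The scaling factor K solves 1 + K = ∏(1 + K·k_i); since Σk_i = 2.912 > 1, there is a unique nonzero root in (−1, 0), recoverable by bisection (a sketch using the k values reported above):

```python
def solve_scaling_factor(k, lo=-0.999999, hi=-1e-9, iters=200):
    """Find the nonzero root K of  1 + K = prod(1 + K * k_i)  in (-1, 0),
    which exists and is unique when sum(k) > 1 (Keeney and Raiffa, 1993)."""
    def f(K):
        prod = 1.0
        for ki in k:
            prod *= 1.0 + K * ki
        return prod - (1.0 + K)
    # f > 0 near -1 and f < 0 near 0 when sum(k) > 1, so bisection applies.
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

k = [0.6, 0.480, 0.57, 0.375, 0.587, 0.3]   # k1..k6 from Figures 11-13
K = solve_scaling_factor(k)
print(round(K, 3))   # -0.982, matching the value reported above
```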

3.2.3 Application of Aggregation Methods to Data

After applying utility theory to determine the most desirable aggregation method, we

applied the methods to the data to explore their use in practice. First, they are applied to the data

of each hospital separately, then to the set of all data combined. Microsoft Excel and Excel Solver

are used for the calculations and solutions to the linear and nonlinear programs. The genetic

algorithm method was not applied because of irreconcilable difficulty in implementing code from

the literature. Since the BK method corresponds to a special case of the OWA operator weight

method, it was only necessary to apply three methods: estimation of utility intervals, OWA

operator weight-determination, and the three mathematical programming models. For ease of

illustration, in determining the nurses' utility intervals, we assume the order relations of their

preference rankings to be a complete weak order. This means that if alternative i is ranked

immediately higher than alternative j, then i is not inferior to j, as opposed to being strictly

preferred to j. Both LPs and the NLP from Wang et al., 2007a are applied. Additionally, we

determined OWA operator weights using the minimax disparity approach, developed by Wang and Parkan (2005), for DM optimism levels 1, 0.9, 0.8, 0.7, 0.6, and 2/3. This approach was chosen only for convenience; it "minimizes the maximum disparity between two adjacent weights under a given [optimism level]" (Wang et al., 2007b, p. 3358).
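While we used Wang and Parkan's linear program to obtain the weights, the model's two ingredients are easy to state in code (a small sketch with our own helper names): the orness constraint fixing the optimism level, and the maximum adjacent disparity being minimized. For α = 0.5 the uniform vector attains disparity 0, so it is trivially optimal:

```python
def orness(w):
    """(1/(m-1)) * sum (m - j) * w_j -- the DM optimism level alpha."""
    m = len(w)
    return sum((m - j) * wj for j, wj in enumerate(w, start=1)) / (m - 1)

def max_disparity(w):
    """Largest absolute difference between adjacent weights -- the quantity
    the minimax disparity model minimizes subject to orness(w) = alpha."""
    return max(abs(w[j] - w[j + 1]) for j in range(len(w) - 1))

uniform = [1 / 8] * 8           # candidate weights for the 8 ranking places
print(abs(orness(uniform) - 0.5) < 1e-12, max_disparity(uniform))  # True 0.0
```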

3.2.4 Expert Judgment on Applied Methods

Finally, we sought evaluation from a participant with expert judgment. The expert judge has over thirty years of experience in advanced assessment and diagnosis in primary care settings, and also trains new and inexperienced nurses on the triage process. Our expert prioritized the

fictional patients as did the other participants. Following this, one method was presented and its


results were explained. Then the expert was asked to comment on the suitability and acceptability

of the method and results, in addition to providing any other thoughts. This was repeated for all

the methods that were applied. The only results shown were that of all data combined, not the

results separated by hospital. Next, with all methods and results in front of her together, the

expert was asked which method best performed the aggregation and why. Then it was discovered

how much of her response was due to the method used or the result obtained and why. Opinions

on the second-best method and least acceptable method were sought to reveal further insight.

Finally, we explored her perceptions on the similarities or differences the results may have had to

her own prioritization. The expert's opinion is valued highly in making a

recommendation of the best preference aggregation method for our data.

CHAPTER FOUR

4 RESULTS AND DISCUSSION

In this chapter, we present and discuss the results of the methods implemented in Chapter

3. The data is analyzed and aggregated rankings are determined. We also indicate which

aggregation method is most recommended. The following three tables show our prioritization

data sets.

Table 11: Prioritization Data Set 1 - Rankings from Susquehanna Health Williamsport Hospital

Nurse 1 Nurse 2 Nurse 3 Nurse 4 Nurse 5 Nurse 6 Nurse 7 Nurse 8 Nurse 9 Nurse 10 Nurse 11 Nurse 12 Nurse 13 Nurse 14

Patient 1 6 3 5 6 2 4 4 3 1 2 5 5 3 3

Patient 2 1 1 2 4 6 2 6 4 3 3 4 4 7 1

Patient 3 7 6 4 2 4 7 2 6 6 6 8 7 2 6

Patient 4 2 2 7 1 1 3 3 2 2 1 2 2 1 2

Patient 5 3 4 6 7 5 5 7 5 4 8 3 6 6 4

Patient 6 8 7 3 3 3 1 1 8 7 4 6 3 4 7

Patient 7 4 5 1 8 7 6 5 1 5 5 1 1 5 5

Patient 8 5 8 8 5 8 8 8 7 8 7 7 8 8 8


Table 12: Prioritization Data Set 2: Rankings from Mount Nittany Medical Center

Nurse 1 Nurse 2 Nurse 3 Nurse 4 Nurse 5 Nurse 6 Nurse 7 Nurse 8 Nurse 9 Nurse 10 Nurse 11 Nurse 12

Patient 1 1 2 8 2 2 3 1 5 3 5 3 1

Patient 2 7 6 2 7 7 7 3 4 5 3 7 2

Patient 3 3 7 3 8 5 5 4 2 4 2 2 5

Patient 4 5 4 1 1 6 6 5 3 6 7 6 3

Patient 5 6 3 6 3 3 2 6 6 7 6 5 6

Patient 6 2 8 5 5 4 4 2 1 2 1 1 4

Patient 7 4 1 4 6 1 1 7 7 1 4 4 7

Patient 8 8 5 7 4 8 8 8 8 8 8 8 8


Table 13: Prioritization Data Set 3: Rankings from Hershey Medical Center

Nurse 1 Nurse 2 Nurse 3 Nurse 4 Nurse 5 Nurse 6 Nurse 7 Nurse 8 Nurse 9 Nurse 10

Patient 1 1 6 1 3 1 2 4 5 3 2

Patient 2 4 2 3 6 4 3 3 1 2 3

Patient 3 5 7 4 1 6 6 7 3 6 4

Patient 4 6 4 5 2 3 4 5 2 1 5

Patient 5 3 5 2 4 2 1 6 7 8 7

Patient 6 7 8 6 5 7 8 1 8 4 1

Patient 7 2 3 7 7 5 5 2 4 5 6

Patient 8 8 1 8 8 8 7 8 6 7 8



4.1 Discrepancy Analysis

For the data presented in Table 14 and the remaining footrule data in Appendix C (Remaining Footrule and Correlation Data), the mean and standard deviation values were calculated. These are presented in Figure 14 along with the respective boxplots. Lower means and

smaller spreads are preferred. We have also computed two-tailed Student's t-test probabilities for each location pair to discern whether the footrule values differed significantly across hospitals. Table 15 presents these probabilities. For the results presented in Figure 14 and Table 15, all pairwise permutations of the ranking differences were taken into account. As can be seen in Table 15, given α = 0.05, none of the three data sets (SHWH, MNMC, HMC) differ significantly from each other; the degree of disagreement among nurses is comparable across hospitals.
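As an illustration of the footrule calculation behind Table 14, the sketch below (our own code; it assumes each nurse's ranking is given as the list of ranking places assigned to Patients 1-8) reproduces the Nurse 1 vs. Nurse 2 entry:

```python
# Sketch of the Spearman footrule distance used in the discrepancy analysis:
# the sum of absolute differences between the ranks two nurses assign to the
# same patients. Illustrated with SHWH Nurses 1 and 2 from Table 11.

def footrule(r1, r2):
    """Sum of |rank difference| over all items; 0 means identical rankings."""
    return sum(abs(a - b) for a, b in zip(r1, r2))

nurse1 = [6, 1, 7, 2, 3, 8, 4, 5]  # SHWH Nurse 1's ranks for Patients 1-8
nurse2 = [3, 1, 6, 2, 4, 7, 5, 8]  # SHWH Nurse 2's ranks for Patients 1-8
print(footrule(nurse1, nurse2))  # → 10, matching Table 14
```

Larger values indicate stronger disagreement; the maximum possible value for two rankings of eight items is 32.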

Table 14: Footrule Data Set 1 - Pairwise Comparisons among all SHWH Nurses

S: Nurse 1 S: Nurse 2 S: Nurse 3 S: Nurse 4 S: Nurse 5 S: Nurse 6 S: Nurse 7 S: Nurse 8 S: Nurse 9 S: Nurse 10 S: Nurse 11 S: Nurse 12 S: Nurse 13 S: Nurse 14

S: Nurse 1 0 10 24 22 26 18 28 14 14 20 12 18 26 10

S: Nurse 2 0 20 24 16 12 20 10 4 12 14 16 16 0

S: Nurse 3 0 22 20 16 18 18 22 20 18 10 20 20

S: Nurse 4 0 14 20 14 24 24 16 24 18 14 24

S: Nurse 5 0 14 12 18 14 12 22 16 8 16

S: Nurse 6 0 12 18 14 14 18 12 18 12

S: Nurse 7 0 22 20 16 24 16 8 20

S: Nurse 8 0 10 14 8 10 18 10

S: Nurse 9 0 10 14 16 16 4

S: Nurse 10 0 18 14 12 12

S: Nurse 11 0 8 22 14

S: Nurse 12 0 16 16

S: Nurse 13 0 16

S: Nurse 14 0

Figure 14: Boxplots, Means, and Standard Deviations of Footrule Calculations

[Boxplot of footrule scores by location pair: S vs. S, M vs. M, H vs. H, S vs. M, S vs. H, M vs. H; y-axis: Footrule Score, x-axis: Location Pair]

Data Sets Mean Standard Deviation

SHWH vs. SHWH 13.8667 7.35510

MNMC vs. MNMC 14.7949 8.84111

HMC vs. HMC 14.5818 8.35927

SHWH vs. MNMC 18.1786 5.08529

SHWH vs. HMC 17.0143 5.39382

MNMC vs. HMC 18.5000 5.41551


Table 15: T-test probabilities

MNMC HMC

SHWH 0.452 0.594

MNMC 0.888

Table 16 provides a sample chart of the correlation pairings for SHWH and MNMC.

Table 17 summarizes the correlation results. If the absolute value of the Spearman coefficient is ≥

0.7, it is considered a high correlation and we conclude that the two rankings are very similar.

Conversely, if the absolute value of the coefficient is ≤ 0.3, it is considered a low correlation and

the two rankings are significantly different.
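The classification thresholds above rest on the Spearman rank correlation coefficient, which for complete rankings without ties reduces to a closed form. The sketch below (our own code) reproduces the first entry of Table 16:

```python
# Sketch of the Spearman rank correlation used to classify ranking pairs
# (|rho| >= 0.7 is high, |rho| <= 0.3 is low). For full rankings without
# ties, rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), where d_i is the rank
# difference for item i. Illustrated with SHWH Nurse 1 vs. MNMC Nurse 1.

def spearman(r1, r2):
    n = len(r1)
    d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

shwh_n1 = [6, 1, 7, 2, 3, 8, 4, 5]  # SHWH Nurse 1 (Table 11)
mnmc_n1 = [1, 7, 3, 5, 6, 2, 4, 8]  # MNMC Nurse 1 (Table 12)
print(round(spearman(shwh_n1, mnmc_n1), 3))  # → -0.667, matching Table 16
```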

Table 16: A Sample Chart of Correlation Pairing - SHWH vs. MNMC

M: Nurse 1 M: Nurse 2 M: Nurse 3 M: Nurse 4 M: Nurse 5 M: Nurse 6 M: Nurse 7 M: Nurse 8 M: Nurse 9 M: Nurse 10 M: Nurse 11 M: Nurse 12

S: Nurse 1 -0.667 0.357 0.476 0.190 -0.262 -0.190 -0.310 -0.333 -0.429 -0.452 -0.762 0.119

S: Nurse 2 -0.048 0.310 0.452 0.238 0.048 0.024 0.357 0.119 -0.024 -0.071 -0.262 0.714

S: Nurse 3 0.333 -0.024 0.286 -0.667 0.452 0.429 0.310 0.238 0.810 0.786 0.429 0.167

S: Nurse 4 0.071 -0.714 0.595 -0.024 -0.595 -0.619 0.310 0.762 -0.167 0.190 0.095 0.357

S: Nurse 5 0.643 -0.095 0.190 0.500 0.190 0.119 0.667 0.690 0.190 0.143 0.476 0.714

S: Nurse 6 0.286 -0.286 0.310 0.167 0.048 0.024 0.690 0.619 0.310 0.429 0.238 0.714

S: Nurse 7 0.786 -0.429 0.333 -0.119 0.238 0.167 0.619 0.881 0.619 0.667 0.810 0.429

S: Nurse 8 0.071 0.786 0.310 0.262 0.429 0.381 -0.095 -0.286 0.262 -0.262 -0.190 0.238

S: Nurse 9 0.238 0.500 0.167 0.476 0.286 0.214 0.452 0.071 0.071 -0.167 -0.071 0.762

S: Nurse 10 0.381 0.048 0.381 0.333 0.000 -0.143 0.595 0.429 0.333 0.095 0.071 0.810

S: Nurse 11 -0.119 0.690 0.310 0.357 0.429 0.476 -0.238 -0.286 0.167 -0.262 -0.238 0.048

S: Nurse 12 0.286 0.286 0.476 0.119 0.429 0.405 0.143 0.214 0.619 0.238 0.190 0.262

S: Nurse 13 0.690 -0.024 0.357 0.238 0.286 0.214 0.429 0.643 0.357 0.214 0.571 0.452

S: Nurse 14 -0.048 0.310 0.452 0.238 0.048 0.024 0.357 0.119 -0.024 -0.071 -0.262 0.714

Hi Corr 15 9%

Med Corr 68 40%

Low Corr 85 51%

Total 168 Comparisons

Table 17: Spearman Rank Correlation Results

High Medium Low
# % # % # %
SHWH vs. SHWH 11 12% 43 47% 37 41%
MNMC vs. MNMC 7 11% 28 42% 31 47%
HMC vs. HMC 5 11% 14 31% 26 58%
SHWH vs. MNMC 15 9% 68 40% 85 51%
SHWH vs. HMC 16 11% 63 45% 61 44%
MNMC vs. HMC 14 12% 48 40% 58 48%


For Susquehanna Health Williamsport Hospital (SHWH), 88% of the ranking pairs show only medium or low correlation with each other. For Mt. Nittany Medical Center (MNMC) and for Hershey Medical Center (HMC), the corresponding figure is 89%. This suggests that at each hospital there is substantial variation in the assignment of priorities to patients. The comparisons across hospitals are consistent with this picture. The Susquehanna and Mt. Nittany rankings are the most dissimilar of the three pairings, with 51% of pairs having low correlations and only 9% having high correlations; overall, comparisons involving MNMC are more dissimilar than those involving HMC or SHWH. Comparisons to the Hershey rankings suggest slightly more similarity with the other two hospitals, but not enough to outweigh the dissimilarity between most of the rankings.

Given the analysis provided above, we conclude that the same vital signs, age, and gender information led the nurses interviewed to significantly different patient prioritizations within the same ED, and across the three EDs studied. While the interview recordings have not yet been transcribed, we do not expect to be able to attribute these discrepancies to the training and formal preparation of the triage nurses. These results alone suggest that preference aggregation is needed and would be beneficial in this area of study.

4.2 Utility Theory-Based Assessment of Methods

Implementation of utility theory for assessing the ranking methods was done as described

in Chapter 3. In order to evaluate the utilities of the five aggregation methods, we combine the

SAUs, K, and k_i values in the multiplicative MAU function. The MAU function used is

U(\mathbf{x}) = \frac{1}{K}\left[\prod_{i=1}^{6}\bigl(K k_i U_i(x_i) + 1\bigr) - 1\right],

where U_i(x_i) is the single attribute utility function for attribute i. Including the consequences of the alternatives in the MAU function directly, we have

U(\mathbf{x}) = \frac{1}{-0.982}\left[\bigl(-0.982(0.6)U_1(x_1)+1\bigr)\bigl(-0.982(0.480)U_2(x_2)+1\bigr)\bigl(-0.982(0.57)U_3(x_3)+1\bigr)\bigl(-0.982(0.375)U_4(x_4)+1\bigr)\bigl(-0.982(0.587)U_5(x_5)+1\bigr)\bigl(-0.982(0.3)U_6(x_6)+1\bigr) - 1\right],

where the single attribute utilities of the five alternatives (in the order listed in Table 18) are

U_1(x_1) = 0.9371, 0.9371, 1, 0.3256, 0.5599
U_2(x_2) = 1, 0.8002, 1, 0.8002, 0.4880
U_3(x_3) = 1, 0.7785, 0.9, 0.9, 0.5257
U_4(x_4) = 0.625, 0, 0.8125, 0.8125, 0.625
U_5(x_5) = 0.8299, 0.9787, 0.8299, 0.9787, 1
U_6(x_6) = 1, 0.1667, 1, 1, 1

In Table 18 the overall scores and final rankings are presented. The recommended

decision is the preference aggregation method using OWA operator weights, with the highest

utility of 0.990. The second-best alternative, with a utility of 0.988, is the method that uses

estimation of utility intervals. The least preferred method is the customary Borda-Kendall

method, with a utility of 0.923.

Table 18: Overall Utility Results

Alternatives 1. Accuracy 2. Improvement 3. Flexibility 4. Time/Effort 5. Complexity 6. Implementation Overall score Ranking

1. Utility intervals 0.937 1 1 0.625 0.83 1 0.988 2

2. Genetic Algorithm 0.937 0.8 0.778 0 0.979 0.167 0.945 4

3. OWA operator 1 1 0.9 0.813 0.83 1 0.990 1

4. 3 MP models 0.326 0.8 0.9 0.813 0.979 1 0.959 3

5. BK Method 0.56 0.488 0.526 0.625 1 1 0.923 5
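The overall scores in Table 18 can be checked with a short sketch of the multiplicative MAU evaluation. The names and data structures below are ours; because the published k_i are rounded, some alternatives' utilities may differ from the table in the third decimal, but the top two scores and the overall ranking are reproduced:

```python
# Sketch of evaluating the multiplicative MAU function for the five
# aggregation methods, using the K, k_i, and single-attribute utilities
# reported above.

K = -0.982
k = [0.6, 0.480, 0.57, 0.375, 0.587, 0.3]

# Single-attribute utilities U_i(x_i) per alternative, in the order:
# utility intervals, genetic algorithm, OWA operator, 3 MP models, BK method.
U = [
    [0.9371, 0.9371, 1, 0.3256, 0.5599],   # 1. Accuracy
    [1, 0.8002, 1, 0.8002, 0.4880],        # 2. Improvement
    [1, 0.7785, 0.9, 0.9, 0.5257],         # 3. Flexibility
    [0.625, 0, 0.8125, 0.8125, 0.625],     # 4. Time/Effort
    [0.8299, 0.9787, 0.8299, 0.9787, 1],   # 5. Complexity
    [1, 0.1667, 1, 1, 1],                  # 6. Implementation
]

def mau(alt):
    """Multiplicative MAU: (1/K) * [ prod_i (K * k_i * U_i + 1) - 1 ]."""
    prod = 1.0
    for ki, Ui in zip(k, U):
        prod *= K * ki * Ui[alt] + 1
    return (prod - 1) / K

names = ["Utility intervals", "Genetic Algorithm", "OWA operator",
         "3 MP models", "BK Method"]
utilities = {name: round(mau(a), 3) for a, name in enumerate(names)}
print(utilities["OWA operator"], utilities["Utility intervals"])  # → 0.99 0.988
```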


4.3 Results of the Rank Aggregation Methods

In this section, four rank aggregation methods are applied to the data and their results are analyzed. First, the method estimating utility intervals is applied. It is followed by the method that determines OWA operator weights, which includes the BK method as a special case. Finally, the method using three mathematical programs is applied.

4.3.1 Results of the Method utilizing the Estimation of Utility Intervals

Table 19 shows the results for the method using the estimation of utility intervals. If a precedes b with degree of preference P(a > b), this is denoted a ≻ b, with P(a > b) attached to the preference relation.

Table 19: Estimation of Utility Intervals Method: Results

Location Aggregated Ranking (degree of preference in parentheses after each ≻)
SHWH Patient 4 ≻ (58.43%) Patient 2 ≻ (50.59%) Patient 7 ≻ (57.42%) Patient 6 ≻ (50.2%) Patient 1 ≻ (59.37%) Patient 3 ≻ (53.64%) Patient 5 ≻ (59.98%) Patient 8
MNMC Patient 1 ≻ (51.13%) Patient 6 ≻ (50.26%) Patient 7 ≻ (59.58%) Patient 4 ≻ (55.68%) Patient 3 ≻ (54.11%) Patient 2 ≻ (51.33%) Patient 5 ≻ (62.33%) Patient 8
HMC Patient 1 ≻ (58.7%) Patient 2 ≻ (52.72%) Patient 4 ≻ (51.52%) Patient 6 ≻ (50.33%) Patient 5 ≻ (54.6%) Patient 3 ≻ (52.89%) Patient 7 ≻ (53.27%) Patient 8
All Data Patient 1 ≻ (50.07%) Patient 4 ≻ (53.17%) Patient 7 ≻ (50.98%) Patient 6 ≻ (51.9%) Patient 2 ≻ (58.5%) Patient 3 ≻ (51.64%) Patient 5 ≻ (61.04%) Patient 8

Considering data from all three hospitals together, the aggregated ranking shows that Patient 1 should be seen first, followed by Patient 4, then Patients 7, 6, 2, 3, 5, and 8, in that order. Patient 1 is preferred to Patient 4 by only 50.07%, which suggests that the two are difficult to distinguish. By the rules of this method, if candidate a has a degree of preference over candidate b that is less than 50%, candidate a should be ranked lower than b. In examining the most highly prioritized patient across the results, Patient 1 is chosen most often. The results for individual hospitals can be interpreted similarly. The procedure of the method requires counting the number of times each ranking order is given. In this case, there are 8 items ranked, which


means there are 8! = 40,320 possible ranking orders. Out of all the 36 ranking orders, only 2

were the same, and those happened to be at the same hospital. Because of this, the utility intervals are weighted fairly equally, so each patient's weighted average utility depends primarily on the ranking place it was assigned most often. This shows that for weak order relations, the analyst

may eliminate a step in preparing the data for use with this method. This would decrease the

complexity of the method and make it easier to use. Additionally, having the accuracy feature,

degrees of preference, was very helpful in enhancing the credibility of the method. When making triage decisions, it is helpful for the DM to see how strong the preferences are between adjacently-ranked patients in the results. For example, in the MNMC results, Patient 7 is prioritized ahead of Patient 4 with a degree of preference of 59.58%, which is fairly strong. Degrees of preference near 50%, such as between Patient 6 and Patient 5 for HMC, imply that the

two are nearly indifferent in terms of whether one should be prioritized before the other.

4.3.2 Results of the Method utilizing OWA operator weights, including the Borda-

Kendall Method

The following tables present the aggregation provided by the OWA operator weight-

determination method. The method was applied for optimism levels (α) of 1, 0.9, 0.8, 0.7, 0.6,

and 2/3, which corresponds to the BK method. If Patient a precedes Patient b, it is written a ≻ b

and if Patient a is preferentially indifferent to Patient b, it is written a ~ b.

Table 20: Weights Determined by OWA Operator Weights

OWA wts

α w1 w2 w3 w4 w5 w6 w7 w8

1 1 0 0 0 0 0 0 0

0.9 0.49 0.33 0.17 0.01 0 0 0 0

0.8 0.32381 0.260952 0.198095 0.135238 0.072381 0.009524 0 0

0.7 0.241667 0.208333 0.175 0.141667 0.108333 0.075 0.041667 0.008333

0.666667 0.222222 0.194444 0.166667 0.138889 0.111111 0.083333 0.055556 0.027778

0.6 0.183333 0.166667 0.15 0.133333 0.116667 0.1 0.083333 0.066667


Table 21: OWA Operator Method and BK Method: Results

For each hospital and for all data combined, the results generally became very stable around α = 0.8 and α = 0.7, even though for α = 0.8, w7 = w8 = 0, meaning the 7th and 8th ranking places did not contribute to the aggregation at all. Although the votes for the 7th and 8th places had no effect on the results for α = 0.8, the outcome is still comparable with those where all ranking places contributed. The rankings for α = 1 are very unreliable, as this setting essentially corresponds to majority rule. Some ties are still produced, as seen in Table 21 for SHWH, which may be due to our use of the minimax disparity approach for determining the weights. This method turned out to be very simple to implement and interpret.

Location Optimism level, α Aggregated Ranking

SHWH 1 4 ~ 7 ≻ 2 ≻ 6 ≻ 1 ≻ 3 ~ 5 ~ 8

0.9 4 ≻ 2 ≻ 7 ≻ 1 ≻ 6 ≻ 3 ≻ 5 ≻ 8

0.8 4 ≻ 2 ≻ 1 ≻7 ≻ 6 ≻ 3 ≻ 5 ≻ 8

0.7 4 ≻ 2 ≻ 1 ≻ 7 ≻ 6 ≻ 3 ~ 5 ≻ 8

0.666666667 4 ≻ 2 ≻ 1 ≻ 7 ≻ 6 ≻ 3 ~ 5 ≻ 8

0.6 4 ≻ 2 ≻ 1 ≻ 7 ≻ 6 ≻ 3 ~ 5 ≻ 8

MNMC

1 7 ≻ 1 ~ 6 ≻ 4 ≻ 2 ~ 3 ~ 5 ~ 8

0.9 1 ≻ 6 ≻ 7 ≻3 ≻ 4 ≻ 2 ≻ 5 ≻ 8

0.8 1 ≻ 6 ≻ 7 ≻3 ≻ 4 ≻ 2 ≻ 5 ≻ 8

0.7 1 ≻ 6 ≻ 7 ≻3 ≻ 4 ≻ 5 ≻ 2 ≻ 8

0.666666667 1 ≻ 6 ≻ 7 ≻3 ≻ 4 ≻ 5 ≻ 2 ≻ 8

0.6 1 ≻ 6 ≻ 7 ≻3 ≻ 4 ≻ 5 ≻ 2 ≻ 8

HMC

1 1 ≻ 6 ≻ 2~ 3 ~ 4 ~ 5 ~ 8 ≻ 7

0.9 1 ≻ 2 ≻ 4 ≻ 5 ≻ 6 ≻ 7 ≻ 3 ≻ 8

0.8 1 ≻ 2 ≻ 4 ≻ 5 ≻ 7 ≻ 3 ≻ 6 ≻ 8

0.7 1 ≻ 2 ≻ 4 ≻ 5 ≻ 7 ≻ 3 ≻ 6 ≻ 8

0.666666667 1 ≻ 2 ≻ 4 ≻ 5 ≻ 7 ≻ 3 ≻ 6 ≻ 8

0.6 1 ≻ 2 ≻ 4 ≻ 5 ≻ 7 ≻ 3 ≻ 6 ≻ 8

All Data

1 7 ≻ 1 ~ 4 ~ 6 ≻ 2 ≻ 3 ~ 5 ~ 8

0.9 1 ≻ 4 ≻ 2 ≻ 6 ≻ 7 ≻ 3 ≻ 5 ≻ 8

0.8 1 ≻ 4 ≻ 2 ≻ 6 ≻ 7 ≻ 3 ≻ 5 ≻ 8

0.7 1 ≻ 4 ≻ 2 ≻ 7 ≻ 6 ≻ 3 ≻ 5 ≻ 8

0.666666667 1 ≻ 4 ≻ 2 ≻ 7 ≻ 6 ≻ 3 ≻ 5 ≻ 8

0.6 1 ≻ 4 ≻ 2 ≻ 7 ≻ 6 ≻ 3 ≻ 5 ≻ 8


The Borda-Kendall method performs very well except for the SHWH data, due to the presence of

a tie in the aggregation.
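As a sketch of how the OWA-based aggregation works at α = 2/3 (the BK special case), the code below reproduces the SHWH row of Table 21 from the Table 11 rankings. The vote-count scoring used here is our reading of the method: a patient's score is the weighted count of votes it received for each ranking place, and higher scores rank earlier.

```python
# OWA-operator aggregation at alpha = 2/3 (Borda-Kendall weights),
# applied to the SHWH rankings in Table 11.

# Rows: Patients 1-8; columns: ranks from Nurses 1-14 (Table 11).
shwh = [
    [6, 3, 5, 6, 2, 4, 4, 3, 1, 2, 5, 5, 3, 3],   # Patient 1
    [1, 1, 2, 4, 6, 2, 6, 4, 3, 3, 4, 4, 7, 1],   # Patient 2
    [7, 6, 4, 2, 4, 7, 2, 6, 6, 6, 8, 7, 2, 6],   # Patient 3
    [2, 2, 7, 1, 1, 3, 3, 2, 2, 1, 2, 2, 1, 2],   # Patient 4
    [3, 4, 6, 7, 5, 5, 7, 5, 4, 8, 3, 6, 6, 4],   # Patient 5
    [8, 7, 3, 3, 3, 1, 1, 8, 7, 4, 6, 3, 4, 7],   # Patient 6
    [4, 5, 1, 8, 7, 6, 5, 1, 5, 5, 1, 1, 5, 5],   # Patient 7
    [5, 8, 8, 5, 8, 8, 8, 7, 8, 7, 7, 8, 8, 8],   # Patient 8
]

n = 8  # number of ranking places
# Borda-Kendall weights (alpha = 2/3): w_j proportional to n + 1 - j,
# matching the 0.666667 row of Table 20.
w = [2 * (n + 1 - j) / (n * (n + 1)) for j in range(1, n + 1)]

# Vote matrix: v[i][j-1] = number of nurses ranking patient i+1 in place j.
v = [[row.count(j) for j in range(1, n + 1)] for row in shwh]
scores = [sum(wj * vij for wj, vij in zip(w, vi)) for vi in v]

# Sort patients by descending score; equal scores are ties (3 ~ 5 here).
order = sorted(range(1, n + 1), key=lambda p: -scores[p - 1])
print(order)  # → [4, 2, 1, 7, 6, 3, 5, 8], with Patients 3 and 5 tied
```

Changing the weight vector to any other row of Table 20 reproduces the corresponding α level of Table 21.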

4.3.3 Results of the Method of Three Mathematical Programs

In Table 22 are the results as determined by the method using three mathematical

programs.

Table 22: Three Mathematical Programs: Results

In applying this method, we found the NLP somewhat tedious because it must be evaluated separately for each patient. Evaluating all three programs appears unnecessary, because the aggregated rankings are exactly the same for any given hospital except Hershey Medical Center; even this anomaly could be attributed to the fact that there were only ten nurse prioritizations at that location. Otherwise, the method is very easy to implement with the simple LPs.

Location MP Aggregated Ranking

SHWH LP-1 4 ≻ 2 ≻ 7 ≻ 1 ≻ 6 ≻ 3 ≻ 5 ≻ 8

LP-2 4 ≻ 2 ≻ 7 ≻ 1 ≻ 6 ≻ 3 ≻ 5 ≻ 8

NLP-1 4 ≻ 2 ≻ 7 ≻ 1 ≻ 6 ≻ 3 ≻ 5 ≻ 8

MNMC

LP-1 1 ≻ 6 ≻ 7 ≻ 4 ≻ 3 ≻ 2 ≻ 5 ≻ 8

LP-2 1 ≻ 6 ≻ 7 ≻ 4 ≻ 3 ≻ 2 ≻ 5 ≻ 8

NLP-1 1 ≻ 6 ≻ 7 ≻ 4 ≻ 3 ≻ 2 ≻ 5 ≻ 8

HMC

LP-1 1 ≻ 2 ≻ 6 ≻ 4 ≻ 5 ≻ 3 ≻ 7 ≻ 8

LP-2 1 ≻ 2 ≻ 4 ≻ 5 ≻ 6 ≻ 3 ≻ 7 ≻ 8

NLP-1 1 ≻ 2 ≻ 4 ≻ 5 ≻ 6 ≻ 3 ≻ 7 ≻ 8

All Data

LP-1 1 ≻ 4 ≻ 7 ≻ 6 ≻ 2 ≻ 3 ≻ 5 ≻ 8

LP-2 1 ≻ 4 ≻ 7 ≻ 6 ≻ 2 ≻ 3 ≻ 5 ≻ 8

NLP-1 1 ≻ 4 ≻ 7 ≻ 6 ≻ 2 ≻ 3 ≻ 5 ≻ 8


Although two of the four methods applied (Borda-Kendall and the method involving the three MPs) provide no indication of accuracy, most of their results exactly match the result from the estimation of utility intervals method, with only one rank reversal among the aggregations that did not match it. In summary, all methods gave similar results when only one hospital is considered. There were few rank reversals across methods for individual hospital results, and even fewer in the results from all data combined. Judging from that, the most variability in patient priority occurred for the 3rd, 4th, and 5th patients to be seen. Thus, nurses are very much in agreement on who should be seen first, as well as on who can wait the longest before being seen. This phenomenon should be explored further in the future. Since the results of the methods are largely similar, considerable emphasis is placed on the expert opinion for an overall recommendation.

4.4 Expert Opinion

In a meeting with the expert, she was asked to perform the prioritization exercise as done by the nurses in the study. The expert's patient prioritization is Patient 2 ≻ Patient 4 ≻ Patient 1 ≻ Patient 5 ≻ Patient 7 ≻ Patient 3 ≻ Patient 8 ≻ Patient 6. Although the expert's own ordering did not appear in any of the aggregated rankings of the combined data, she suggested that the utility interval method gave the most acceptable results and picked this method as the best. She said it is the best at clearly communicating the results and considers the percentage by which one patient is preferred over another a very useful feature. We also note that the footrule distance between her ranking and the ranking determined by this method is 16, which is smaller than the smallest average distance obtained in the exploration of preference discrepancies for comparisons across hospitals. That distance is 17.01, observed in the SHWH vs. HMC comparison (see Figure 14). The difference between the expert's ranking and the results from all methods is not surprising; she attributes it to her expertise being in primary care rather than emergency medicine. Commenting on the disparities seen across the methods' aggregations, the expert believes the few rank reversals are not important. Overall, the expert believes the results show consistency.
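The footrule distance of 16 between the expert and the all-data utility-interval aggregation can be checked directly. In this sketch (our own code) each list gives the ranking place assigned to Patients 1-8:

```python
# Quick check of the footrule distance between the expert's ranking and the
# all-data aggregation from the estimation of utility intervals method.

expert = [3, 1, 6, 2, 4, 8, 5, 7]   # 2 > 4 > 1 > 5 > 7 > 3 > 8 > 6
ui_all = [1, 5, 6, 2, 7, 4, 3, 8]   # 1 > 4 > 7 > 6 > 2 > 3 > 5 > 8
print(sum(abs(a - b) for a, b in zip(expert, ui_all)))  # → 16
```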

The expert nurse based her decision for the best choice on both the method and the results, with the results weighted more heavily. According to the expert, this is because nurses like to see statistics. The method chosen as second-best is the BK method, due to its simplicity. Least favored are the three mathematical programs and the method which determines OWA operator weights, because of their complexity. The expert says that what contributes most to the lack of appeal of these methods is the absence of a clear understanding of why they need to be so complex. She does agree, though, that these methods allow individual nurse variations to be accounted for. Overall, these methods would not be the most advantageous in practice.

If the expert were a triage nurse in the ED and needed a method on the job as a decision-making aid to complement her own knowledge, she would prefer the method using the estimation of utility intervals.

4.5 Preference Aggregation Method Recommendation

As a result of our application of utility theory toward the selection of a suitable

preference aggregation method, along with expert opinion on the quality of the results, we

recommend the method using estimation of utility intervals. In this method, it is easy to see how

the weights are obtained and the expert stresses that nurses performing the aggregation need to

have weights that make sense to them. Additionally, this method's provision of degrees of preference is a very desirable quality that gained it much of the expert's support and interest.


Because of this, the method represents all nurse opinions well. The nurses are also believed to appreciate how clearly the results are communicated via this method.

CHAPTER FIVE

5 SUMMARY AND CONCLUSIONS

This thesis examines the decision-making area of nurse triage and prioritization in the

Emergency Department of a hospital. In this environment, crucial and potentially life-altering

decisions are made by a skillful and caring staff of individuals who sometimes work long hours in

a fast-paced and ever-changing job, made so by those who arrive at their doors. There are a

number of aspects that contribute to the stress in that setting and complicate nurse decision-

making processes. Additionally, prior studies have shown that nurses draw upon their personal

experience, knowledge, and intuition in making judgments on the category a patient should be

triaged in and on the prioritization of those patients after they have been initially triaged. If

enough time has elapsed, initial prioritizations may need to be altered to accommodate the

changes in patient status, if any. Since nurses bring their individual judgments to their decision-

making, this work first shows that when given the same set of fictional patient data, discrepancies

do exist in patient prioritization. Thus, by selecting and recommending an appropriate preference

aggregation method for this situation, this work seeks to benefit the healthcare industry by

offering an instrument that could help increase productivity, whether it is implemented as a part

of a larger decision-support tool, or on its own.

Data was collected at three different hospitals in and around Central Pennsylvania. The analysis revealed that the nurses' decisions differed significantly among those at the same hospital and across hospitals. The smallest average footrule distance within a hospital is 13.9, while the smallest average footrule distance between any pair of hospitals is 17. In addition, the majority of the data show low or medium correlations within and across hospitals, further indicating disagreement.


Since there is some uncertainty in triaging and prioritizing patients, as well as in defining

preferences and attributes for the five aggregation methods assessed, utility theory was used to

elicit DM preferences and evaluate which methods performed the best according to these

preferences. The main objectives to achieve were maximum representation of all nurse opinions

and maximum ease of use. The method with the highest utility of 0.990 was determined to be the

method which used OWA operator weights, and the second-highest was the method employing the estimation of utility intervals, with a utility of 0.988.

Upon applying four of the methods to the ranking data collected, it was discovered that

for each hospital individually, the aggregation results were fairly similar, with the exception of a

few rank reversals, especially for SHWH. The OWA operator weights method with α = 1

performed the worst overall, which is expected because it is essentially an aggregation by

majority rule. The most stable methods were the method of the three mathematical programs and

the OWA operator weights method for α values between 0.8 and 0.6.

An expert opinion was also solicited to help evaluate the results of applying the

aggregation methods to the data and to obtain a suggestion from someone who would best know

what may realistically work well if incorporated in some way into these nurses' decision-making environment. The expert's ranking of the patients was not reflected in any of the aggregation

results of the combined data, but had a fairly small footrule distance of 16 from the method that

estimates utility intervals. The methods which had designations of accuracy were well received

by the expert and her final suggestion was the method that estimates utility intervals. This thesis

recommends the method that estimates utility intervals as the most suitable preference aggregation method, based on the expert's opinion together with the utility theory results.

The BK method is the second best recommendation.

Future work could be to expand this study to include more hospitals, nurses, methods, or

a more extensive or expansive scenario set. New preference aggregation methods could be


developed specifically for this type of decision-making scenario, as well. The final method

recommended here can be implemented by itself and employed under peak or particularly

stressful times if desired, or combined with a decision support tool for more impact. Regardless

of the next steps this project inspires, it should add value to the US healthcare industry by promoting more efficient practices.

REFERENCES

Andersson, A.K., Omberg, M. and Svedlund, M. (2006) ‗Triage in the Emergency Department—

a Qualitative Study of the Factors which Nurses consider when Making Decisions‘, Nursing

in Critical Care, 11(3), 136–145.

Arrow, K. J. (1951) Social Choice and Individual Values. New York: Wiley.

Beg, M. M. S. (2004) 'Parallel Rank Aggregation for the World Wide Web', Proceedings of the 2nd International Conference on Intelligent Sensing and Information Processing.

Beveridge R. (1998) ‗The Canadian Triage and Acuity Scale: A new and critical element in

health care reform‘, Journal of Emergency Medicine, 16(3), 507–511.

Beveridge, R., Ducharme, J., Janes, L., Beaulieu, S. and Walter, S. (1999) ‗Reliability of the

Canadian Emergency Department Triage and Acuity Scale: Interrater Agreement‘, Annals of

Emergency Medicine, 34(2), 155-159.

Black D. (1958) The Theory of Committees and Elections. Cambridge: Cambridge University Press.

Borda JC. (1784) Mémoire sur les élections au scrutin. Histoire de l‘Académie Royale de

Science; Paris: (Translated in the political theory of Condorcet. Sommerlad F., Mclean I.

Social studies. Working paper 1/89, Oxford, 1989).

Bogart, K. P. (1973) ‗Preference Structures I: Distances between Transitive Preference

Relations‘, Journal of Mathematical Sociology, 3, 49-67.

Bogart, K. P. (1975) ‗Preference Structures II: Distances between Asymmetric Relations‘, SIAM

Journal on Applied Mathematics, 29(2), 254-262.


Buesching, D. P., Jablonowski, A., Vesta, E., Dilts, W., Runge, C., Lund, J. and Porter, R. (1985)

‗Inappropriate Emergency Department Visits‘, Annals of Emergency Medicine, 14 (7), 672–

676.

Cioffi, J.(1998) ‗Decision Making by Emergency Nurses in Triage Assessments‘, Accident and

Emergency Nursing, 6(4),184-191.

Cone, K. J., and Murray, R. (2002) ‗Characteristics, Insights, Decision Making, and Preparation

of ED triage nurses‘, Journal of Emergency Nursing, 28(5), 401–406.

Considine, J., Botti, M., and Thomas, S. (2007) ‗Do Knowledge and Experience Have Specific

Roles in Triage Decision-making?‘, Academic Emergency Medicine, 14, 722-726.

Constanze, P., Cnossen, F., and A. Ballast (2005) ‗More than Psychologists‘ Chitchat: The

Importance of Cognitive Modeling for Requirements Engineering in Complex and Dynamic

Environments‘, Proceedings of SREP’05, 163-175.

Cook, W. D., and Kress, M. (1990) ‗A Data Envelopment Model for Aggregating Preference

Rankings‘, Management Science, 36(11), 1302-1310.

Cook, W. D., and Kress, M., and Seiford, L. M. (1996) ‗A General Framework for Distance-

Based Consensus in Ordinal Ranking Models‘, European Journal of Operational Research,

96, 392-397.

Cook, W. D., and Seiford, L. M. (1978) ‗Priority Ranking and Consensus Formation‘,

Management Science, 24(1), 1721-1732.

Fields, E., Claudio, D., Okudan, G.E., Smith, C., and Freivalds, A. (2009) ‗Triage Decision

Making: Discrepancies in assigning the Emergency Severity Index‘, Proceedings of the 2009

Industrial Engineering Research Conference, (in press).


Gilboy, N., Tanabe, P., Travers, D.A., Rosenau, A.M. and Eitel, D.R. (2005) Emergency Severity

Index, Version 4: Implementation Handbook, AHRQ Publication No. 05-0046-2. Rockville,

MD: Agency for Healthcare Research and Quality.

Göransson, K. E., Ehrenberg, A., Marklund, B., and Ehnfors, M. (2006) 'Emergency Department

Triage: Is there a link between nurses‘ personal characteristics and accuracy in triage

decisions?‘, Accident and Emergency Nursing, 14, 83-88.

Grossman, V.G.A. (1999) Quick Reference to Triage. Lippincott. Philadelphia, PA.

Gurney, D. (2004) ‗Exercises in Critical Thinking at Triage: Prioritizing Patients with Similar

Acuities‘, Journal of Emergency Nursing, 30(5), 514-516.

Hazelrigg, G. A. (1996) ‗Systems Engineering: a new Framework for Engineering Design‘,

ASME Dynamic Systems and Control Division, 60, 39-46.

Inada, K. (1964) ‗A Note on the Simple Majority Rule‘, Econometrica, 32(4), 525-531.

Keeney, R. L., and Raiffa, H., (1993) Decisions with Multiple Objectives: Preferences and Value

Tradeoffs, Cambridge University Press, Cambridge, UK.

Kemeny, J. G., and Snell, L. J. (1962) ‗Preference Ranking: an Axiomatic Approach‘, In:

Mathematical Models in the Social Sciences, Ginn, New York, 9-23.

Kendall M. (1962) Rank Correlation Methods, 3rd ed. New York: Hafner.

Lehmann, E. L., and D'Abrera, H. J. M. (1998) Nonparametrics: Statistical Methods based on Ranks. Prentice Hall, Upper Saddle River, NJ.

Liu, X., (2009) ‗Parameterized Defuzzification with continuous weighted Quasi-arithmetic Means

– An extension‘, Information Sciences, 179(8), 1193-1206.

Öztürk, M., and Tsoukiàs, A. (2008) ‗Bipolar Preference Modeling and Aggregation in Decision

Support‘, International Journal of Intelligent Systems, 23, 970-984.


Patel, V.L., Gutnik, L.A., Karlin, D.R. and Pusic, M. (2007) 'Calibrating Urgency: Triage Decision Making in a Pediatric Emergency Department', Advances in Health Sciences Education (Epub ahead of print: http://www.springerlink.com/content/y467460754k15077/).

Peneva, V., and Popchev, I. (2007) ‗Aggregation of Fuzzy Preference Relations to Multicriteria

Decision Making‘, Fuzzy Optimization and Decision Making, 6, 351-365.

Press Release WHO/44. ‗World Health Organization assess the world's health system‘, June

2000. http://www.who.int/whr/2000/media_centre/press_release/en/index.html Retrieved on

January 18, 2008.

Reid, P.P. (2005) Building a Better Delivery System: A New Engineering/Health Care Partnership, National Academies Press, Washington DC.

Raiffa, H. and Keeney, R. L. (1976) Decisions with Multiple Objectives: Preferences and Value Tradeoffs, Wiley and Sons, New York.

Tamiz, M., and Foroughi, A. A. (2007) 'An Enhanced Approach to the Ranked Voting System', World Review of Entrepreneurship, Management, and Sustainable Development, 3(3-4), 365-372.

Tanabe P, Gimbel R, Yarnold PR, Kyriacou, D.N., and Adams, J.G.(2004) ‗Reliability and

Validity of Scores on the Emergency Severity Index Version 3‘, Academic Emergency

Medicine, 11, 59-65.

Travers D.A., Waller, A.E., Bowling J.M., Flowers, D., and Tintinalli, J.(2002) ‗Five-level Triage

System more effective than Three-level in Tertiary Emergency Department‘, Journal of

Emergency Nursing, 28, 395-400.

Wang J. W., Chang, J. R., Cheng, C. H. (2006) ‗Flexible Fuzzy OWA querying method for

Hemodialysis Database‘, Soft Computing, 10 (11), 1031-1042.


Wang, Y. M., Chin, K. S., and Yang, J. B. (2007a) ‗Three new models for Preference Voting and

Aggregation‘, Journal of the Operational Research Society, 58, 1389-1393.

Wang, Y. M., Luo, Y., and Hua, Z. (2007b) ‗Aggregating Preference Rankings using OWA

operator weights‘, Information Sciences, 177, 3356-3363.

Wang, Y. M., and Parkan, C. (2005) ‗A Minimax Disparity Approach for obtaining OWA

operator weights‘, Information Sciences, 175, 20-29.

Wang, Y. M., Yang, J. B., Xu, D. L. (2005) ‗A Preference Aggregation method through the

Estimation of Utility Intervals‘, Computers & Operations Research, 32, 2027-2049.

Yager, R. (1988) ‗On Ordered Weighted Averaging Aggregation Operators in Multi-Criteria

Decision Making‘, IEEE Transactions on Systems, Man, and Cybernetics, 18, 183-190.

Zimmermann, P. G. (2001) ‗The Case for a universal, valid, reliable 5-tier Triage Acuity Scale

for US Emergency Departments‘, Journal of Emergency Nursing, 27(3), 246–254.

Appendix A: Interview Protocol

Overall goal of the research

The overall goal of our team is to develop effective triage decision aids for use in the Emergency Department (ED). As a foundation for this work, we need to understand the current triage process, how the emergency department staff would like to see it improved, the contextual factors that complicate the decision-making, etc. This interview is not intended to evaluate the performance of the nursing staff, but rather to understand how the triage system, in general, can be improved. We feel that with the increasing population and the changing trends in healthcare (increasing frequency of ED visits, etc.), there is a strong need to provide solutions that improve productivity.

Main objectives of the meeting

1. To understand the triage process being implemented at the location the interviewee works,

2. To learn about the current state wide/Federal regulations/guidelines for the triage process,

3. To understand the relative importance levels of the vital signs, as perceived by the interviewee, in determining the Emergency Severity Index (ESI) category of a patient. Which sign(s) does the interviewee feel is (are) most important to monitor?

4. To learn, if ESI is not used, what sorting procedure is followed during triage and how it integrates vital sign readings.

Questions:

A. Background of the interviewee:

1. What is your current position title at this agency? What are your state licensure titles (e.g., LPN, RN) and current national certifications?

2. Could you briefly describe your:

a. educational history,

b. work experience (position title, approximate start/stop dates, location, state), including how long you have been here in the ED, and

c. on-the-job training?

3. What continuing education programs have prepared you for your current position?


4. What led you to become an ED nurse? Is there anything that motivates you to do a good job?

B. Factors making the triage decision-making complex:

5. What are the factors/issues you encounter daily that can complicate decision-making

during triage?

a. Stress level of staff,

b. Number of hours at work,

c. Number of patients in the ED,

d. Dealing with differences in opinions about Emergency Severity Index (ESI) categorization (this also assumes ESI is what they are using).

e. Differences of opinion/judgment about the ESI categorization of a patient?

f. In your opinion, which of the four decision points in the ESI triage process have

the most potential for human error:

i. Is the patient dying?

ii. Is this a patient who shouldn't wait?

iii. How many resources will this patient need?

iv. What are the patient's vital signs?

g. What are other potential points in the process that have a high likelihood of human error (e.g., reading vital sign values incorrectly, placing patients in incorrect categories, etc.)?

C. Triage:

6. Have you worked in the triage role before? Is the triage process the same everywhere?

Please explain.

7. How does one's knowledge of triage transfer from one location to another? How did your past knowledge prepare you for the current role, and what areas of preparation were different in this role than in others?

8. Is the 5-point ESI the current norm?

a. If yes, could you explain the overall process?

b. For ESI 3-5, what is the policy for re-triage? Re-triage interval (30 min)?

c. If no, could you explain the process, and compare it to ESI, if possible?


9. Do you have patient categories that change the interpretation of vital signs? (e.g., do norms differ based on the age of the patient or other factors?)

10. How do you make the decision of who needs to be seen next if the patients next in line are of the same acuity level? How do you feel about this policy? On a scale of 1 to 5, where 1 is poor and 5 is excellent, how would you rate the fairness of this policy?

11. If the policy is first-come, first-served (FCFS), why do you think this is a good (or bad) policy?

12. What is your opinion on the relationship between time and the severity of a condition?

13. Is there a procedure to reassess a patient who has been waiting for some time? Is it followed consistently?

14. How much of your decision is based on the patient's vital signs?

15. Of the part that is not based on the vital signs, what other attributes are considered?

D. Critical incidents:

16. Do you recall any patient care incident where the outcome could have been significantly improved if a comprehensive triage had preceded the incident?

17. If a comprehensive triage system could not have preceded it, what situational characteristic, nurse behavior or characteristic, or hospital policy or characteristic could have prevented the incident?

E. Triage decision-making questions:

18. Of the vital signs you measure, which one(s) do you consider most important to measure and monitor?

19. In descending order of importance, how would you rank the vital signs?

20. Do you believe the vital signs you mentioned are independent? Why or why not?

21. Ask the interviewee to prioritize 8 sets of patient vital data.


Suppose a nurse has taken the temperature, pulse, respiration rate, and blood pressure measurements for eight patients. Given the information in the table below, how would you categorize the patients and rank them in order of priority?

Participant # _______

Patient #  Gender  Age  Temp (°F)  Pulse (beats/min)  Resp. Rate (breaths/min)  Systolic BP (mm Hg)  Diastolic BP (mm Hg)  ESI  Rank
1          M       18   101.8      97                 26                        125                  79
2          M       40   98.0       110                20                        150                  92
3          F       25   99.1       94                 28                        120                  80
4          F       7    98.3       115                18                        145                  90
5          F       33   101.2      75                 23                        130                  85
6          M       24   97.8       92                 29                        120                  81
7          F       3    100.7      80                 25                        128                  83
8          M       55   97.8       80                 18                        125                  96

(The ESI and Rank columns are left blank for the participant to fill in.)

F. Last Question:

22. Is there anything about triage decision-making at this hospital that we have not asked you

about that you feel we should know, as it could be beneficial in helping us reach our

goals?


Appendix B: Assessment of Alternatives

The assessment below follows this structure:

Name of Method

o Objective 1) Maximize representation of all opinions

Accuracy

Improvement

Flexibility

o Objective 2) Maximize ease of use

Time/Effort

Complexity

Implementation

----------------------------------------------------------

1) Estimation of Utility intervals (Wang et al., 2005)

O1)

a) Degrees of preference: in the final ranking, these indicate how strongly the higher-ranked alternative(s) is (are) preferred over the lower-ranked alternative(s).

b)

(1) Proposes that a range of utilities, not just a single value, can represent a rank position

(2) Views ordinal rank NOT as a constraint on utility

(3) Concerning the aggregation of intervals, the method is simpler and more practical because it relaxes assumptions and constraints on the intervals, so it can be more widely used [the methods of Kundu and of Sengupta and Pal do not]

c) Can handle large data sets with ties and/or incomplete rankings from decision makers (DMs). If parameters are chosen, the final answer may be selected based on the desired degree of preference.

O2)

a) It may take some time to determine how to set up the LPs based on the available data; after that, computational time appears minimal.

b) 6 steps:

(1) For each ranking structure (i.e., A > B > C vs. C > A > B), count how many DMs selected it as their decision.

(2) Use a pair of LPs to find utility intervals for each alternative for each ranking structure (parameters may need to be selected here)

(3) For each alternative, find the weighted average utility by taking the weighted sum (normalized weight of the ranking place × utility interval for that ranking place)

(4) Rank the alternatives by their weighted average utility intervals:

(a) Calculate matrix of degrees of preference

(b) Calculate matrix of preference relations

(c) Sum each row of the matrix in (b); the alternative with the highest sum is ranked 1st


c) Requires a basic LP solver. If the data fit one of the 4 special order structures, computational time may be lower because closed-form formulas can be used instead of solving LPs.
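Ranking step (4) above can be sketched in code. The possibility-degree formula used for comparing two utility intervals below is one common definition from the interval-comparison literature; Wang et al. (2005) derive their intervals from LPs and may define the degree of preference differently, so the `rank_by_intervals` helper and its sample intervals are hypothetical illustrations only.

```python
# Sketch of ranking alternatives by weighted average utility intervals.
# The possibility-degree formula is a common convention, not necessarily
# the exact definition in Wang et al. (2005).

def degree_of_preference(a, b):
    """P(a >= b) for intervals a = (aL, aU) and b = (bL, bU)."""
    width = (a[1] - a[0]) + (b[1] - b[0])
    if width == 0:  # both are point intervals
        return 1.0 if a[0] > b[0] else (0.5 if a[0] == b[0] else 0.0)
    return min(max((a[1] - b[0]) / width, 0.0), 1.0)

def rank_by_intervals(utilities):
    """utilities: dict alternative -> (lower, upper) weighted average utility."""
    names = list(utilities)
    # Steps (a)+(b): for each alternative, count how many others it beats
    # with a degree of preference above 0.5 (the preference relation).
    row_sums = {x: sum(1 for y in names if y != x and
                       degree_of_preference(utilities[x], utilities[y]) > 0.5)
                for x in names}
    # Step (c): the alternative with the highest row sum is ranked 1st.
    return sorted(names, key=row_sums.get, reverse=True)

# Hypothetical weighted average utility intervals for three alternatives:
ranking = rank_by_intervals({"A": (0.45, 0.60), "B": (0.30, 0.50), "C": (0.10, 0.35)})
```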

2) Rank Aggregation for WWW (Beg, 2004)

O1)

a) Minimizes the normalized footrule distance among all alternatives

b)

(1) The Borda method is positional and does not always satisfy the Condorcet Criterion

(2) Optimizes a distance criterion

c) Can handle partial lists of data; the number of GA generations can be specified based on how long one is willing to wait for a solution

O2)

a) Considerable effort is needed to get the data into the GA and to set up parallel processing. Computational time is long; however, the longer the run time, the better the results.

b)

(1) Get partial lists in response to a query (from different search engines)

(2) Represent chromosomes in decimal

(3) Input the data into the model and run the GA to get the aggregated list, i.e., the list that minimizes the normalized footrule distance between the aggregated list and each partial list (may need to be run in parallel)

c) Not made clear in paper, but seems rather challenging
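The distance criterion named above can be made concrete. This is a minimal sketch of the Spearman footrule distance for full rankings, normalized by its maximum value for lists of length n (which is ⌊n²/2⌋); Beg (2004) applies it to partial lists, which requires extra handling not shown here. The even-integer entries in Tables 23-27 of Appendix C are consistent with (unnormalized) footrule distances over the eight-patient rankings.

```python
# Spearman footrule distance between two full rankings, and its
# normalized form (maximum footrule distance for n items is n*n // 2).

def footrule(rank_a, rank_b):
    """Sum over items of |position in rank_a - position in rank_b|."""
    pos_b = {item: i for i, item in enumerate(rank_b)}
    return sum(abs(i - pos_b[item]) for i, item in enumerate(rank_a))

def normalized_footrule(rank_a, rank_b):
    n = len(rank_a)
    return footrule(rank_a, rank_b) / ((n * n) // 2)

# A full reversal attains the maximum distance:
d = footrule(list("ABCD"), list("DCBA"))  # |0-3| + |1-2| + |2-1| + |3-0| = 8
```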

3) OWA operator weights (Wang et al., 2007b)

O1)

a) The optimism level allows DMs to see a variety of final rankings and to choose the one that emphasizes the ranking places as they like. Higher levels emphasize 1st place (or the higher-ranked places), while lower levels disperse the weights more evenly among all places. Results are more representative of all (or a majority of) DMs when the optimism level is high.

b)

(1) Produces objective weights [BK does not]

(2) Produces a stable solution (the winner and the full ranking of all other alternatives) [not necessarily true of DEA methods, due to the choice of discriminating factor]

(3) Needs only 1 parameter (the DM optimism level), which is easy to obtain

(4) Properties of OWA weights make them good for use in preference aggregation

c) Can work when only a few ranking places are considered

O2)

a) Not much. Perhaps a little computational processing time if an NLP is used instead of the LP to solve for the weights

b) 3+ steps (depending on how many α values are examined):

(1) For each alternative, calculate how many DMs ranked it in each ranking place. Let the optimism level α ∈ (0.5, 1].

(2) Determine OWA weights of the ranking places, under different α values, using 1

of the 4 suggested models


(3) Score each candidate for each α value by taking the weighted sum. The highest sum is the winner.

c) Excel and Excel solver can be used, or LINGO
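Once OWA weights are in hand, the scoring steps above reduce to a weighted sum. Wang et al. (2007b) obtain the weights from optimization models; as a stand-in, this sketch generates weights with Yager's quantifier-guided method, Q(r) = r^((1-α)/α), whose orness is approximately the optimism level α. This is a substitute for illustration, not one of the paper's four models, and the vote profile is hypothetical.

```python
# OWA scoring sketch. Weights come from the RIM quantifier Q(r) = r^a
# with a = (1 - alpha) / alpha, so alpha = 1 puts all weight on 1st
# place and alpha = 0.5 gives uniform weights.

def owa_weights(n_places, alpha):
    a = (1.0 - alpha) / alpha
    q = [0.0] + [(i / n_places) ** a for i in range(1, n_places + 1)]
    return [q[i + 1] - q[i] for i in range(n_places)]

def owa_scores(votes, alpha):
    """votes: dict candidate -> [# DMs ranking it 1st, 2nd, ...]."""
    w = owa_weights(len(next(iter(votes.values()))), alpha)
    # Weighted sum of vote counts over the ranking places.
    return {c: sum(wi * vi for wi, vi in zip(w, v)) for c, v in votes.items()}

# Hypothetical vote profile of 7 DMs over 3 ranking places:
scores = owa_scores({"A": [4, 2, 1], "B": [3, 4, 0], "C": [0, 1, 6]}, alpha=0.8)
winner = max(scores, key=scores.get)
```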

4) Three math programming models (Wang et al., 2007a)

O1)

a) Not really. For some examples, all three models produce the same normalized relative weights, so they give the same results. They may also produce the same result as the BK method.

b)

(1) Produces objective weights [BK does not]

(2) Produces a stable solution (the winner and the full ranking of all other alternatives) [not necessarily true of DEA methods, due to the choice of discriminating factor]

(3) No parameters are needed, avoiding the difficulty of choosing a discriminating factor [same as (2)]

(4) Do not need to know the total number of voters [same as (2)]

(5) Strong overall capabilities in choosing winner and other rankings

c) The data may cover only the first few ranking places, and a solution can still be generated

O2)

a) Not specified; the computational time seems quick/reasonable

b) 3 steps:

(1) For each alternative, calculate how many DMs ranked it for each ranking place

(2) Input data in model and solve for weights

(3) Take the weighted sum to find the overall ranking. The highest sum is the winner.

c) May be implemented easily in Excel and solved with Excel Solver. LINGO can also be used.

5) Borda Kendall (BK) Method (Borda, 1784; Black, 1958; Kendall, 1962)

O1)

a) No

b)

(1) Widely used

(2) Computational simplicity

(3) Produces satisfactory results

c) All DMs must be decisive and preferences must be fully specified (i.e., no one can be indifferent to a rank level or alternative; no ties are allowed)

O2)

a) Could be tedious and time-consuming if there are many alternatives, and likewise if there are many DMs. Organizing the data and doing the calculations in Excel reduces the time (though it may still take some time, depending on how the data are set up in Excel).

b) 2 steps:

(1) For each alternative, calculate how many DMs ranked it for each ranking place

(2) The weights are already determined, so take the weighted sum to find the overall ranking. The highest sum is the winner.

c) Excel or pen, paper & calculator
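The two BK steps above amount to a classical Borda count: ranking place j (1-indexed) carries the fixed weight (number of places - j), so no parameters or solvers are needed. A minimal sketch, with a hypothetical vote profile:

```python
# Borda-Kendall aggregation: fixed positional weights n-1, n-2, ..., 0
# applied to the per-place vote counts of each candidate.

def borda_kendall(votes):
    """votes: dict candidate -> [# DMs ranking it 1st, 2nd, ...]."""
    n = len(next(iter(votes.values())))
    weights = [n - j for j in range(1, n + 1)]  # n-1, n-2, ..., 0
    scores = {c: sum(w * v for w, v in zip(weights, v_list))
              for c, v_list in votes.items()}
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores

# Hypothetical profile of 7 DMs over 3 ranking places:
ranking, scores = borda_kendall({"A": [4, 2, 1], "B": [2, 4, 1], "C": [1, 1, 5]})
# A: 4*2 + 2*1 = 10;  B: 2*2 + 4*1 = 8;  C: 1*2 + 1*1 = 3
```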


Appendix C: Remaining Footrule and Correlation Data

Table 23: Footrule Data Set 2 - Pairwise Comparisons among all MNMC Nurses
(columns M1-M12 denote M: Nurse 1-12; the matrix is symmetric, so only the upper triangle is shown)

              M1  M2  M3  M4  M5  M6  M7  M8  M9  M10  M11  M12
M: Nurse 1     0  22  20  22  12  14   8  14  10   12    6   14
M: Nurse 2         0  28  14  12  14  26  30  20   30   24   24
M: Nurse 3             0  24  26  26  20  16  22   16   22   16
M: Nurse 4                 0  18  20  24  26  28   32   24   20
M: Nurse 5                     0   2  18  24  10   20   12   18
M: Nurse 6                         0  20  24  10   20   12   20
M: Nurse 7                             0  10  12   12   14    6
M: Nurse 8                                 0  16    8   12   12
M: Nurse 9                                     0   12   10   18
M: Nurse 10                                         0    8   18
M: Nurse 11                                              0   20
M: Nurse 12                                                   0

Table 24: Footrule Data Set 3 - Pairwise Comparisons among all HMC Nurses
(columns H1-H10 denote H: Nurse 1-10; symmetric, upper triangle shown)

              H1  H2  H3  H4  H5  H6  H7  H8  H9  H10
H: Nurse 1     0  22  10  20   8  12  16  22  22   18
H: Nurse 2         0  26  30  22  18  20  16  22   28
H: Nurse 3             0  14   8  10  20  22  20   12
H: Nurse 4                 0  16  20  24  20  18   18
H: Nurse 5                     0   6  20  20  16   18
H: Nurse 6                         0  20  18  16   18
H: Nurse 7                             0  22  16   10
H: Nurse 8                                 0  14   20
H: Nurse 9                                     0   14
H: Nurse 10                                         0


Table 25: Footrule Data Set 4 - Pairwise Comparisons between SHWH and MNMC Nurses
(columns M1-M12 denote M: Nurse 1-12)

              M1  M2  M3  M4  M5  M6  M7  M8  M9  M10  M11  M12
S: Nurse 1    30  14  16  18  26  26  28  26  30   26   30   22
S: Nurse 2    22  18  16  18  20  20  18  20  22   22   22   12
S: Nurse 3    16  24  16  32  14  14  14  16   8    8   16   16
S: Nurse 4    22  30  14  22  30  30  18  10  22   18   22   18
S: Nurse 5    12  22  18  14  16  18  10  12  16   20   14   10
S: Nurse 6    18  24  18  20  22  22  12  10  18   14   16   10
S: Nurse 7    10  26  18  24  20  20  14   6  12   10    8   16
S: Nurse 8    22  10  18  20  16  16  22  22  16   24   22   18
S: Nurse 9    18  16  18  16  18  20  14  20  22   22   22   10
S: Nurse 10   18  22  14  16  20  22  14  18  18   20   22   10
S: Nurse 11   26  12  18  16  16  16  26  22  20   24   26   22
S: Nurse 12   18  18  16  22  16  16  18  14  12   16   20   16
S: Nurse 13   10  24  14  16  16  16  16  12  16   16   10   14
S: Nurse 14   22  18  16  18  20  20  18  20  22   22   22   12

Table 26: Footrule Data Set 5 - Pairwise Comparisons between SHWH and HMC Nurses
(columns H1-H10 denote H: Nurse 1-10)

              H1  H2  H3  H4  H5  H6  H7  H8  H9  H10
S: Nurse 1    20  10  22  24  16  14  22  10  18   28
S: Nurse 2    14  18  14  14   8  10  18  12  10   18
S: Nurse 3    16  22  20  24  24  24  10  18  18   14
S: Nurse 4    30  26  24  16  26  28  24  16  16   18
S: Nurse 5    20  30  14   8  16  20  20  22  14   12
S: Nurse 6    22  20  18  18  16  18   8  20  12   10
S: Nurse 7    24  30  22  12  20  24  14  18  16   10
S: Nurse 8    12  16  20  18  12  12  16  14  14   22
S: Nurse 9    10  20  10  14   4   8  18  16  12   16
S: Nurse 10   20  24  18  18  14  14  16  16   2   12
S: Nurse 11   14  18  20  20  14  16  16  18  18   26
S: Nurse 12   18  20  24  20  18  22   8  18  14   18
S: Nurse 13   22  30  20   8  18  22  20  18  12   16
S: Nurse 14   14  18  14  14   8  10  18  12  10   18


Table 27: Footrule Data Set 6 - Pairwise Comparisons between MNMC and HMC Nurses
(columns H1-H10 denote H: Nurse 1-10)

              H1  H2  H3  H4  H5  H6  H7  H8  H9  H10
M: Nurse 1    16  30  16  16  18  22  14  22  20   10
M: Nurse 2    12  16  20  22  14  12  20  22  24   26
M: Nurse 3    24  20  22  18  22  22  20  10  12   20
M: Nurse 4    22  24  20  16  16  18  26  26  18   24
M: Nurse 5     8  28  16  18  16  18  16  28  22   18
M: Nurse 6    10  28  16  18  16  18  16  28  22   20
M: Nurse 7    16  28   8  16  16  18  12  22  16    4
M: Nurse 8    24  28  18  12  20  24  14  18  18   10
M: Nurse 9    14  28  20  20  22  24  10  22  18   10
M: Nurse 10   20  26  20  20  24  24  10  18  20   10
M: Nurse 11   18  30  20  14  22  24  14  24  22   12
M: Nurse 12   16  24  10  14  12  16  16  18  10   10

Table 28: Correlation Data Set 2 - Pairwise Comparisons among all SHWH Nurses
(columns S2-S14 denote S: Nurse 2-14; symmetric, upper triangle shown)

                  S2      S3      S4      S5      S6      S7      S8      S9     S10     S11     S12     S13     S14
S: Nurse 1     0.738  -0.048  -0.095  -0.167   0.143  -0.548   0.571   0.500   0.190   0.667   0.262  -0.262   0.738
S: Nurse 2             0.167   0.071   0.381   0.500  -0.048   0.667   0.905   0.619   0.571   0.405   0.214   1.000
S: Nurse 3                    -0.238  -0.167   0.333   0.310   0.238   0.024   0.143   0.238   0.548  -0.024   0.167
S: Nurse 4                             0.548   0.310   0.595  -0.286  -0.024   0.429  -0.381  -0.024   0.548   0.071
S: Nurse 5                                     0.548   0.714   0.143   0.571   0.667   0.048   0.286   0.857   0.381
S: Nurse 6                                             0.452   0.024   0.405   0.643   0.262   0.571   0.214   0.500
S: Nurse 7                                                    -0.095   0.048   0.476  -0.190   0.381   0.810  -0.048
S: Nurse 8                                                             0.714   0.476   0.810   0.619   0.286   0.667
S: Nurse 9                                                                     0.667   0.524   0.357   0.405   0.905
S: Nurse 10                                                                            0.286   0.595   0.548   0.619
S: Nurse 11                                                                                    0.762   0.048   0.571
S: Nurse 12                                                                                            0.333   0.405
S: Nurse 13                                                                                                    0.214

High correlation:   11 (12%)
Medium correlation: 43 (47%)
Low correlation:    37 (41%)
Total: 91 comparisons

Table 29: Correlation Data Set 3 - Pairwise Comparisons among all MNMC Nurses
(columns M2-M12 denote M: Nurse 2-12; symmetric, upper triangle shown)

                  M2      M3      M4      M5      M6      M7      M8      M9     M10     M11     M12
M: Nurse 1     0.024  -0.190   0.048   0.667   0.548   0.690   0.524   0.762   0.548   0.905   0.452
M: Nurse 2            -0.286   0.452   0.595   0.571  -0.310  -0.690   0.095  -0.524  -0.214  -0.095
M: Nurse 3                    -0.286  -0.310  -0.262  -0.024   0.476   0.048   0.238  -0.119   0.190
M: Nurse 4                             0.095   0.071   0.000  -0.190  -0.333  -0.690  -0.214   0.214
M: Nurse 5                                     0.976   0.190  -0.071   0.690   0.262   0.619   0.048
M: Nurse 6                                             0.071  -0.095   0.595   0.238   0.571  -0.071
M: Nurse 7                                                     0.690   0.452   0.595   0.571   0.881
M: Nurse 8                                                             0.333   0.690   0.619   0.548
M: Nurse 9                                                                     0.714   0.738   0.238
M: Nurse 10                                                                            0.738   0.286
M: Nurse 11                                                                                    0.214

High correlation:   7 (11%)
Medium correlation: 28 (42%)
Low correlation:    31 (47%)
Total: 66 comparisons


Table 30: Correlation Data Set 4 - Pairwise Comparisons among all HMC Nurses
(columns H2-H10 denote H: Nurse 2-10; symmetric, upper triangle shown)

                  H2      H3      H4      H5      H6      H7      H8      H9     H10
H: Nurse 1    -0.095   0.643   0.167   0.762   0.738   0.286   0.167   0.071   0.143
H: Nurse 2            -0.357  -0.667  -0.119   0.119  -0.262   0.381   0.000  -0.643
H: Nurse 3                     0.571   0.833   0.810   0.000   0.143   0.167   0.381
H: Nurse 4                             0.452   0.262  -0.190   0.238   0.238   0.357
H: Nurse 5                                     0.929   0.095   0.214   0.310   0.143
H: Nurse 6                                            -0.071   0.238   0.095  -0.095
H: Nurse 7                                                    -0.048   0.500   0.643
H: Nurse 8                                                             0.595   0.048
H: Nurse 9                                                                     0.595

High correlation:   5 (11%)
Medium correlation: 14 (31%)
Low correlation:    26 (58%)
Total: 45 comparisons

Table 31: Correlation Data Set 5 - Pairwise Comparisons between SHWH and HMC Nurses
(columns H1-H10 denote H: Nurse 1-10)

                  H1      H2      H3      H4      H5      H6      H7      H8      H9     H10
S: Nurse 1     0.190   0.690   0.167  -0.167   0.429   0.595  -0.048   0.595   0.310  -0.381
S: Nurse 2     0.524   0.167   0.643   0.310   0.786   0.762   0.238   0.667   0.667   0.238
S: Nurse 3     0.429  -0.167   0.024  -0.167  -0.048  -0.071   0.762   0.214   0.214   0.476
S: Nurse 4    -0.619  -0.286  -0.071   0.548  -0.238  -0.381  -0.143   0.357   0.476   0.357
S: Nurse 5     0.095  -0.667   0.476   0.810   0.500   0.190   0.190   0.071   0.571   0.595
S: Nurse 6     0.024  -0.333   0.310   0.143   0.286   0.119   0.738   0.048   0.690   0.738
S: Nurse 7    -0.095  -0.810   0.048   0.619  -0.071  -0.381   0.429   0.024   0.429   0.738
S: Nurse 8     0.667   0.310   0.190   0.095   0.619   0.548   0.238   0.571   0.452  -0.119
S: Nurse 9     0.667  -0.024   0.738   0.452   0.929   0.810   0.190   0.476   0.619   0.286
S: Nurse 10    0.143  -0.095   0.214   0.310   0.381   0.119   0.476   0.500   0.976   0.619
S: Nurse 11    0.476   0.310   0.048  -0.143   0.524   0.500   0.429   0.238   0.310  -0.214
S: Nurse 12    0.262  -0.048  -0.143  -0.048   0.214   0.024   0.810   0.238   0.619   0.310
S: Nurse 13    0.119  -0.643   0.238   0.857   0.310   0.000   0.095   0.238   0.452   0.429
S: Nurse 14    0.524   0.167   0.643   0.310   0.786   0.762   0.238   0.667   0.667   0.238

High correlation:   16 (11%)
Medium correlation: 63 (45%)
Low correlation:    61 (44%)
Total: 140 comparisons


Table 32: Correlation Data Set 6 - Pairwise Comparisons between MNMC and HMC Nurses
(columns H1-H10 denote H: Nurse 1-10)

                  H1      H2      H3      H4      H5      H6      H7      H8      H9     H10
M: Nurse 1     0.381  -0.833   0.310   0.524   0.238  -0.071   0.452  -0.214   0.238   0.714
M: Nurse 2     0.714   0.333   0.167  -0.143   0.595   0.595   0.024   0.048  -0.048  -0.405
M: Nurse 3    -0.190   0.095  -0.119   0.286  -0.048  -0.071   0.167   0.738   0.524   0.095
M: Nurse 4     0.048   0.024   0.190   0.167   0.524   0.381  -0.119  -0.238   0.214  -0.143
M: Nurse 5     0.762  -0.452   0.286   0.143   0.452   0.310   0.476  -0.310  -0.119   0.190
M: Nurse 6     0.714  -0.429   0.262   0.119   0.429   0.333   0.429  -0.357  -0.238   0.071
M: Nurse 7     0.262  -0.643   0.619   0.476   0.357   0.143   0.476   0.048   0.548   0.952
M: Nurse 8    -0.238  -0.738   0.214   0.643  -0.048  -0.262   0.333   0.119   0.452   0.762
M: Nurse 9     0.429  -0.476  -0.024   0.048   0.000  -0.214   0.786  -0.024   0.286   0.619
M: Nurse 10    0.095  -0.619   0.119   0.190  -0.214  -0.310   0.595   0.000   0.143   0.738
M: Nurse 11    0.214  -0.929   0.190   0.476   0.000  -0.238   0.429  -0.357  -0.024   0.643
M: Nurse 12    0.333  -0.333   0.690   0.500   0.595   0.405   0.381   0.381   0.786   0.786

High correlation:   14 (12%)
Medium correlation: 48 (40%)
Low correlation:    58 (48%)
Total: 120 comparisons
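For reference, the correlation values in Tables 28-32 are consistent with Spearman's rank correlation over the eight-patient rankings (for n = 8, ρ = 1 − 6Σd²/504, so entries fall on multiples of 1/84); this is an inference from the granularity of the tabulated values, not something the tables state. A minimal sketch, assuming full rankings with no ties:

```python
# Spearman's rank correlation between two full rankings of the same items,
# using rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)) (valid when there are no ties).

def spearman_rho(rank_a, rank_b):
    pos_b = {item: i for i, item in enumerate(rank_b)}
    n = len(rank_a)
    d2 = sum((i - pos_b[item]) ** 2 for i, item in enumerate(rank_a))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Swapping two adjacent pairs in an 8-item ranking:
rho = spearman_rho([1, 2, 3, 4, 5, 6, 7, 8], [2, 1, 3, 4, 5, 6, 8, 7])
# sum(d^2) = 1 + 1 + 0 + 0 + 0 + 0 + 1 + 1 = 4, so rho = 1 - 24/504
```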