Data Envelopment Analysis models for a mixture ofnon-ratio and ratio variables
written by
Sanaz Sigaroudi
A report submitted in conformity with the requirements for the degree of Doctor of Philosophy
Graduate Department of Mechanical and Industrial Engineering
University of Toronto
Copyright © 2016 by Sanaz Sigaroudi
Abstract
Data Envelopment Analysis models for a mixture of non-ratio and ratio variables
Sanaz Sigaroudi
Doctor of Philosophy
Graduate Department of Mechanical and Industrial Engineering
University of Toronto
2016
Performance comparison is a delicate business, even among organizations of the same
kind. The simplest measure is usually the ratio of a single output to a single input. The
problem lies in the fact that one aspect of the business can hardly represent the whole
picture and the landscape the business operates in. Businesses have complex structures
and offer a variety of products, so it is only fair to take all of them into consideration
when judging their performance against others in an industry. Data Envelopment
Analysis (DEA) is a method suited to settings where multiple inputs and outputs must
be considered. It is a non-parametric method conceptualized by Farrell in 1957. However,
it was not until 20 years later that Charnes, Cooper and Rhodes brought this concept
into practice by finding a way to realize the idea and make it work. The breakthrough
came from the fact that, under certain assumptions, Farrell's idea could be formulated
as a linear mathematical program (LP) that could be solved using the simplex and
similar methods. One limitation of the existing DEA models is their inability to work
with ratio variables, because a linear combination of DMUs does not generally translate
into a linear combination of inputs and outputs in ratio form. In this work, our
contribution to the field includes extending Farrell's idea to include ratio inputs and
outputs and operationalizing four models under the variable returns to scale assumption.
Three non-oriented models are formulated and linearized, and one non-linear model is
solved using a heuristic.
Acknowledgements
Thanks to God for the greatest gift of being, and for the opportunities and the wonderful
people I have come across in life and work. Among those people, first and foremost, I
would like to express my sincere gratitude to my supervisor, Professor Joseph C. Paradi,
for supporting me with his knowledge, patience, and sincerity. This PhD journey has had
many dimensions beyond the academic one, and it has been a life experience that made
it worthwhile. He was kind enough to let me follow my love and life outside the country,
and to enjoy my time as a new mom. Professor Paradi's devotion to the wellbeing of his
students is exceptional and exemplary.
I am grateful to my committee members, Professor Y. Lawryshyn and Professor R. Kwon,
for providing constructive comments and insightful feedback, as well as to Professor
C. Lee and, in particular, Professor E. Thanassoulis for serving on my defense committee.
Professor Thanassoulis's comments and feedback have greatly helped us improve this
work.
I would also like to thank the staff at the MIE graduate office, especially Brenda Fung,
for helping me through the administrative parts of the process. I would also like
to acknowledge the Department of Mechanical and Industrial Engineering, the Rotman
Business School, the Graduate Management Consulting Association and the Environmental
Management Committee, which gave me the opportunity to get involved and enhance my
personal and professional development beyond the standard education. I would also like to
thank my friends at the Centre for Management of Technology and Entrepreneurship,
present and past, for their friendship and help.
My life in Canada has been a rich experience thanks to the friends I have made, friends
who feel as though they have been with me throughout my life. I especially have to thank
my relatives Solmaz, Makhmal and Anoosh, who offered me their home, care and love
during my frequent visits to Toronto.
I would also like to thank Professor R. Thorpe, who trusted me and gave me a place and
the opportunity to work on my research at Leeds University Business School, where
my life took me. I also thank Professor K. Pandza for giving me the flexibility to
finish my PhD while working.
I wish to extend my personal thanks to my family: to my parents for their unconditional
love, and to my brother and sister-in-law for being there for me. I am also blessed with
loving and caring parents-in-law who keep me in their good prayers. My sisters-in-law
and their families have always been there to cheer for me and support me. A personal
tribute, too, to my best friend's mom, who is no longer with us but always believed
in me and encouraged me to get my PhD; I hope she is watching this from the heavens.
Most importantly, thanks to my husband, Mohsen, for all he has been for me, for all
he means to me, and for all he will be; for his understanding, love and encouragement
along this long process. Thanks to my beautiful daughter, Nika, and my soon-to-arrive
son, Iliya, for bringing hope, happiness and joy to my life. Thanks for your patience
and cooperation in enduring my absence and long working days and nights. I am truly
blessed.
Contents

1 Introduction
  1.1 Background
  1.2 Problem Statement
  1.3 Thesis Objectives and Contributions

2 Data Envelopment Analysis: Theory, Assumptions and Realization Techniques
  2.1 Introduction
  2.2 DEA Basic assumptions
    2.2.1 Production Possibility Set
    2.2.2 Frontier
    2.2.3 Efficiency Definition
    2.2.4 Orientation
  2.3 DEA basic models
    2.3.1 CCR Model
    2.3.2 BCC Model
    2.3.3 Additive model
  2.4 Deterministic Frontier Estimators
    2.4.1 Free Disposal Hull Frontier
    2.4.2 Variable Returns to Scale Frontiers
    2.4.3 Constant Returns to Scale Frontier
  2.5 Probabilistic Frontier Estimators
    2.5.1 Partial m Frontier
    2.5.2 Quantile Frontier
    2.5.3 Practical Frontiers
  2.6 Linkage between Data Envelopment Analysis and Ratio Analysis
    2.6.1 Comparing DEA and RA
    2.6.2 Combining DEA and RA

3 Literature review of non-oriented models
  3.1 Russell Graph Efficiency Model
  3.2 Refined Russell Graph Efficiency Model
  3.3 Multiplicative Model (log measure)
  3.4 Invariant Multiplicative Model
  3.5 Pareto efficiency test model (Additive)
  3.6 Extended Additive model
  3.7 Constant Weighted Additive Model
  3.8 Normalized Weighted Additive Model
  3.9 Global Efficiency Measure (GEM)
  3.10 Enhanced Russell Graph Efficiency Measure (enhanced GEM)
  3.11 Range Adjusted Model (RAM)
  3.12 BAM: a bounded adjusted measure
  3.13 Slack-based Measure
  3.14 Directional slack-based measure and distance function
  3.15 Graph Hyperbolic measure of efficiency
  3.16 Benefit function
  3.17 Range Directional Model and Inverse Range Directional Model
  3.18 Modified Slack-based Measure
  3.19 Directional distance functions and slack-based measures of efficiency
  3.20 Universal model for ranking
  3.21 Remarks

4 Literature review of approximation models
  4.1 Bootstrapping and DEA
  4.2 Sampling techniques
  4.3 Summary

5 Methodology: Proposed Non-oriented Model
  5.1 Required adjustments to the basics of DEA
    5.1.1 Defining PPS
    5.1.2 Disposability
    5.1.3 Convexity
    5.1.4 Identifying the efficient units
    5.1.5 Calculating the relative efficiency score
  5.2 Building the right measure of Efficiency
    5.2.1 Proposed non-oriented model
    5.2.2 Model in the making
    5.2.3 Making sense of the inefficiency score

6 Methodology: Approximating the Frontier in the BCC Model
  6.1 Partial Improvement: Approximation methods
    6.1.1 How to generate PPS progressively
    6.1.2 Challenges
    6.1.3 Pseudo Monte Carlo method
    6.1.4 Keep or discard, an LP feasibility problem
    6.1.5 Convergence
  6.2 Remarks

7 Realization, Case Study and Results
  7.1 Realization of the non-oriented model using MATLAB
  7.2 Case Study
    7.2.1 Bank branch data: choice of model, inputs and outputs
    7.2.2 Comparing the proposed model against the traditional additive model
  7.3 Case study, nonlinear BCC model: approximation method

8 Recommendations and future work
  8.1 Contributions
  8.2 Discussion of the Results: Proposed model
    8.2.1 Efficiency Scores
    8.2.2 Direction of improvement
  8.3 Discussion of the results: approximation method
  8.4 Recommendations, limitations and future directions

References
List of Tables

1.1 Lasik Equipment Information
1.2 Comprehensive Lasik Equipment Information
5.1 Non-oriented models and their properties
5.2 Summary of models and desired properties
7.1 Input and output variables, rev=revenue and res=resources
7.2 Missed potential on savings at input side
7.3 Missed opportunity for higher return on revenue
8.1 Further savings on inputs (million $)
List of Figures

1.1 Difference between the facets generated, based on correct PPS estimator (black) and conventional DEA (red) with ratio variables
2.1 DEA basic models
2.2 Shapes of conventional frontiers: FDH, BCC, and CCR
5.1 The blueprint to construct a non-oriented DEA model
6.1 Average non-zero weights of size p vs resolution
6.2 Sparse matrix: average number of non-zero weights vs resolution
6.3 Number of iterations grows exponentially with smaller resolution when the number of DMUs increases
6.4 For the same completeness ratio, a larger sample size wins
6.5 Sample size effect is small if the number of hypothetical DMUs generated stays the same
6.6 Increasing the number of unobserved DMUs, sample size 5
6.7 The same number of unobserved DMUs but different constructs, sample size 5
7.1 Comparing efficiency scores against the traditional model
7.2 Drop/rise in the efficiency score of branches after adding unobserved DMUs generated by the approximation method
7.3 Efficiency score of the unobserved DMUs generated by the approximation method
Chapter 1
Introduction
Performance comparison is a delicate business, even among organizations of the same
kind. The simplest measure is usually the ratio of a single output to a single input. The
problem lies in the fact that one aspect of the business can hardly represent the whole
picture and the landscape the business operates in. Businesses have complex structures
and offer a variety of products, so it is only fair to take all of them into consideration
when judging their performance against others in the industry. Data Envelopment Analysis
(DEA) is a method suited to settings where multiple inputs and outputs must be considered.
It is a non-parametric method conceptualized by Farrell in 1957 [Farr 57]. One
limitation of the existing DEA models is their inability to work with ratio variables
[Holl 03, Emro 09, Siga 09]. In this work, our contribution to the field includes extending
Farrell's idea to include ratio inputs and outputs and operationalizing two models: a
non-oriented additive model and a variable returns to scale (VRS) model with ratio
variables on the side of orientation. For the latter we operationalized an existing concept.
This chapter is structured as follows: first we provide some background information,
then we define the problem we aim to solve, and, at the end, the thesis objectives and
our contributions to the field are listed.
1.1 Background
Ratio analysis(RA) is an easy-to-understand and straightforward method to measure
relative efficiency on a single aspect. When we talk about the term “efficiency”, the
ratio of a single output to a single input, such as return on assets, may come to mind.
Relative efficiency then is defined by dividing the aforesaid ratio by the corresponding
“best performer’s” efficiency. Best performer is the one unit with maximum or minimum
ratio value depending on what is desirable in the problem. Ratio variables, simple and
straightforward, mask some of the information by nature. Looking on only one ratio
might be misleading, two stocks with the same return on equity might be very different
in the amount of equity they hold, their profit margin, the amount they can borrow and
their future earnings. Despite its limitations ratio analysis has been thus far the preferred
method in industry [Kriv 08].
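The single-ratio notion of relative efficiency described above can be sketched in a few lines of Python (the return-on-assets figures below are hypothetical, chosen only for illustration):

```python
# Ratio-analysis relative efficiency: each unit's ratio divided by the
# best performer's ratio. Higher return on assets is desirable here.
roa = {"A": 0.05, "B": 0.08, "C": 0.12}   # hypothetical return-on-assets values

best = max(roa.values())                  # the "best performer's" ratio
relative = {unit: value / best for unit, value in roa.items()}
print(relative["C"])   # 1.0 -- unit C is the best performer
print(relative["A"])   # every other unit scores strictly below 1
```

Note that this score reflects a single aspect only; as the text argues, units with identical scores here may differ on every other dimension.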
Data Envelopment Analysis (DEA) is a method to measure the overall relative performance
of every unit in a group with multiple inputs and outputs; units are usually referred
to as Decision Making Units or, in short, DMUs. The definition of performance is not
unique and may depend on the particular issues that need to be addressed in an industry
or business. DMUs are characterized by their input and output variables and can, for
instance, represent manufacturing sites, bank branches, hospitals, schools or even human
beings. What makes this method so distinctive is its holistic approach of considering all
the inputs and outputs simultaneously, in contrast with other methods such as indexing
or ratio analysis. In 1957, Farrell [Farr 57] produced an activity analysis approach (the
methodology behind DEA) to correct what he believed were deficiencies in commonly
used index number approaches to productivity (and similar) measurements. However,
it was not until 20 years later that Charnes, Cooper and Rhodes brought this concept
into practice by finding a way to realize the idea and make it work [Char 78]. The
breakthrough came from the fact that, under certain assumptions, Farrell's idea could be
formulated as a linear mathematical program (LP) that could be solved using the simplex
and similar methods. Despite DEA's popularity among academics, it is not widely
used as a practical tool for performance assessment in industry. A major hurdle in DEA
deployment seems to be the different language it uses in communicating results to
managers. A language that involves RA in explaining DEA results would make them
more understandable to management and, as a result, more appealing to industry. This
new language may require the use of desired ratios as the inputs and/or outputs of the
DMUs.
Over the years, DEA gained acceptance and became popular as an efficiency measurement
tool. Scientists, and a very few practitioners, from different fields started to use
it. Some attempted to combine RA and DEA, overlooking the assumptions that made
this idea a computational reality in the first place. One common mistake is to feed
ratio variables, as either inputs or outputs, into the original Charnes, Cooper and
Rhodes (CCR) model, or into other existing models in the DEA literature, which
inevitably results in distorted outcomes. For example, imagine three job-shops for which
headquarters has provided training classes for operating a new machine. Job-shop A has
4 staff members on the machine and after training they were able to generate 8 products
(productivity = 2) with the highest quality rating of 5. Job-shop B has 5 staff and
produced 15 units (productivity = 3) at quality 4. Job-shop C, with 6 staff, produced
24 units (productivity = 4) at quality 3. If productivity and quality are the two output
metrics against which the job-shops are evaluated, and the input (for example, equipment)
is the same for all, traditional DEA will report all three job-shops as equally efficient.
Note, however, that job-shop B is not as efficient as the average of job-shops A and C
combined, which with 5 staff produces 16 units (1 more than 15) with the same quality
rating as job-shop B. A preliminary study on finding such optimal solutions was done
in [Siga 09]. This doctoral thesis stems from that initial work and goes beyond it to
form a new branch in DEA and ratios.
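The distortion in the job-shop example can be checked numerically. The sketch below (in Python; the thesis itself works in MATLAB) takes the 50/50 mix of A and C on the underlying volumes, where the convex combination is actually meaningful:

```python
# Job-shop figures from the text: (staff, units produced, quality rating).
shops = {"A": (4, 8, 5), "B": (5, 15, 4), "C": (6, 24, 3)}

# Convex combinations must be taken on the underlying volumes.
staff = 0.5 * shops["A"][0] + 0.5 * shops["C"][0]     # 5 staff, same as B
units = 0.5 * shops["A"][1] + 0.5 * shops["C"][1]     # 16 units, one more than B
quality = 0.5 * shops["A"][2] + 0.5 * shops["C"][2]   # quality 4, same as B

# The composite's productivity is the ratio of averaged volumes, 16/5 = 3.2,
# not the naive average of the ratio outputs, (2 + 4) / 2 = 3, that a
# conventional DEA model would implicitly use -- so B's shortfall is hidden.
print(units / staff)   # 3.2, which exceeds B's productivity of 3
```

This is exactly why feeding ratios into the standard models reports all three shops as efficient: averaging the ratios themselves yields 3, erasing the composite unit that dominates B.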
The problem with ratio variables in DEA has been raised a few times by scholars.
In 2001, Dyson et al. [RGDy 01] listed several shortcomings of the DEA approach in use,
among them the problem of mixing index and ratio data with volumes. In 2003,
Hollingsworth et al. [Holl 03] pointed out the problem of using ratios as inputs or outputs
in one DEA model (CCR). Later, in 2008, Emrouznejad et al. [Emro 09] highlighted
the violation of the convexity axiom when using ratios among the inputs or outputs of
another DEA model (BCC) and proposed a modified model for when ratios may be present.
That model is only applicable to DEA models with a specific orientation (input or output)
where the ratio variables do not appear on the orientation side. Moreover, the presented
model was conceptual rather than computational, since the existing commercial software
packages did not handle ratios. In 2009, in our work leading to this thesis [Siga 09], we
augmented the model in [Emro 09] by adding a second phase to it, made it computationally
feasible, and illustrated it with a MATLAB-coded example. We also proposed an additive
model, which had neither the limitation of a required orientation nor the one-side-only
requirement on ratio variables. Our model, in its original form, did not however
provide a comprehensive measure of efficiency: its score was not bounded by unity and
it was not units invariant. Hence, it was highly sensitive to measurement errors. Our
goal in this thesis is to develop a model based on Farrell's idea that can be applied to
DMUs with ratio variables and, even more importantly, to make that idea a computational
reality. We study the linkage between RA and DEA in Chapter 2.
1.2 Problem Statement
Including ratio variables "as is" in the well-known DEA models may lead to incorrect
results. One of the fundamental assumptions in DEA is that the production possibility
set (PPS), which consists of all feasible DMUs (observed and unobserved), can be
constructed as a convex combination of the observed DMUs. This assumption is jeopardized
when ratio variables are involved and, as a result, the true best practice is missed. The
problem was extensively studied in [Siga 09] and here, I borrow an example from there
to explain the problem.

Table 1.1: Lasik Equipment Information

Per season                        Branch A   Branch B   Branch C
Number of returning customers        30         24         96
Technical staff hours               600        300        800
Number of units sold                130        144         10
Sales staff hours                   650       1200        200
Commercial expenses                 100         70        110

Table 1.2: Comprehensive Lasik Equipment Information

              Output indicator:           Output indicator:        Input indicator:
              Customer satisfaction =     Revenue generation =     Cost
              returning customers /       units sold /
              technical staff hours       sales staff hours
Branch A               5%                        20%                    100
Branch B               8%                        12%                     70
Branch C              12%                         5%                    110
A company active in selling laser equipment has received the data in Table 1.1 regarding
its branches, which then has to be rearranged into the managerial form of choice for
better communication. In practice, management prefers to see the information in ratio
form, as shown in Table 1.2. Figure 1.1 shows how the existing DEA models, without
consideration for ratio variables, produce an incorrect frontier and may miss some of
the potential for improvement. In this thesis, we create models that can correctly use
ratios and hence define the right feasible production units and the right benchmark.

Figure 1.1: Difference between the facets generated, based on correct PPS estimator
(black) and conventional DEA (red) with ratio variables

Here, we need to comment on what Cook et al. [Cook 14] have recently published: they
pointed out that not every use of ratios in a model poses a potential problem, and that
it is too restrictive to conclude that the two forms of data (ratio and normal) cannot
coexist in a model. It is true that the outcome depends on the ratio and how it is
generated. For example, when inputs and outputs are created using the same denominator
(for instance, all inputs have been divided by the largest input), an easy transformation
such as multiplication by a constant turns the ratios back into normal variables. In such
cases the well-known DEA models can still be used, with no adjustments. The general
advice is: whenever possible, try to replace a ratio variable with a proxy measure (e.g.,
instead of a poverty index, use the number of people seeking jobs or receiving benefits),
but when that is not an option, we have a solution. Here, we focus on the general case
of using ratios where such transformations are not available.
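Returning to the Lasik example, the rearrangement from the raw volumes of Table 1.1 into the ratio form of Table 1.2 can be sketched as follows (a Python sketch; the field names are illustrative, the figures are those given in the tables):

```python
# Raw per-season volumes from Table 1.1.
branches = {
    "A": dict(returning=30, tech_hours=600, sold=130, sales_hours=650, cost=100),
    "B": dict(returning=24, tech_hours=300, sold=144, sales_hours=1200, cost=70),
    "C": dict(returning=96, tech_hours=800, sold=10, sales_hours=200, cost=110),
}

# Table 1.2: two output ratios plus the cost input, left as a volume.
ratio_form = {
    name: dict(satisfaction=v["returning"] / v["tech_hours"],  # e.g. A: 30/600 = 5%
               revenue_gen=v["sold"] / v["sales_hours"],       # e.g. A: 130/650 = 20%
               cost=v["cost"])
    for name, v in branches.items()
}
print(ratio_form["C"])   # satisfaction 0.12, revenue_gen 0.05, cost 110
```

Because the two output ratios have different denominators (technical vs. sales staff hours), no common rescaling recovers volume form, which is precisely the general case this thesis addresses.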
1.3 Thesis Objectives and Contributions
The major goal in this thesis is to develop a mechanism for correctly incorporating ratio
variables into DEA. We develop a robust non-oriented model, which is units and trans-
lation invariant. It provides a comprehensive measure of efficiency bounded by zero and
one, which is easy to explain to management. We also propose a method to overcome
nonlinearity for a radially oriented model (BCC). We then convert this conceptual model
into a computational reality by transforming it into an LP. We use approximation meth-
Chapter 1. Introduction 7
ods to develop a semi Monte Carlo mechanism for solving the nonlinear cases. On this
journey, we had made a catalogue of all developed non-oriented models as well as a pro-
cedure to create a non-oriented model. For our purposes we also make a list of desired
properties for a DEA model and compare existing models with ours, based on these. Our
models are tested on a case study to create a visual showcase and demonstrate the tan-
gible benefits. We believe the techniques presented are helpful for researchers trying to
develop DEA models for specific industries which are traditionally more inclined towards
using ratio variables.
This thesis is structured as follows:
• In Chapter 2, we cover the DEA concept, assumptions and realization methods.
• Chapters 3 and 4 collectively provide the literature review required for this work.
Chapter 3 is dedicated to the non-oriented DEA models. Chapter 4 presents information
on the approximation methods in connection with DEA.
• Chapters 5 and 6 outline the proposed methodologies in this work. Chapter 5 deals
with creating and realizing a new non-oriented DEA model, which supports the
use of ratios. Chapter 6 outlines the methodology of the approximation method
proposed for the special case of BCC with ratio variables.
• In Chapter 7, models are compared on a small but real set of data from 132 branches
of a major Canadian bank.
• Chapter 8 summarizes the contributions and comments on the future prospects of
this work.
Chapter 2
Data Envelopment Analysis:
Theory, Assumptions and
Realization Techniques
2.1 Introduction
In his seminal econometric work [Farr 57], Farrell proposed an activity analysis approach
to correct what he believed were deficiencies in the commonly used index number
approaches to productivity (and similar) measurements. His main concern was to generate
an overall measure of efficiency that accounts for the measurements of multiple inputs
and outputs. The concept was materialized over twenty years later: in 1978, Charnes,
Cooper and Rhodes (CCR) [Char 78] generalized Farrell's work and formulated it in
mathematical form. Charnes et al. [Char 78] described DEA as a "mathematical
programming model applied to observational data to provide a new way of obtaining
empirical estimates of relationships — such as production functions and/or efficient
production possibility surfaces — that are the cornerstones of modern economics" [Coop 04].
DEA is a "data oriented" approach for evaluating the performance of a set of peer
entities called Decision Making Units (DMUs); it provides a single efficiency score
while simultaneously considering multiple inputs and multiple outputs. It is essential
that all DMUs share the same operational and cultural environment, otherwise they are
not comparable; e.g., we cannot compare bank branches with grocery stores. Because
it requires very few assumptions, DEA has opened up possibilities for use in cases that
have been resistant to other approaches because of the complex (often unknown) nature
of the relationships between the multiple inputs and multiple outputs involved in the
operation of the DMUs.
Formally, DEA is a methodology directed to frontiers rather than central tendencies.
Instead of trying to fit a regression plane through the data, as in statistical regression, for
example, one “floats” a piecewise linear surface to rest on top of the observations. Because
of this perspective, DEA proves to be particularly adept at uncovering relationships that
remain hidden in other methodologies [Coop 04].
Researchers in a number of fields have recognized DEA as an excellent methodology
for modeling operational processes. Its empirical orientation and minimization of
a-priori assumptions have resulted in its use in a number of studies involving efficient
frontier estimation in the nonprofit, regulated, and private sectors. DEA encompasses
a variety of applications in evaluating the performance of different kinds of entities
such as hospitals, universities, cities, courts, business firms, and banks, among others.
According to the latest bibliography available [Emro 08], over 4000 papers were published
on DEA by 2007. Our recent search using Google Scholar showed that the number of
publications between 2008 and 2015 with "Data Envelopment Analysis" in the title is
above 3000. In total, the subject has generated enough interest among scholars to produce
more than 7000 papers in peer-reviewed journals. Such rapid growth and widespread
acceptance of the DEA methodology are testimonies to its strengths and its perceived
applicability by academics.
2.2 DEA Basic assumptions
We have talked about the history of DEA, the motivation behind it and the flexibility
it offers compared to or used in conjunction with statistical methods. Here, we review
the principles and assumptions around various DEA models that we may refer to in this
work.
2.2.1 Production Possibility Set
In productivity analysis, or efficiency measurement in general, when the DMUs
consume s different inputs to produce m different outputs, the production possibility
set is the collection of all feasible DMUs that are capable of producing output
Y = (y1, y2, ..., ym) by consuming input X = (x1, x2, ..., xs). The PPS is defined as
the set:

Ψ = { (X, Y) ∈ R^(s+m) : X can produce Y }     (2.1)

As mentioned in Section 2.1, DEA is very data oriented. This means that we build the
production possibility set based on the observed data points and some assumptions
which, in some respects, depend on our model of choice. We briefly introduce some of
the assumptions used, but leave the detailed evaluation and the choice of the appropriate
model for the next chapter.
Free disposability axiom: A fundamental assumption used to form the PPS out of the
available data is "disposability". If X can produce Y, then so does any X′ ≥ X; and if
Y can be produced by X, then so can any Y′ ≤ Y. Formally, each observed pair
X = (x1, ..., xs), Y = (y1, ..., ym) brings along part of the unobserved piece of the PPS,
defined as:

Ψ′ = { (X′, Y′) ∈ R^(s+m) : X′ ≥ X and Y′ ≤ Y } ⊆ Ψ

This is like saying: if DMU_i could be realized, then any DMU that is doing worse is
feasible, too. This assumption leads to the Free Disposal Hull (FDH) model [Depr 84],
which shares its PPS with many of the other models.
Convexity: Any convex linear combination of realized DMUs is feasible. In other words,
if two DMUs are in the PPS, so is the line connecting them. More generally, this holds
for the convex combination of n DMUs:

DMU_composite = Σ_{i=1}^{n} λ_i · DMU_i,  with  Σ_{i=1}^{n} λ_i = 1 and λ_i ≥ 0.

This assumption leads to the BCC model, a variable returns to scale model, which will
be explained later in 2.3.2.

Ray unboundedness: Scaling any realized DMU up or down generates a new feasible
DMU: ∀ DMU_i ∈ PPS and γ ≥ 0, γ · DMU_i ∈ PPS. This assumption, added to the
convexity assumption, is the basis of CCR, a constant returns to scale model, which we
will visit later in 2.3.1.
2.2.2 Frontier
Once we generate the desired PPS, the set Ψ defined in (2.1), it is time to define the potential
benchmarks or the frontier. The frontier is composed of one or more estimated lines or
surfaces (depending on dimensions) enveloping only but no less than the whole PPS. It is
the line/hyperplane that separates the feasible DMUs from infeasible ones. We might be
interested in certain facets of the frontier, depending on our intention to reduce inputs
or augment outputs. The projection to the frontier may be based on an input or output
facet (segment) of the frontier or a combination of both, defined as follows:
∂ΨX = {Y | (X, Y) ∈ Ψ, (X, η·Y) ∉ Ψ ∀η > 1},
∂ΨY = {X | (X, Y) ∈ Ψ, (θ·X, Y) ∉ Ψ ∀θ, 0 < θ < 1}.
Figure 2.1 shows different frontiers based on disposability, convexity and ray unbound-
edness assumptions about PPS.
[Figure: input versus output space showing the FDH, BCC, and CCR frontiers, example DMUs A, B, D, E, input and output slacks, and projection targets T1, T2, T3.]
Figure 2.1: DEA basic models
2.2.3 Efficiency Definition
What do we mean by “efficiency”, or more generally, by saying that one DMU is more
efficient than another DMU? Relative efficiency in DEA provides us with the following
definition, which has the advantage of avoiding the need for assigning a-priori measures
of relative importance to any input or output.
Full Efficiency: Full efficiency is attained by any DMU if, and only if, none of its inputs
or outputs can be improved without worsening some of its other inputs or outputs. In
most management or social science applications, the theoretically possible levels of effi-
ciency will not be known. The preceding definition is therefore replaced by emphasizing
its uses with only the information that is empirically available, as in the following defi-
nition.
Full Relative Efficiency (Pareto efficiency): In DEA we speak of relative efficiency,
because we compare the DMU against a set of reference peers. Full efficiency is attained
by any DMU if, and only if, compared to other observed DMUs, under certain assump-
tions relevant to the case such as control over inputs and/or outputs, it is not possible to
reduce the amount of any input and/or attain more of any output without using more
of at least one other input and/or reducing the level of at least one other output.
A DMU is Pareto-efficient if it bears no slacks/shortfalls in any of its inputs/outputs.
Mathematically, if the production possibility set, the collection of all feasible DMUs with
output Y = (y1, y2, ..., ys) and input X = (x1, x2, ..., xm), is
Ψ = {(X, Y) ∈ R^(m+s) | X can produce Y},
then DMUk is Pareto-efficient if there exists no DMUj, j ≠ k, in the PPS such that Yj ≥ Yk while
Xj ≤ Xk. Note: in compact form, Yj ≥ Yk means yij > yik for some i ∈ {1, ..., s} and
yi′j ≥ yi′k for the rest, i′ ≠ i; the same principle applies to X.
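As an illustration, this pairwise dominance test is straightforward to code. The sketch below checks dominance among the observed DMUs only, which is a simplification (full Pareto efficiency in DEA compares a unit against the entire PPS, not just observed peers); the data and function name are hypothetical:

```python
import numpy as np

def is_pareto_efficient(X, Y, k):
    """True if no observed DMU j != k weakly dominates DMU k
    (X_j <= X_k and Y_j >= Y_k, with at least one strict inequality)."""
    for j in range(len(X)):
        if j == k:
            continue
        weakly = np.all(X[j] <= X[k]) and np.all(Y[j] >= Y[k])
        strictly = np.any(X[j] < X[k]) or np.any(Y[j] > Y[k])
        if weakly and strictly:
            return False
    return True

X = np.array([[2.0, 1.0], [2.0, 2.0]])   # two inputs per DMU
Y = np.array([[3.0], [3.0]])             # one output per DMU
print(is_pareto_efficient(X, Y, 0), is_pareto_efficient(X, Y, 1))  # True False
```

The second DMU is dominated because the first produces the same output with strictly less of the second input.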
Technical efficiency: Assuming that the inputs or outputs can only contract or expand
radially, input technical efficiency of a unit is defined as the maximum proportion that
any of its inputs can contract without making other inputs infeasible and/or worsening
any of the outputs. Similarly output technical efficiency is the maximum proportion
any of the outputs can expand without using more of any inputs or making the unit
infeasible. So the individual inputs or outputs may carry slacks/shortfalls even though
they are technically efficient. Mathematically, let us assume the PPS is convex. Then the
technical input efficiency for DMUk, with input and output vectors Xk and Yk as
denoted above, is θ* = min{θ : (θ·Xk, Yk) ∈ Ψ, θ > 0}, and the technical output efficiency is
η* = max{η : (Xk, η·Yk) ∈ Ψ, η ≥ 1}. Note: η·Yk in compact form means (η·y1k, ..., η·ysk);
the same holds for θ·Xk.
Technical change: Technical change is the relative efficiency of the entity when compared
to broader or newer peer groups over time. It represents the difference in the
organization's environment and technology adoption or, technically speaking, the benchmark
shift [Grif 99]. In light of new technology in a certain industry, the frontier could
shift and DMUk would appear less efficient in the new setting. In Figure 2.1, the green
frontier shows the frontier shift. Technical change can be measured by dividing the
new efficiency score by the old score.
Scale efficiency: Banker et al. [Bank 84] identify the difference between the “variable
returns to scale” model, BCC, and the “constant returns to scale” model, CCR, as a
production scale effect. Scale efficiency represents the failure to achieve the most
productive scale size, reflected by the score difference between the CCR and BCC models. It is
computed as the CCR efficiency score, θ*_CCR, divided by the BCC efficiency score, θ*_BCC. In
Figure 2.1 for DMU D, the scale efficiency captures the distance between “T2” and “T3”.
This means that under VRS assumptions point “T2” is already using the best practice
and because of economies of scale it is impossible to achieve the same productivity as
“T3” (which equals the productivity of A and B).
Input slack factor: For every DMU on the frontier, the input slack factor for input xi
addresses the unused capacity of that input, meaning that input xi could have been
further reduced while staying technically efficient. An input slack factor of one indicates there
is no slack for input xi. Mathematically, for any DMU on the frontier, (X, Y) with X ∈ ∂ΨY,
the input slack factor for xi is defined as min{γi : (x1, ..., γi·xi, ..., xm) ∈ ΨY}. It is
evident that (1 − γi*)·xi is the slack for input xi.
Input substitution factor: For a DMU on the frontier where input xi carries no
slack, the input substitution factor identifies the lowest level of input xi that is feasible, at
the cost of increasing at least one other input. Mathematically, for any DMU on the
frontier with no slack in xi, (X, Y) with X ∈ ∂ΨY and γi* = 1, the input substitution factor for
xi is defined as min{κi : (κ1·x1, ..., κi·xi, ..., κm·xm) ∈ ΨY, κj > 0, j = 1, ..., m}. It is
the least amount of input i able to produce output Y in the PPS, so there would
exist no other DMU with a lower level of input i (no matter what the rest of the inputs are)
producing output Y.
Output slack factor: For every DMU on the frontier, the output slack factor for
output yi addresses the unmet potential of that output, meaning that output yi
could have been further expanded while staying technically efficient. An output slack
factor of one indicates there is no shortfall for output yi. Mathematically, for any
DMU on the frontier, (X, Y) with Y ∈ ∂ΨX, the output slack factor for yi is defined as
max{γi : (y1, ..., γi·yi, ..., ys) ∈ ΨX}. It is evident that (γi* − 1)·yi is the shortfall for
output yi.
Output substitution factor: It is the maximum amount of output i achievable con-
suming X in the PPS. So there would exist no other DMU consuming X to generate
higher output i (no matter what the rest of outputs are).
2.2.4 Orientation
DMUs are represented by their inputs and outputs. Efficiency scores depend on how far
the DMU is located from the frontier. Depending on the problem, DMUs can reduce
their inputs or increase their outputs, or target improvements in inputs and outputs,
simultaneously, in order to move to a point on the frontier. The models that focus on
minimizing inputs are called input oriented and the models that focus on maximizing
outputs are called output oriented. There are also models that aim to minimize inputs
and maximize outputs simultaneously; these are called non-oriented models.
2.3 DEA basic models
Depending on how one defines the PPS, the frontier and how to measure the distance,
there are several models that can be used. Each model has its applications and is suitable
for certain subject areas or cases. As there are dozens of special DEA models, we will
only describe the basic CCR, BCC, and input and output orientation for both. What
is also worth paying attention to is that defining a model which is theoretically sound
but cannot be operationalized is hardly of any use in practical applications. The models
that have gained traction and were put into use were those with which the scholars also
provided a guide or solution on how to realize them.
2.3.1 CCR Model
In the CCR model [Char 78],[Char 81], which is named after its developers, Charnes,
Cooper and Rhodes, the PPS is based on ray unboundedness and disposability assump-
tion. The authors simply generalize the ratio efficiency for a one-input, one-output case
to include multiple inputs and outputs. They reduce the nonlinear form to a linear
model. The model can be either input or output oriented and it mainly deals with scale
efficiency: all the inputs are scaled down, or all the outputs scaled up, to achieve a better efficiency score. A
given DMUk has a relative efficiency, θk, defined as the maximum ratio of the weighted
sum of its s outputs, yk = (y1k, ..., ysk), to the weighted sum of its m inputs, xk = (x1k, ..., xmk); in other words,
for each DMU a virtual aggregated output and input is formed and their ratio is maximized:

θk = (u1·y1k + ... + us·ysk) / (v1·x1k + ... + vm·xmk).
The input and output weights, (v1, ..., vm) and (u1, ..., us) respectively, are not fixed and
are chosen to show DMUk in the best possible light. The weights and efficiency score are
determined by the following non-linear optimization for DMUk:

max θk = Σ_{i=1}^s ui·yik / Σ_{i=1}^m vi·xik,
s.t. Σ_{i=1}^s ui·yij / Σ_{i=1}^m vi·xij ≤ 1, j = 1, ..., n,
ui, vi ≥ 0.
To make this computationally viable, the fractional form above is first linearized by
setting Σ_{i=1}^m vi·xik = 1 and adding it as a constraint. The dual of the resulting LP is
then used. Introducing an intensity weight λj for each DMU, the dual takes the following
LP form, known as the input oriented CCR:
min θk    (2.2)
s.t.
Σ_{j=1}^n λj·xij ≤ θk·xik,  i = 1, ..., m,    (2.3)
Σ_{j=1}^n λj·yij ≥ yik,  i = 1, ..., s,    (2.4)
λj ≥ 0.    (2.5)
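As an illustration, the envelopment LP (2.2)-(2.5) can be solved with any off-the-shelf LP solver. The sketch below uses SciPy's `linprog`; the function name and toy data are hypothetical:

```python
import numpy as np
from scipy.optimize import linprog

def ccr_input_efficiency(X, Y, k):
    """Input-oriented CCR envelopment LP for DMU k.
    X: (n, m) input matrix, Y: (n, s) output matrix. Returns theta*."""
    n, m = X.shape
    s = Y.shape[1]
    # decision variables: [theta, lambda_1, ..., lambda_n]; minimize theta
    c = np.zeros(n + 1)
    c[0] = 1.0
    A_ub = np.zeros((m + s, n + 1))
    b_ub = np.zeros(m + s)
    # inputs: sum_j lambda_j * x_ij - theta * x_ik <= 0
    A_ub[:m, 0] = -X[k]
    A_ub[:m, 1:] = X.T
    # outputs: -sum_j lambda_j * y_ij <= -y_ik
    A_ub[m:, 1:] = -Y.T
    b_ub[m:] = -Y[k]
    bounds = [(None, None)] + [(0, None)] * n   # theta free, lambda_j >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[0]

# Toy data: DMU 0 has the best output/input ratio, DMU 1 is half as productive.
X = np.array([[2.0], [4.0]])
Y = np.array([[2.0], [2.0]])
print(round(ccr_input_efficiency(X, Y, 0), 4))  # 1.0
print(round(ccr_input_efficiency(X, Y, 1), 4))  # 0.5
```

For the efficient DMU the score is 1; for the unit using twice the input for the same output it is 0.5, matching the single-ratio intuition from section 2.6.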
If, in the original form, instead of maximizing the ratio of virtual output to virtual input
we choose to minimize its inverse, the resulting dual LP is the following, which
is known as the output oriented version. For the CCR model, the input efficiency score
is the inverse of the output efficiency score.
max ηk    (2.6)
s.t.
Σ_{j=1}^n λj·xij ≤ xik,  i = 1, ..., m,    (2.7)
Σ_{j=1}^n λj·yij ≥ ηk·yik,  i = 1, ..., s,    (2.8)
λj ≥ 0.    (2.9)
2.3.2 BCC Model
In the BCC model [Bank 84], the PPS assumptions are more restrictive: the convexity
postulate replaces ray unboundedness, but the rest are the same as in the CCR model.
As a result, the efficiency score in the BCC model is never lower than that of the CCR model.
Almost the same technique is used to reduce the problem to a linear form; the dual LP
for the input oriented model is given by:
min θk    (2.10)
s.t.
Σ_{j=1}^n λj·xij ≤ θk·xik,  i = 1, ..., m,    (2.11)
Σ_{j=1}^n λj·yij ≥ yik,  i = 1, ..., s,    (2.12)
Σ_{j=1}^n λj = 1,  λj ≥ 0.    (2.13)
With the aid of the simplex method and advances in computer and mathematical algo-
rithms, the LP forms became widespread. The output-oriented model would be the same
as CCR with the additional convexity constraint, Σ_{j=1}^n λj = 1. Although the frontier
will not be affected by the input or output orientation, the inefficient DMUs will have
different efficiency scores in BCC model, depending on the orientation, because they
target different parts of the frontier.
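To make the VRS modification concrete, the sketch below (hypothetical data and function name) adds the convexity row Σλj = 1 as an equality constraint to a SciPy `linprog` setup of the envelopment form:

```python
import numpy as np
from scipy.optimize import linprog

def bcc_input_efficiency(X, Y, k):
    """Input-oriented BCC envelopment LP for DMU k: the CCR constraints
    plus the convexity row sum(lambda) = 1."""
    n, m = X.shape
    s = Y.shape[1]
    c = np.zeros(n + 1)
    c[0] = 1.0                                  # minimize theta
    A_ub = np.zeros((m + s, n + 1))
    b_ub = np.zeros(m + s)
    A_ub[:m, 0] = -X[k]                         # sum_j lambda_j x_ij <= theta * x_ik
    A_ub[:m, 1:] = X.T
    A_ub[m:, 1:] = -Y.T                         # sum_j lambda_j y_ij >= y_ik
    b_ub[m:] = -Y[k]
    A_eq = np.zeros((1, n + 1))
    A_eq[0, 1:] = 1.0                           # convexity: sum_j lambda_j = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(None, None)] + [(0, None)] * n, method="highs")
    return res.x[0]

# Toy data: the third DMU is CCR-inefficient but closer to the VRS frontier.
X = np.array([[2.0], [4.0], [5.0]])
Y = np.array([[2.0], [3.0], [3.0]])
print(round(bcc_input_efficiency(X, Y, 2), 4))  # 0.8
```

On this data the BCC score of the third DMU is 0.8, while its CCR score works out to 0.6; the gap between the two is the scale inefficiency discussed in section 2.2.3.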
2.3.3 Additive model
While the CCR and BCC models are focused either on minimizing inputs (input oriented)
or on maximizing outputs (output oriented), the additive model focuses on decreasing inputs
(eliminating input slacks, si−) and increasing outputs (eliminating output shortfalls, si+)
simultaneously and, therefore, has no orientation. The original additive model [Char 85]
was formulated for the VRS case and shares its PPS with the BCC model; throughout
this work we will refer to the VRS additive model in general. However, the additive model can
also be formulated under the CRS assumption by eliminating the convexity constraint, as was
done in [Ali 93]. In Figure 2.1, point “E” is an inefficient DMU: in the input oriented
BCC model, “E” needs to reduce inputs and remove some output slacks to reach “A”, and
the same analogy holds for the output orientation, which leads “E” to “G”. However, in
the additive model, point “B” is the optimum because reaching that point requires the overall
maximum cuts in waste and shortfalls. The LP for the additive model is given by:
max_{λ, s−, s+}  Σ_{i=1}^m si− + Σ_{i=1}^s si+    (2.14)
s.t.
Σ_{j=1}^n λj·xij − xik + si− = 0,  i = 1, ..., m,    (2.15)
Σ_{j=1}^n λj·yij − yik − si+ = 0,  i = 1, ..., s,    (2.16)
Σ_{j=1}^n λj = 1,    (2.17)
λj ≥ 0,  si−, si+ ≥ 0.    (2.18)
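A possible direct translation of (2.14)-(2.18) into an LP solver call, with hypothetical data, is sketched below; the intensity weights λ and the slacks are stacked into one variable vector:

```python
import numpy as np
from scipy.optimize import linprog

def additive_model(X, Y, k):
    """VRS additive model for DMU k: maximize the sum of input slacks s^-
    and output shortfalls s^+. Variables: [lambda (n), s^- (m), s^+ (s)]."""
    n, m = X.shape
    s = Y.shape[1]
    nv = n + m + s
    c = np.zeros(nv)
    c[n:] = -1.0                               # linprog minimizes, so negate the slack sum
    A_eq = np.zeros((m + s + 1, nv))
    b_eq = np.zeros(m + s + 1)
    A_eq[:m, :n] = X.T                         # sum_j lambda_j x_ij + s^-_i = x_ik
    A_eq[:m, n:n + m] = np.eye(m)
    b_eq[:m] = X[k]
    A_eq[m:m + s, :n] = Y.T                    # sum_j lambda_j y_ij - s^+_i = y_ik
    A_eq[m:m + s, n + m:] = -np.eye(s)
    b_eq[m:m + s] = Y[k]
    A_eq[-1, :n] = 1.0                         # convexity row: sum_j lambda_j = 1
    b_eq[-1] = 1.0
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * nv, method="highs")
    return -res.fun                            # maximal total slack

X = np.array([[2.0], [4.0], [6.0]])
Y = np.array([[2.0], [3.0], [2.0]])
print(round(additive_model(X, Y, 2), 4))  # 4.0: DMU 2 is projected onto DMU 0
```

A total slack of zero would mean the evaluated DMU is Pareto-efficient; here the third DMU can cut four units of input in total before reaching the frontier.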
2.4 Deterministic Frontier Estimators
All DMUs belong to the subspace between the origin and the frontier or the frontier and
infinity depending on output or input orientation, respectively. The concept of a frontier
is more general and easier to understand than the concept of a “production function”,
which has been regarded as a fundamental concept in economics. The frontier concept
allows each DMU to be seen under the best possible light contrary to production function
that remains the same for every DMU. Based on assumptions about the production e.g.
VRS, CRS, and the scope for improvements e.g. control over input/output, a benchmark
is constructed and a role model on that frontier is identified for every DMU. It is then vital
to get that benchmark right because otherwise, we would set an improper objective for the
DMU under study. Hence, we might either underestimate or overestimate the efficiency
score and, as a result, give an unrealistic projection. A number of basic assumptions in
deterministic frontier estimation are explained below.
2.4.1 Free Disposal Hull Frontier
The free disposal hull (FDH) assumption adds the unobserved production points whose
output levels are equal to or lower than those of some observed point, and whose input
levels are equal to or higher than those of the same observed point [Depr 84]. In other
words, if X generates Y, then more X can still generate Y, and X
can generate less Y, too. FDH is assumed, in the literature, to be sufficient to induce a
reference set that has all the properties that the economic theory requires of a production
set [Tulk 93]. However, strong disposability assumptions exclude congestion, which is
frequently observed, e.g. in agriculture and transportation, and undesired outputs (or
inputs), e.g. in oil production. As a simple example, if 100 trucks can deliver goods along
a specific route, within a certain time, then 1000 trucks (more input) might not necessarily
perform at the same level because the entire route might not have the capacity to handle
1000 trucks. Assessments of congestion within DEA and ways to deal with it can
be found in the works of Fare et al. [Fare 83a], Brockett et al. [Broc 98] and Cherchye
et al. [Cher 01]. For more information on the undesired output and disposability issue,
refer to Yang and Pollitt’s work on environmental efficiency [Yang 07]. The FDH frontier
looks like a staircase for the one-input, one-output case, as seen in Figure 2.2.
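Because the FDH PPS uses no convex combinations, the input efficiency of a unit reduces to a simple enumeration over the observed peers that produce at least as much of every output. A minimal sketch (hypothetical data and function name):

```python
import numpy as np

def fdh_input_efficiency(X, Y, k):
    """FDH input efficiency of DMU k: compare only against observed DMUs
    that produce at least as much of every output (no convex combinations).
    theta = min over those peers of max_i (x_ij / x_ik)."""
    peers = np.all(Y >= Y[k], axis=1)          # units producing >= Y_k (includes k itself)
    ratios = np.max(X[peers] / X[k], axis=1)   # radial contraction needed to match each peer
    return ratios.min()

X = np.array([[2.0, 2.0], [4.0, 4.0]])
Y = np.array([[2.0], [2.0]])
print(fdh_input_efficiency(X, Y, 1))  # 0.5: DMU 0 dominates with half of each input
```

Since the evaluated DMU is always its own peer, the score is automatically bounded by one.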
2.4.2 Variable Returns to Scale Frontiers
The convexity assumption adds any non-observed data, which is a convex combination of
some points in the FDH, to the PPS. Although there are notable arguments and evidence
favoring convexity, some researchers have found this axiom very restrictive and proposed
to drop or weaken it. For a complete study on this issue, see Cherchye et al. [Cher 99].
[Figure: the staircase-shaped FDH frontier together with the BCC and CCR frontiers, in input (x) versus output (y) space.]
Figure 2.2: Shapes of conventional frontiers: FDH, BCC, and CCR
2.4.3 Constant Returns to Scale Frontier
The full proportionality assumption includes any non-observed production point that is
proportional to some data points in the FDH. This assumption is in accordance with the
original DEA model (CCR) and the Farrell efficiency measure. It is critical to know that
in DEA, we assume that a linear combination of DMUs is possible and real. The CCR
frontier contains the other frontiers and so, if a DMU is on the CCR frontier, it will also
be on the FDH and BCC frontiers. The reverse, however, is not true.
2.5 Probabilistic Frontier Estimators
The nonparametric deterministic estimators envelop all the data points and so are very
sensitive to noise. They may be seriously affected by the presence of outliers (units
which are significantly different from others), as well as data errors, which may lead to a
substantial underestimation of the overall efficiency scores. Therefore, in order to assure
credibility of the efficiency indices, it is important to adopt some additional methods
to correct for such discrepancies. Only then may one hope to obtain estimators that
could be useful for the decision-making process. We look at probabilistic models for this
purpose.
In probabilistic models, we leave some room for errors in the PPS. The production
process is defined with a joint probability. For (x, y), a realization of random variables
of input/output (X, Y ), to be in the PPS, it should be either a dominant point (on the
frontier) or a dominated point (enveloped by the frontier):
F(x, y) = Prob(X ≥ x, Y ≤ y) > 0   (dominant point);
H(x, y) = Prob(X ≤ x, Y ≥ y) > 0   (dominated point).
Based on the n observed data points, the FDH estimators of the PPS, Hn, the frontier, Fn,
and the efficiency score, θn, for an input-oriented case are given by:

Fn(x, y) = (1/n) Σ_{i=1}^n 1(Xi ≥ x, Yi ≤ y),    (2.19)
Hn(x, y) = (1/n) Σ_{i=1}^n 1(Xi ≤ x, Yi ≥ y),    (2.20)
θn(x, y) = inf{θ | Fn(θ·x | y) > 0}, and    (2.21)
θn(x, y) = inf{θ | Hn(θ·x, y) > 0},    (2.22)

where 1(·) is the indicator function.
It has been proven by Park et al. [Park 00] that θn(x, y) is a consistent estimator of
θ(x, y), with a convergence rate of n^(−1/(m+s)), where m is the number of inputs and s is
the number of outputs.
2.5.1 Partial m Frontier
Cazals et al. [Caza 02] introduced the concept of partial frontiers (order-m frontiers)
with a nonparametric estimator that does not envelop all the data points. While keeping
its nonparametric nature, the expected order-m frontier does not impose convexity on
the production set and allows for noise (with zero expected values). For example, to
measure the input efficiency of (x, y), we pick m random DMUs that fit the criterion
of producing at the same level or better than y. We estimate the FDH PPS and the
efficiency score, based on those m DMUs. (x, y) is then compared to a set of m peers
producing more than its level y with the expectation of the minimal achievable input
being the benchmark. This would replace the absolute minimal achievable input given
by:
θm(x, y) = inf{θ | (θ·x, y) ∈ Ψm(y)},
Ψm(y) = {(x*, y*) | x* ≥ Xi for some i = 1, ..., m; y* ≤ y}.
In the probabilistic model, the m DMUs are represented by random variables, and so are
Ψm(y) and θm(x, y). The input efficiency score, on average, is then given by the expectation:

θ̄m(x, y) = E(θm(x, y) | Y ≥ y).    (2.23)
Therefore, instead of looking for the lower boundary, input orientation frontier, the
order-m efficiency score can be viewed as the expectation of the minimum input efficiency
score of the unit (x, y), when compared to m units randomly drawn from the population
of units producing more outputs than the level y. This is a less extreme benchmark
for the unit (x, y) than the absolute minimal achievable level of inputs. The order-m
efficiency score is not bounded by one: a value greater than one indicates that the unit
operating at the level (x, y) is more efficient than the average of m peers randomly drawn
from the population of units (n observed DMUs) producing more output than y.
θm,n(x, y) = θn(x, y) + ∫_{θn(x,y)}^{∞} (1 − FX(u·x | y))^m du,
lim_{m→∞} θm,n(x, y) = θn(x, y).
For a finite m, the frontier may not envelop all data points. The value of m may
be considered as a trimming parameter and as m increases, the partial order-m frontier
converges to the full-frontier. It is shown that by selecting the value of m as an appropriate
function of n, the nonparametric estimator of the order-m efficiency scores provides a
robust estimator of the corresponding efficiency scores, sharing the same asymptotic
properties as the FDH estimators, but being less sensitive to outliers and/or extreme
values. In the literature, numerical methods like Monte Carlo procedures are used
instead of evaluating multivariate integrals. In Chapter 6 of this work, we will return
to the idea of order-m frontiers and build our own Monte Carlo method to derive the frontier
estimator and deal with nonlinearity.
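As a rough illustration of such a Monte Carlo procedure (not the specific method developed later in this work), the order-m input score can be approximated by repeatedly drawing m peers and averaging the resulting FDH scores; data and function name are hypothetical:

```python
import numpy as np

def order_m_input_efficiency(X, Y, k, m=25, B=2000, seed=0):
    """Monte Carlo sketch of the order-m input efficiency: average, over B
    draws, of the FDH score of DMU k against m peers sampled (with
    replacement) from the units producing at least Y_k."""
    rng = np.random.default_rng(seed)
    peers = np.where(np.all(Y >= Y[k], axis=1))[0]
    # radial contraction needed to match each eligible peer
    ratios = np.max(X[peers] / X[k], axis=1)
    thetas = np.empty(B)
    for b in range(B):
        sample = rng.integers(0, len(peers), size=m)
        thetas[b] = ratios[sample].min()       # FDH score against the m sampled peers
    return thetas.mean()

X = np.array([[2.0], [4.0]])
Y = np.array([[2.0], [2.0]])
# With m = 25 the partial frontier is already very close to the full FDH frontier here.
print(round(order_m_input_efficiency(X, Y, 1), 2))  # 0.5
```

Smaller values of m trim the frontier more aggressively and make the estimate less sensitive to outliers, at the price of a laxer benchmark.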
2.5.2 Quantile Frontier
Aragon et al. [Arag 05] proposed an alternative approach to order-m partial frontiers by
introducing quantile-based partial frontiers. The intention is to replace the concept of
the “discrete” order-m partial frontier by a “continuous” order-α partial frontier, where
α ∈ [0, 1] corresponds to the level of an appropriate nonstandard conditional quantile
frontier. This method is more robust in relation to the effects of outliers. The original
α-quantile approach proposed in [Arag 05] was limited to one-dimensional input for the
input oriented frontier and to one-dimensional output for the output oriented frontier.
Daouia and Simar [Daou 07] developed the α-quantile model for multiple inputs and
outputs. Similar to equation (2.22), the α-quantile input efficiency is defined as:

θα(x, y) = inf{θ | H(θ·x, y) > 1 − α}.    (2.24)
Unit (x, y) consumes, by a ratio α, less than all other units producing output larger than,
or equal to, y and consumes, by a ratio (1 − α), more than the remaining units. If θα(x, y) = 1,
we will say that the unit is input efficient at the level α. Clearly, when α = 1, this is
the same as the Farrell-Debreu input efficiency score, sharing the same properties of the
FDH estimator, but since it does not envelop all the data points, it will be more robust
in relation to extreme and/or outlying observations [Daou 07].
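A small sketch of the empirical order-α input score, read conditionally on the peers producing at least y (following Daouia and Simar's conditional formulation rather than equation (2.24) verbatim); data and function name are hypothetical:

```python
import numpy as np

def theta_alpha(X, Y, x, y, alpha):
    """Order-alpha input efficiency (conditional sketch): among the units
    with Y_j >= y, find the smallest contraction theta such that a share
    greater than (1 - alpha) of them dominates (theta*x, y)."""
    peers = np.all(Y >= y, axis=1)
    # theta at which each peer starts to dominate (theta*x, y)
    t = np.sort(np.max(X[peers] / x, axis=1))
    n_y = len(t)
    idx = int(np.floor((1 - alpha) * n_y))   # first order statistic with share > 1 - alpha
    return t[min(idx, n_y - 1)]

X = np.array([[1.0], [2.0], [3.0], [4.0]])
Y = np.ones((4, 1))
x, y = np.array([4.0]), np.array([1.0])
print(theta_alpha(X, Y, x, y, alpha=1.0))   # 0.25 -- alpha = 1 recovers the FDH score
print(theta_alpha(X, Y, x, y, alpha=0.5))   # 0.75 -- a less extreme, partial benchmark
```

Lowering α moves the benchmark inward from the full frontier, which is what makes the estimator robust to a few extreme observations.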
2.5.3 Practical Frontiers
As we have seen, DEA is very data oriented and it builds the PPS, based on certain
assumptions. DEA does not have any benchmark to rank the efficient units against and,
since its vision is limited to the sampled data, it cannot perceive any potential improve-
ment beyond the already identified efficient DMUs. Moreover, although the fundamental
assumptions hold, on average, in practical cases, there might be exceptions for some
entities due to either managerial or natural restrictions. For example, although we may
assume that any linear combination of DMUs could be realized, we cannot guarantee that
an inefficient DMU, projected to a target, can imitate that production by changing its
inputs/outputs accordingly.
One of DEA’s limitations is associated with its inability to provide any further in-
sight into the DMUs on the frontier. However, there might be a possibility for the
DEA-efficient DMUs to improve and it is important for management to set targets for
their efficient units if the organization is to advance as a whole. Sowlati and Paradi
[Sowl 04] looked at the problem and formed a new practical frontier, by possible changes
in the inputs/outputs of the already efficient DMUs. They introduced a novel linear
programming approach to create those hypothetical DMUs, and formed a new practical
frontier. Other researchers have worked on a practical frontier by introducing weight
restrictions on inputs/outputs to prevent DEA from setting a practically impossible tar-
get on the frontier for an inefficient unit. Volume 73 of the Annals of Operations Research was
dedicated to “extending the frontiers of DEA” and includes various papers on the issue
[Lewi 97]. In our work, we will use numerical methods to generate hypothetical DMUs
to build the practical frontier that dominates the conventional DEA frontier.
2.6 Linkage between Data Envelopment Analysis and
Ratio Analysis
A number of researchers have studied DEA and RA and noted their positive and negative
aspects. While some papers have compared DEA and RA [Cron 02], [Fero 03], others
attempted to combine or relate the two methods [Bowl 04],[Chen 02b, Wu 05, Desp 07,
Chen 07]. In this section we study how the two techniques compare in the DEA context
and how they can be combined for better results.
2.6.1 Comparing DEA and RA
There have been several studies that compare DEA and RA. Below we provide a summary
of those studies. Cronje [Cron 02] compared the use of the DuPont system with
DEA in measuring the profitability of local and foreign-controlled banks in South Africa.
The DuPont system [Gall 03] is an analysis technique to determine what processes the
company does well in and what processes can be improved by focusing on the interrela-
tionship between return on assets, profit margins and asset turnover. The results show
that DEA gives a more accurate classification because it provides a combined comparison
of the performance of the banks with regard to different financial ratios, beyond the three
ratios involved in the DuPont system.
Feroz et al. [Fero 03] tested the null hypothesis that there is no relationship between
DEA and traditional accounting ratios as measures of the performance of a firm. Their
results reject the null hypothesis indicating that DEA can provide information to analysts
that is additional to that provided by traditional ratio analysis. They applied DEA to
the oil and gas industry to demonstrate how financial analysis can employ DEA as a
complement to ratio analysis.
Thanassoulis et al. [Than 96] studied DEA and performance indicators (output to
input ratios) as alternative instruments of performance assessment, using data from the
provision of prenatal care in England. They compared the two aspects of performance
measurement and target settings. As far as performance measures are concerned, in a
typical multi-input multi-output situation, various ratios should be defined. However,
this makes it difficult to gain an overview of the unit’s performance, particularly when
the different ratios of that unit do not agree on the unit’s performance, as is often the
case; yet, selecting only some ratios can bias the assessment. They found that their DEA
scores and individual ratios agree only weakly on unit performance, because DEA reflects
overall efficiency, while each ratio reflects only a specific aspect. On the second aspect, namely,
target setting, DEA identifies input/output levels that would render a unit efficient.
Ratio-based targets may result in unrealistic projections because they are derived with
reference to one input and one output at a certain point in time, regardless of the rest of
the input-output levels. However, the authors believe that ratios could give some useful
guidance for further improvements of the efficient units in DEA.
Finally, Bowlin [Bowl 04] used DEA and ratio analysis to assess the financial health
of companies participating in the Civil Reserve Air Fleet — an important component
of the Department of Defense’s airlift capability — over a 10-year period. He employed
DEA and then tried to explain the observations based on ratio analysis. He believes the
two methods together gave a better insight to the study.
2.6.2 Combining DEA and RA
There are several studies that attempted to combine or relate the two methods. As
proved in [Siga 09], ratio analysis (RA) is the same as the CCR model when the DMUs
have only a single input and a single output. Below we provide a summary of the studies
combining DEA and RA.
Chen and Agha [Chen 02b] characterized the inherent relationships between the DEA
frontier DMUs and output-input ratios. They showed that a DMU with a top-ranked
output-input ratio is a DEA frontier point, so DEA subsumes the premise of RA; RA, however, fails
to identify all types of dominating units, whereas DEA does.
Gonzalez-Bravo [Gonz 07] proposed a Prior-Ratio-Analysis procedure based on the
relationship of individual ratios to DEA efficiency scores. He listed
the efficient units whose efficiencies are overestimated by DEA because they perform
highly in a single dimension, and the inefficient units whose efficiencies are underesti-
mated because they perform reasonably well in all the considered dimensions, but do not
stand out in any of them.
Being motivated by Chen and Agha’s [Chen 02b] paper, Wu and his colleagues
[Wu 05] proposed an aggregated ratio analysis model in DEA. This ratio model has
been proven to be equivalent to the CCR model. However, we believe the inclusion of all
possible ratios in the model does not necessarily make sense, and the number of possible
ratios grows exponentially as the number of inputs and outputs increases. To illustrate
this, consider a three-input, three-output case. We will have (2^3 − 1) = 7 aggregated inputs
and (2^3 − 1) = 7 aggregated outputs, so the model optimizes 7 × 7 = 49 aggregated ratios,
some of which do not represent a meaningful concept, e.g. the ratio of cured people
in the ICU to the number of children admitted to the ER. The authors have also proven that
a subset of all the possible aggregated ratios is equivalent to the CCR model and,
in our example, the number of variables decreases significantly, which is a substantial
improvement. However, the necessity of including unrelated ratios still remains unaddressed.
To deal with such meaningless ratios, Despic et al. [Desp 07] proposed a DEA-R
efficiency model, in which all possible ratios (output/input) are considered as outputs.
This model enables the analyst to easily translate some of the expert opinions into weight
restrictions, in terms of ratios, thereby creating an immediate communication between
the experts and the model.
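The exponential growth in the number of candidate aggregated ratios noted above is easy to verify by enumerating the non-empty input and output subsets for the three-input, three-output example:

```python
from itertools import combinations

m, s = 3, 3  # three inputs, three outputs
# every non-empty subset of inputs/outputs defines one aggregated variable
input_sets = [c for r in range(1, m + 1) for c in combinations(range(m), r)]
output_sets = [c for r in range(1, s + 1) for c in combinations(range(s), r)]
# (2^m - 1) aggregated inputs, (2^s - 1) aggregated outputs
print(len(input_sets), len(output_sets), len(input_sets) * len(output_sets))  # 7 7 49
```

In general the model would optimize (2^m − 1) × (2^s − 1) aggregated ratios, which is why restricting attention to a meaningful subset matters.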
In another interesting study, Chen and McGinnis [Chen 07] showed there is a bridge
between ratio efficiency and technical efficiency. Therefore, RA(m,s,k), where m is one
of the inputs from input X, s is one of the outputs from output Y, and k represents
the specific DMU, is a product of seven different component measurements. They
are technical efficiency, technical change, scale efficiency, input slack factor, input sub-
stitution factor, output slack factor and output substitution factor. Technical efficiency,
technical change and scale efficiency are DMU dependent only, i.e. for DMUk, they will
be the same, no matter what input m and output s are selected for RA(m,s,k). Input
slack factors and input substitution factors are DMU and input dependent. For a par-
ticular RA(m,s,k), they depend on the selection of input m and DMUk but not output
s. However, output slack and substitution factors are functions of DMUk, input m and
output s. This relationship provides a basis for concluding that the conventional partial
productivity metric is not a proper performance index for system benchmarking. This is
because it depends on other effects, in addition to the system-based technical efficiency
between a given DMU and a “benchmark” DMU. Furthermore, RA(m,s,k) is the product
of technical efficiency and the other six factors which are all less than, or equal to, one.
Therefore, RA(m,s,k) being close to one indicates that all seven factors should be close to
one, and, in fact, larger than RA(m,s,k). This property partly explains why the DMUs
with the largest output-input ratio will be technically efficient when their RA equals one
[Chen 02a].
Other researchers have started to use financial ratios as inputs and outputs in
DEA, with the expectation of getting the best out of them. In a Magyar Nemzeti
Bank Working Paper, Hollo and Nagy [Holl 06] employed ratios in their production model
to assess 2459 banks in the European Union. Hollingsworth and Smith, for the first time,
pointed out the inaccuracy of the CCR model [Holl 03], when data is in the form of
ratios. Then Emrouznejad et al. [Emro 09] examined the problem of ratios in more
detail and proposed a series of modified DEA models. We focused on the full efficiency
and complemented their model by adding a second phase and avoided nonlinearity with
algebraic transformations [Siga 09]. In this thesis we develop models suitable to employ
ratio variables in DEA and offer solutions to operationalize the developed concepts.
Chapter 3
Literature review of non-oriented models
As discussed in the introduction, this chapter forms the first part of the literature review
required for our work. It has been essential for us to examine the non-oriented models
in the literature and see if they can be altered in any way to take in ratio variables.
In addition, to make our model comparable, we wanted to understand the way
others have operationalized their models and the attributes they offer. This chapter
discusses the existing non-oriented DEA models and concepts, their realization methods
and characteristics.
“Non-oriented models” is a general term for the DEA models that measure
efficiency by simultaneously decreasing inputs and increasing outputs. In the
literature of DEA, there are only a small number of non-oriented models, with different
applications. Non-oriented models differ from each other in how the distance to the best
practice is calculated, how much importance (weight) is put on every input or output,
and how the final score is interpreted. In this chapter, we review all the non-oriented
models in the literature and examine their properties such as units and translation in-
variance, the efficiency score bounds, and their computational complexity. This is the
starting point to understand the field and a guide for us in designing our proposed model,
knowing what characteristics we need to include and how our model stands out in relation
to other models.
3.1 Russell Graph Efficiency Model
Fare et al. [Fare 85] built upon their earlier models, the input and output Russell
measures of technical efficiency, to combine both. Their first idea was presented in 1978
[Fare 78], in which they extended Farrell's [Farr 57] one-input one-output efficiency
measure to the multiple-input case. Their model overcomes the four shortcomings of Farrell's
input measure of efficiency, the two important ones being: a) the score is one if, and
only if, the input set is technically efficient; and b) it is monotonic, so an increase in input
should inversely affect the efficiency score.
They operationalized their concept by converting the radial measure to a non-radial
measure. L(y) consists of the input vectors X that can produce at least y. For the
input-oriented problem and one-output case, the efficiency measures defined by Fare and
Lovell [Fare 78] and by Farrell are given by the following, where the strictly positive
elements of input x are indexed from 1 to l and the rest are zero:

\min \Big\{ \frac{\sum_{i=1}^{l} \theta_i}{l} \;\Big|\; (\theta_1 X_1, \theta_2 X_2, \ldots, \theta_l X_l, 0, \ldots, 0) \in L(y) \Big\},

\min \, \{ \lambda \mid \lambda X \in L(y) \}.
Zero elements do not come into play in efficiency calculations. It is clear that in the Fare and
Lovell model, slacks on the non-zero inputs are eliminated as well. They later expanded this
to a multi-output case [Fare 83b]. In a similar fashion, they defined the Russell output
technical efficiency [Fare 85] as

\max \Big\{ \frac{\sum_{i=1}^{o} \phi_i}{o} \;\Big|\; (\phi_1 Y_1, \phi_2 Y_2, \ldots, \phi_o Y_o, 0, \ldots, 0) \in L(x), \ \phi_i \geq 1 \Big\},

where L(x) is the set of output vectors Y that can be produced using at most x. Combining the
input and output efficiencies, they produced the input-output, or, as they called it,
“graph” efficiency measure [Fare 85].
The Russell graph efficiency considers both input and output simultaneously and sup-
posedly the unit is efficient if, and only if, R = 1. However, this property is questioned,
as we discuss later. The downside is that R < 1 does not convey a readily meaningful
message to management, and it is non-linear. Recently, Levkoff et al. [Levk 12] pointed
out the model’s failure to distinguish between efficient and inefficient units at the bound-
ary of the output space. In addition, it does not satisfy weak monotonicity at inefficient
units on the boundary (an increase in any output lowers the efficiency score). This tends
to create problems with zero values, in some outputs.
The way to compute the Russell graph efficiency is through the following simplified
nonlinear programming approach:

\min R = \frac{\sum_{i=1}^{l} \theta_i + \sum_{i=1}^{o} \frac{1}{\phi_i}}{l + o}, \quad l \in \{1, \ldots, m\}, \ o \in \{1, \ldots, s\}  (3.1a)

s.t. \quad \sum_{j=1}^{n} \lambda_j x_{ij} \leq \theta_i x_{ik} \quad \forall i = 1, \ldots, l  (3.1b)

\sum_{j=1}^{n} \lambda_j y_{ij} \geq \phi_i y_{ik} \quad \forall i = 1, \ldots, o  (3.1c)

\lambda_j \geq 0, \quad 0 \leq \theta_i \leq 1, \quad \phi_i \geq 1.  (3.1d)
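Program (3.1) is small enough to hand to a general-purpose NLP solver directly. The following is a minimal sketch (not a tuned implementation) using SciPy's SLSQP on hypothetical one-input, one-output data, assuming strictly positive data so that l = m and o = s:

```python
import numpy as np
from scipy.optimize import minimize

def russell_graph_score(X, Y, k):
    """Solve (3.1) for DMU k. Decision vector z = [theta (m), phi (s), lambda (n)].
    Assumes strictly positive data, so l = m and o = s."""
    n, m = X.shape
    s = Y.shape[1]

    def objective(z):
        theta, phi = z[:m], z[m:m + s]
        return (theta.sum() + (1.0 / phi).sum()) / (m + s)

    cons = []
    for i in range(m):   # sum_j lambda_j x_ij <= theta_i x_ik
        cons.append({"type": "ineq",
                     "fun": lambda z, i=i: z[i] * X[k, i] - z[m + s:] @ X[:, i]})
    for i in range(s):   # sum_j lambda_j y_ij >= phi_i y_ik
        cons.append({"type": "ineq",
                     "fun": lambda z, i=i: z[m + s:] @ Y[:, i] - z[m + i] * Y[k, i]})
    bounds = [(0, 1)] * m + [(1, None)] * s + [(0, None)] * n
    z0 = np.concatenate([np.ones(m), np.ones(s), np.eye(n)[k]])  # feasible start
    res = minimize(objective, z0, bounds=bounds, constraints=cons, method="SLSQP")
    return res.fun

X = np.array([[2.0], [4.0]])   # hypothetical inputs, one row per DMU
Y = np.array([[2.0], [2.0]])   # hypothetical outputs
print(round(russell_graph_score(X, Y, 0), 3))  # DMU 0 is on the frontier: 1.0
print(round(russell_graph_score(X, Y, 1), 3))  # DMU 1: sqrt(2)/2, about 0.707
```

Note that for DMU 1 the optimum is not θ = 1/2, φ = 1: the solver trades a larger θ for a larger φ, landing at θ = φ/2 with φ = √2, which is exactly the kind of interior solution that makes the model's score hard to communicate to management.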
It is not clear to us why the two input and output Russell measure objectives were not
added to each other, and averaged as a whole. It is worth mentioning that instead of
having an arithmetic mean of input contractions and an inverse harmonic mean of output
expansions, the objective is now an unweighted mean of aggregated input contractions
and inverse output expansions for positive input and outputs only. Zero inputs and
outputs do not come into play. Although not considered by the authors, with a simple
investigation, we can prove that the formulation is units invariant but not translation
invariant. To overcome the computational difficulty, whenever the goal is merely to group
the units into two Russell efficient and inefficient categories, and data is strictly positive,
Cooper et al. [Coop 99a] devised the following model named MIP (measure of inefficiency
proportions) and proved a unit is efficient in the Russell model, if it is MIP efficient. The
optimization to be solved for MIP is given by:

\max_{\lambda, s_i^{\pm}} \ \sum_{i=1}^{m} \frac{s_{ik}^{-}}{x_{ik}} + \sum_{i=1}^{s} \frac{s_{ik}^{+}}{y_{ik}},  (3.2a)

s.t. \quad \sum_{j=1}^{n} \lambda_j x_{ij} - x_{ik} + s_i^{-} = 0 \quad i = 1, \ldots, m  (3.2b)

\sum_{j=1}^{n} \lambda_j y_{ij} - y_{ik} - s_i^{+} = 0 \quad i = 1, \ldots, s  (3.2c)

\lambda_j, \ s_{ik}^{\pm} \geq 0.  (3.2d)
Moreover, they have shown that if optimum output slacks to MIP equal zero, then the
two optimal objectives will be equal; of course, this does not hold for zero input slacks.
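Since MIP is an ordinary LP, it is straightforward to set up with an off-the-shelf solver. A sketch with scipy.optimize.linprog on hypothetical one-input, one-output data (the data and function name are illustrative, not from the thesis):

```python
import numpy as np
from scipy.optimize import linprog

X = np.array([[2.0], [4.0]])   # hypothetical inputs, one row per DMU
Y = np.array([[2.0], [2.0]])   # hypothetical outputs

def mip_inefficiency(X, Y, k):
    """Solve the MIP model (3.2) for DMU k: maximize slacks scaled by
    DMU k's own inputs/outputs. A zero optimum means DMU k is MIP
    (and hence Russell) efficient."""
    n, m = X.shape
    s = Y.shape[1]
    # variables: [lambda (n), s_minus (m), s_plus (s)]; linprog minimizes
    c = np.concatenate([np.zeros(n), -1.0 / X[k], -1.0 / Y[k]])
    A_eq = np.zeros((m + s, n + m + s))
    A_eq[:m, :n] = X.T
    A_eq[:m, n:n + m] = np.eye(m)       # sum_j lambda_j x_ij + s_i^- = x_ik
    A_eq[m:, :n] = Y.T
    A_eq[m:, n + m:] = -np.eye(s)       # sum_j lambda_j y_ij - s_i^+ = y_ik
    b_eq = np.concatenate([X[k], Y[k]])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return -res.fun

# DMU 0: objective 0 -> MIP efficient; DMU 1: objective 1.0 -> inefficient.
for k in range(2):
    print(k, mip_inefficiency(X, Y, k))
```

Only the right-hand side and the objective scaling change from DMU to DMU, so the constraint matrix can be built once and reused.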
Cooper, Park and Pastor also provided an algebraic approximation for the Russell
measure under special circumstances: positive values and \frac{s_{ik}^{+}}{y_{ik}} < 1. They first transformed
the Russell measure through algebraic manipulation and then used algebra again to approximate
the nonlinear objective by a linear version and guarantee it will be between zero
and one. Their attempt to develop a routine to resolve the nonlinearity caused by the sum of
fractions by algebraic approximation is worth further attention and work, but has not
been investigated to date, to our knowledge. The approximation for Russell is linear and
they have proved the following:

\frac{\sum_{i=1}^{m} \theta_i^{*} + \sum_{i=1}^{s} \frac{1}{\phi_i^{*}}}{m+s} \approx 1 - \frac{\sum_{i=1}^{m} \frac{s_{ik}^{-*}}{x_{ik}} + \sum_{i=1}^{s} \frac{s_{ik}^{+*}}{y_{ik}}}{m+s}.

The optimal values of the two differ by less than \frac{\sum_{i=1}^{s} \left( \frac{s_{ik}^{+*}}{y_{ik}} \right)^{2}}{m+s}. It is also worth mentioning that Ruggiero et
al. [Rugg 98] produced a weighted Russell measure for the one sided case. The intention
is to give priority to a few variables preferred by management. The outcome of the model
depends on the right choice of weights and if the relative weights are biased, distortions
might be introduced. For the one-output case, ordinary least square regression can be
used to choose weights, while for the multi-output case, the canonical regression analysis
is the optimum method. For the two sided case, there is no suggestion on how to choose
weights.
3.2 Refined Russell Graph Efficiency Model
As briefly mentioned above, Levkoff et al. [Levk 12] established that the Russell graph
measure does not behave as expected at the output boundary, and, in particular, once
the output has zero elements, the inefficient unit might be classified as efficient with score
one, while increasing those output levels from zero will decrease the efficiency score. This
is because it brings the once ignored output into play. Then, for two outputs, the one with
the lower level of output (zero) has a higher efficiency score. This can be shown using a
simple example of one-input two-outputs case; please see [Levk 12]. They have proposed
the following to rectify the problem: instead of excluding yi = 0 from the formulation,
the zero elements of outputs are excluded only if any increase in that element takes the
unit outside the PPS, assuming that the PPS is defined by technology, T . They defined
an indicator function to differentiate between efficient and inefficient outputs with zero
values (on the frontier) as follows:
ψj(x, y, T ) = 1 if yj ≥ 0 oryj = 0∧ < x, (y1, ..yj + ε+ ys >∈ T for some ε ≥ 0
ψj(x, y, T ) = 0 if yj = 0∧ < x, (y1, ..yj + ε+ ys >/∈ T ∀ε
δ(xi) = 0 if xi = 0 and δ(xi) = 1 if xi ≥ 0.
The modified Russell measure is then defined as

\inf_{\theta, \phi} \left\{ \frac{\sum_i \delta(x_i)\, \theta_i + \sum_j \psi_j(x, y, T)\, \phi_j^{-1}}{\sum_i \delta(x_i) + \sum_j \psi_j(x, y, T)} \;\middle|\; (x \theta, y \phi) \in T \right\}.
Computationally formulating the above is not easy. The indicator functions ψj depend on
infinitesimal comparisons with ε, the non-Archimedean element, which is assumed
to be smaller than any positive number and to remain so even if it is multiplied by a
large number. The constraint set is not closed and the minimum does not always exist.
The authors suggested that if the technology is known, zero outputs can be replaced with ε and
ψj calculated in the same fashion as before; if the technology is known and convex, then
shadow prices can help. For zero output elements, if the shadow price is positive for any
shadow price vector supporting ⟨x, y⟩, then ψj = 0; otherwise, ψj = 1. However,
most of the time the technology is not known and will be estimated using the data points
available, as is typical with DEA. The difficulty still lies in getting ψj right. The authors
suggest calculations in three steps. In step 1, for each selected DMU, the zero outputs
are examined to see if they are efficient or not; this is done by replacing zeros in
the output with a small ε and solving the following:
\min R = \sum_{i=1}^{l} \theta_i + \sum_{i=1}^{s} \phi_i, \quad l \in \{1, \ldots, m\},  (3.3a)

s.t. \quad \sum_{j=1}^{n} \lambda_j x_{ij} \leq \theta_i x_{ik} \quad \forall i = 1, \ldots, l  (3.3b)

\sum_{j=1}^{n} \lambda_j y_{ij}^{\varepsilon} \geq \frac{y_{ik}^{\varepsilon}}{\phi_i} \quad \forall i = 1, \ldots, s  (3.3c)

\lambda_j \geq 0, \quad 0 \leq \theta_i \leq 1, \quad 0 \leq \phi_i \leq 1.  (3.3d)
We take note of Set = \{ i \mid y_i = 0 \wedge \phi_i^{*} \leq 1 \}, as these are the output elements belonging to the
inefficient production vector. Step 2 involves finding the minimum for the numerator of
the modified function, by setting φi = 0 for zero outputs that are inefficient and solving
the following non-linear optimization:
\min R = \sum_{i=1}^{l} \theta_i + \sum_{i \notin Set} \phi_i, \quad l \in \{1, \ldots, m\},  (3.4a)

s.t. \quad \sum_{j=1}^{n} \lambda_j x_{ij} \leq \theta_i x_{ik} \quad \forall i = 1, \ldots, l  (3.4b)

\sum_{j=1}^{n} \lambda_j y_{ij}^{\varepsilon} \geq \frac{y_{ik}^{\varepsilon}}{\phi_i} \quad \forall i \notin Set  (3.4c)

\lambda_j \geq 0, \quad 0 \leq \theta_i \leq 1, \quad 0 \leq \phi_i \leq 1.  (3.4d)

And finally, step 3 involves dividing the above objective function by l + o + |Set|.
The above procedure is complicated since two steps involve nonlinear programming
and at each step, outputs with zero values have to be replaced with infinitesimal values.
3.3 Multiplicative Model (log measure)
In this model, instead of the “summation” used in the CCR model to build the virtual
outputs/inputs, “multiplication” is used [Char 82]. The initial formulation looks like:

\max_{\mu, \nu} \ \frac{\prod_{i=1}^{s} y_{i0}^{\mu_i}}{\prod_{i=1}^{m} x_{i0}^{\nu_i}},  (3.5a)

s.t. \quad \frac{\prod_{i=1}^{s} y_{ij}^{\mu_i}}{\prod_{i=1}^{m} x_{ij}^{\nu_i}} \leq 1 \quad \forall j = 1, \ldots, n  (3.5b)

\mu_i, \nu_i \geq 1 \ \forall i.  (3.5c)
Taking the logarithm of the above, we will get

\max_{\mu_i, \nu_i} \ \sum_{i=1}^{s} \mu_i \log y_{i0} - \sum_{i=1}^{m} \nu_i \log x_{i0},  (3.6a)

s.t. \quad \sum_{i=1}^{s} \mu_i \log y_{ij} - \sum_{i=1}^{m} \nu_i \log x_{ij} \leq 0 \quad j = 1, \ldots, n  (3.6b)

\mu_i, \nu_i \geq 1.  (3.6c)
And the dual is

\max_{\lambda, s_i^{\pm}} \ \sum_{i=1}^{m} s_i^{-} + \sum_{i=1}^{s} s_i^{+},  (3.7a)

s.t. \quad \sum_{j=1}^{n} \lambda_j \log x_{ij} - \log x_{ik} + s_i^{-} = 0 \quad i = 1, \ldots, m  (3.7b)

\sum_{j=1}^{n} \lambda_j \log y_{ij} - \log y_{ik} - s_i^{+} = 0 \quad i = 1, \ldots, s  (3.7c)

\lambda_j, \ s_i^{-}, \ s_i^{+} \geq 0,  (3.7d)
which is exactly like the additive model for log inputs and log outputs, whereby it mea-
sures log efficiency. Note that a DMU is efficient if, and only if, it has a log-efficiency
of zero. It can be proven that if DMUo appears in the optimal basic solution of (3.7a),
then DMUo is efficient. This is the same as what we had in the case of CCR where the
reference set is all efficient. The easiest way to solve (3.7a) is to continue to use the dual
log measure. For every DMU, only the right-hand side of the constraints will change.
In terms of computation, there is no need to solve the LP (3.7) for every observed DMU:
each time a DMU appears in an optimal basis it is recorded as efficient, so the LP
only needs to be solved for the DMUs not recorded as efficient so far. The model
is neither units nor translation invariant, and an efficiency score below zero does not
convey any meaningful message other than “inefficient”. The score is non-positive and
not bounded from below. This method is only useful to pinpoint efficient units (the frontier)
rather than estimating the efficiency score of inefficient ones.
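Operationally, the dual (3.7) is just the additive model run on logged data: a unit is on the multiplicative frontier exactly when its optimal slack sum is zero. A sketch with scipy.optimize.linprog on hypothetical, strictly positive data (as the model requires):

```python
import numpy as np
from scipy.optimize import linprog

def log_efficiency(X, Y, k):
    """Dual multiplicative model (3.7): the additive model applied to
    log inputs/outputs. Objective 0 means DMU k lies on the
    multiplicative frontier. Data must be strictly positive."""
    lX, lY = np.log(X), np.log(Y)
    n, m = X.shape
    s = Y.shape[1]
    # variables: [lambda (n), s_minus (m), s_plus (s)]; maximize slack sum
    c = np.concatenate([np.zeros(n), -np.ones(m + s)])
    A_eq = np.zeros((m + s, n + m + s))
    A_eq[:m, :n] = lX.T
    A_eq[:m, n:n + m] = np.eye(m)
    A_eq[m:, :n] = lY.T
    A_eq[m:, n + m:] = -np.eye(s)
    b_eq = np.concatenate([lX[k], lY[k]])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return -res.fun  # log-inefficiency; 0 means efficient

X = np.array([[2.0], [4.0]])   # hypothetical data
Y = np.array([[2.0], [2.0]])
print(log_efficiency(X, Y, 0))  # DMU 0: 0 -> on the frontier
print(log_efficiency(X, Y, 1))  # DMU 1: log 2, about 0.693 -> inefficient
```

The returned quantity is a log-inefficiency, which is why, as noted above, its magnitude carries no natural relative-efficiency interpretation.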
3.4 Invariant Multiplicative Model
In this model, Charnes et al. [Char 83] built upon their log measure model and enhanced
it to be units invariant. This is done by including a virtual input and a virtual output
element, each equal to e, for every DMU [Char 83]. The exponents η, ξ are
used for this virtual input and output, respectively, while intensity variables µi, νi are
used for real inputs and outputs. So the formulation becomes:
\max_{\mu, \nu, \eta, \xi} \ \frac{e^{\eta} \prod_{i=1}^{s} y_{i0}^{\mu_i}}{e^{\xi} \prod_{i=1}^{m} x_{i0}^{\nu_i}},  (3.8a)

s.t. \quad \frac{e^{\eta} \prod_{i=1}^{s} y_{ij}^{\mu_i}}{e^{\xi} \prod_{i=1}^{m} x_{ij}^{\nu_i}} \leq 1 \quad \forall j = 1, \ldots, n  (3.8b)

\eta, \xi \geq 0, \quad \mu_i, \nu_i \geq \delta \ \forall i, \quad \delta > 0.  (3.8c)
Scaling outputs by a_i and inputs by b_i, where a_i, b_i > 0, will transform the above to the
following:

\max_{\mu, \nu, \eta, \xi} \ \frac{e^{\eta} \prod_{i=1}^{s} a_i^{\mu_i} \prod_{i=1}^{s} y_{i0}^{\mu_i}}{e^{\xi} \prod_{i=1}^{m} b_i^{\nu_i} \prod_{i=1}^{m} x_{i0}^{\nu_i}},  (3.9a)

s.t. \quad \frac{e^{\eta} \prod_{i=1}^{s} a_i^{\mu_i} \prod_{i=1}^{s} y_{ij}^{\mu_i}}{e^{\xi} \prod_{i=1}^{m} b_i^{\nu_i} \prod_{i=1}^{m} x_{ij}^{\nu_i}} \leq 1 \quad \forall j = 1, \ldots, n  (3.9b)

\xi, \eta \geq 0, \quad \mu_i, \nu_i \geq \delta \ \forall i.  (3.9c)
Given an optimal solution to (3.8), a feasible solution to (3.9) can be formed with
the same objective value, so (3.9a) ≥ (3.8a). Similarly, a feasible solution to (3.8) can be
constructed from the optimal solution to (3.9) with the same objective value, which implies
(3.8a) ≥ (3.9a). Hence the optimal objective scores have to be equal, and therefore the
efficiency value is invariant under a change of units; however, keep in mind that the optimal
values of the variables will not necessarily be the same. After taking the log and putting
this into compact form, we arrive at the following, where the hat sign on an input/output
denotes its logarithm:

\max \ \eta - \xi + \mu^{T} \hat{Y}_0 - \nu^{T} \hat{X}_0,  (3.10a)

s.t. \quad \eta e^{T} - \xi e^{T} + \mu^{T} \hat{Y} - \nu^{T} \hat{X} \leq 0, \quad \xi, \eta \geq 0, \quad \mu^{T}, \nu^{T} \geq \delta e^{T}, \quad \delta > 0.  (3.10b)
Taking the dual will result in:

\max_{\lambda, s_i^{\pm}} \ \delta \sum_{i=1}^{m} s_i^{-} + \delta \sum_{i=1}^{s} s_i^{+},  (3.11a)

s.t. \quad \sum_{j=1}^{n} \lambda_j - \theta^{+} = 1  (3.11b)

\sum_{j=1}^{n} \lambda_j + \theta^{-} = 1  (3.11c)

\sum_{j=1}^{n} \lambda_j \hat{x}_{ij} - \hat{x}_{ik} + s_i^{-} = 0 \quad i = 1, \ldots, m  (3.11d)

\sum_{j=1}^{n} \lambda_j \hat{y}_{ij} - \hat{y}_{ik} - s_i^{+} = 0 \quad i = 1, \ldots, s  (3.11e)

\lambda_j, \ \theta^{+}, \ \theta^{-}, \ s_i^{-}, \ s_i^{+} \geq 0.  (3.11f)
Subtracting the first of these two equations from the second results in θ+ + θ− = 0, and
because the two are non-negative, both must be zero. As a result the dual reduces to:
\max_{\lambda, s_i^{\pm}} \ \delta \sum_{i=1}^{m} s_i^{-} + \delta \sum_{i=1}^{s} s_i^{+},  (3.12a)

s.t. \quad \sum_{j=1}^{n} \lambda_j \hat{x}_{ij} - \hat{x}_{ik} + s_i^{-} = 0 \quad i = 1, \ldots, m  (3.12b)

\sum_{j=1}^{n} \lambda_j \hat{y}_{ij} - \hat{y}_{ik} - s_i^{+} = 0 \quad i = 1, \ldots, s  (3.12c)

\sum_{j=1}^{n} \lambda_j = 1  (3.12d)

\lambda_j, \ s_i^{-}, \ s_i^{+} \geq 0.  (3.12e)
Therefore, in the log domain, the log input and output of DMUo are enveloped by the
convex combinations of log inputs and outputs. In the log domain and from a DEA
perspective, (3.12) is based on a variable returns to scale technology, as opposed to the
initial constant returns to scale model (3.7). Having the optimal solution to (3.12), we can
write Y_0 = e^{-s^{+*}} \prod_{j=1}^{n} Y_j^{\lambda_j^{*}} and, because \sum_{j=1}^{n} \lambda_j = 1, Y_0 is a Cobb-Douglas function. The
same can be said for X_0. It is useful to explain what a Cobb-Douglas function is. In
economics, the Cobb-Douglas production function usually represents the relationship
between two or more inputs producing one overall output. The function is of the form
Y = A L^{\beta} K^{\alpha}, where Y, for example, is the real value of all goods produced, L is labour
input, K is capital input, A is productivity, and α and β are output elasticities, which
are constant and defined by technology. They tell us how the output will be affected if, say,
capital usage increases by one percent. If α + β = 1, then there are constant returns
to scale, meaning that doubling all the inputs will double the output; if the sum is
less than one, there are decreasing returns to scale; and if α + β is greater than one,
there are increasing returns to scale. For the case of α + β = 1, α and β show the input
shares of output. Because only positive λ_j's come into play in Y_0 = e^{-s^{+*}} \prod_{j=1}^{n} Y_j^{\lambda_j^{*}}, we
can say a DMU_j with λ_j > 0 is efficient and is part of the frontier (efficient facet) for DMU_0.
This represents a new method for piecewise estimation of the Cobb-Douglas production
function directly from empirical data. Very recently, Cook and Zhu [Cook 14] have built
a model based on the invariant multiplicative DEA model, which enables the ranking of
units or, as they call it, cross-efficiency. Ranking in DEA is often the subject of criticism
because it depends on an optimal set of weights, which is not unique. The authors have,
however, proved that using the above model for calculating cross-efficiency will lead to a
unique score, without the need to impose secondary goals. Moreover, it is linear. Since
ranking is not in the scope of this work, we do not cover the details here; this was just
mentioned to show one of the capabilities of this model.
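The returns-to-scale arithmetic of the Cobb-Douglas form is easy to check numerically; the parameter values below are illustrative only:

```python
def cobb_douglas(L, K, A=1.0, alpha=0.3, beta=0.7):
    """Y = A * L^beta * K^alpha. With alpha + beta = 1 this exhibits constant
    returns to scale: scaling both inputs by t scales output by t."""
    return A * (L ** beta) * (K ** alpha)

base = cobb_douglas(10.0, 5.0)
doubled = cobb_douglas(20.0, 10.0)
print(round(doubled / base, 9))  # 2.0 under constant returns to scale

# With alpha + beta = 1.2 (increasing returns), doubling all inputs
# multiplies output by 2**1.2, i.e. more than doubles it:
irs = (cobb_douglas(20.0, 10.0, alpha=0.5, beta=0.7)
       / cobb_douglas(10.0, 5.0, alpha=0.5, beta=0.7))
print(irs > 2.0)  # True
```

The same check applied to the estimated form above would use the optimal λ_j* as the exponents.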
3.5 Pareto efficiency test model (Additive)
The Pareto efficiency test model, which was later labeled as additive, was developed by
Charnes et al. [Char 85]. Pareto efficiency holds when no element of output can be made bigger
without producing less of another; or, from the consumption point of view, when decreasing
an input is not possible without increasing another. Nonzero slacks are identified
as the source of inefficiency. It is worth mentioning that the meaning or amount of a loss
or gain is not considered. One unit of loss in output 1 might result in three units of
gain in output 2, which, although it might be preferable overall, is not considered a Pareto
improvement.
Given the empirical points and observed units, the empirical production set (EPS) is
defined as the convex hull of observed data, and it is extended to the empirical production
possibility set (EPPS) with inputs from the production set and outputs not greater than
those in the production set (disposability of outputs). Let EPPS’ be the set corresponding
to EPPS. A frontier function is defined as f(x) = \max \{ y \mid (x, y) \in \text{EPPS}' \}. It is proven
that f(x) is concave and piecewise linear on EPS. The Pareto-efficient empirical frontier
function is determined by first pinpointing the Pareto-efficient units among the n observations.
Then the function is defined on the convex hull of the inputs by the convex combination of
the outputs. The authors had shown before that the necessary and sufficient condition for a
point x^* to be Pareto efficient is that it is an optimal solution to \min \sum_k g_k(x)
subject to g_k(x) \leq g_k(x^*) \ \forall k, where g_k(x) is a function representing our objectives. Here,
our goal is to achieve technical efficiency and to maximize outputs and minimize inputs,
as given by the following optimization problem.
\min_{\lambda, s_i^{\pm}} \ \sum_{i=1}^{m} \sum_{j=1}^{n} \lambda_j x_{ij} - \sum_{i=1}^{s} \sum_{j=1}^{n} \lambda_j y_{ij},  (3.13a)

s.t. \quad \sum_{j=1}^{n} \lambda_j x_{ij} - x_{ik} + s_i^{-} = 0 \quad i = 1, \ldots, m  (3.13b)

\sum_{j=1}^{n} \lambda_j y_{ij} - y_{ik} - s_i^{+} = 0 \quad i = 1, \ldots, s  (3.13c)

\sum_{j=1}^{n} \lambda_j = 1  (3.13d)

\lambda_j \geq 0.  (3.13e)
Because the solution will not change if we add constants to the objective, they have
cleverly rewritten the above in the following form, thereby giving birth to the additive
model. The intention behind this formulation is mainly to obtain a test for unit k:
if the optimal objective is zero, then unit k is optimal, and thus a Pareto-efficient point. The
authors at this point were not concerned with relative efficiency and scores; rather, they
just wanted to identify efficient points [Char 85].
\min_{\lambda, s_i^{\pm}} \ -\sum_{i=1}^{m} s_i^{-} - \sum_{i=1}^{s} s_i^{+}  (3.14a)

s.t. \quad \sum_{j=1}^{n} \lambda_j x_{ij} - x_{ik} + s_i^{-} = 0 \quad i = 1, \ldots, m  (3.14b)

\sum_{j=1}^{n} \lambda_j y_{ij} - y_{ik} - s_i^{+} = 0 \quad i = 1, \ldots, s  (3.14c)

\sum_{j=1}^{n} \lambda_j = 1  (3.14d)

\lambda_j \geq 0  (3.14e)
The linear program in (3.14) maximizes the L1 distance from (x_k, y_k) to a point in the convex hull
of the observations. In the basic additive model, units with zero slacks are the
efficient ones, and inefficiencies are measured in terms of the summation of all slacks.
Since inputs/outputs have different scales, trying to merely maximize the size of the
slacks, regardless of the percentage of change required or the value of that variable,
might result in unwise targets. For example, a unit wasting 500 ml of water and 0.5
grams of gold in making an alloy will be directed to save the 500 ml of water, simply
because 500 > 0.5 and the goal is to shrink the waste in size.
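This scale sensitivity is easy to reproduce. Below is a sketch of the additive model (3.14) using scipy.optimize.linprog, with a hypothetical two-DMU, water-and-gold data set; the optimal objective is dominated almost entirely by the water slack:

```python
import numpy as np
from scipy.optimize import linprog

def additive_slacks(X, Y, k):
    """Additive model (3.14): maximize the total slack of DMU k over the
    convex hull of observations. Returns (objective, input slacks, output slacks)."""
    n, m = X.shape
    s = Y.shape[1]
    # variables: [lambda (n), s_minus (m), s_plus (s)]; linprog minimizes
    c = np.concatenate([np.zeros(n), -np.ones(m + s)])
    A_eq = np.zeros((m + s + 1, n + m + s))
    A_eq[:m, :n] = X.T
    A_eq[:m, n:n + m] = np.eye(m)        # sum_j lambda_j x_ij + s^- = x_k
    A_eq[m:m + s, :n] = Y.T
    A_eq[m:m + s, n + m:] = -np.eye(s)   # sum_j lambda_j y_ij - s^+ = y_k
    A_eq[m + s, :n] = 1.0                # convexity: sum_j lambda_j = 1
    b_eq = np.concatenate([X[k], Y[k], [1.0]])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return -res.fun, res.x[n:n + m], res.x[n + m:]

# DMU 0 wastes 500 ml of water and 0.5 g of gold relative to DMU 1.
X = np.array([[1000.0, 1.0], [500.0, 0.5]])   # water (ml), gold (g)
Y = np.array([[1.0], [1.0]])                  # one unit of alloy each
obj, s_minus, s_plus = additive_slacks(X, Y, 0)
print(obj)  # 500.5: the unweighted sum is driven almost entirely by water
```

Expressing water in litres instead of millilitres would change the objective a thousandfold, which is precisely the units-invariance failure discussed next.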
The additive model is not units invariant. To achieve a units invariant measure,
the authors modified the objective by dividing each slack by the corresponding input or
output of unit k. They also used a scalar, δ, to map the
objective onto a desired range; they suggested, for instance, that \delta = 10 \cdot \frac{1}{m+s} will make
the objective lie between zero and −10. However, this is not right, and the range would
be between zero and -\sum_i (y_i^{\max} - y_i^{\min}). Although for the one-output case the Pareto-efficient
empirical frontier function is isotonic, this is not the case for multiple outputs. This
means that the frontier function is not monotonically increasing. Thus, moving on the
frontier, the one with more input might not have strictly more output; rather, it could
have fewer of some outputs and more of others and still be Pareto efficient. However, we
can always find a cone of directions in the output space on which the output projections
are isotonic. As already mentioned, in the formulation (3.14) the production possibility
set was based on convexity and disposability of outputs. If one chooses to add other
assumptions like disposability of inputs, as in the BCC model, the extended frontier is not
only composed of units with zero slacks but also of ones that have slacks in, at most,
m − 1 inputs and s − 1 outputs. If they have slacks in all inputs or outputs, they cannot
be on the frontier because they will have a radial efficiency score less than one. The other
issue is that since the goal is to maximize the slacks, the unit is projected to the farthest
part of the frontier, which might not necessarily be in the vicinity of the unit. Then the
only good outcomes will be those with zero slacks, which are fully efficient and, of course,
part of the frontier.
3.6 Extended Additive model
By switching from minimization to maximization in the above Pareto efficiency model,
Charnes et al. [Char 87] could measure inefficiency. Further to their idea of assigning
variable weights to the slacks to achieve units invariance, they tried to correct the range
issue by a post-optimization treatment. The weights, as suggested above, are the reciprocals of
the input and output values of the unit being evaluated, so bear in mind that the input and
output values need to be strictly positive. The extended additive model is formulated as
the LP below:
\max_{\lambda, s_i^{\pm}} \ \sum_{i=1}^{m} \frac{s_i^{-}}{x_{ik}} + \sum_{i=1}^{s} \frac{s_i^{+}}{y_{ik}},  (3.15a)

s.t. \quad \sum_{j=1}^{n} \lambda_j x_{ij} - x_{ik} + s_i^{-} = 0 \quad i = 1, \ldots, m  (3.15b)

\sum_{j=1}^{n} \lambda_j y_{ij} - y_{ik} - s_i^{+} = 0 \quad \forall i = 1, \ldots, s  (3.15c)

\sum_{j=1}^{n} \lambda_j = 1  (3.15d)

\lambda_j \geq 0.  (3.15e)
The other point is that this model is not translation invariant and cannot handle zero
or negative data. The score is not bounded by zero and one, and has no natural
interpretation as relative efficiency. Among the various routes that can be tried to transform
the above score into a meaningful measure, Green et al. [Gree 97] suggested an intuitive
objective which is bounded by zero and one. They suggested thinking of efficiency as the
ratio of the current state to the best practice. In this case, output efficiency = \frac{y_{kj}}{y_{kj} + s_j^{+}} and
input efficiency = \frac{x_{kj} - s_j^{-}}{x_{kj}}. The objective measures inefficiency, which is 1 − efficiency: for
the output inefficiency we get \frac{s_j^{+}}{y_{kj} + s_j^{+}}, and for the input, \frac{s_j^{-}}{x_{kj}}. The input part is the same
as in the units invariant extended model, but the output part is slightly different. This way
the objective is bounded between zero and one, but the cost is that it becomes nonlinear. Cooper
et al. [Coop 99a] later decided to use the idea but implemented it ex post facto. That
is, rather than solving a nonlinear program, they used the idea of Green et al. [Gree 97] to
map the score after the additive model is solved. They generated a meaningful efficiency
measure, which is bounded by zero and one for this model and is presented below:
0 \leq \frac{1}{m+s} \left( \sum_{i=1}^{m} \frac{s_i^{-*}}{x_{ik}} + \sum_{i=1}^{s} \frac{s_i^{+*}}{\bar{y}_{ik}} \right) \leq 1, \qquad \bar{y}_{ik} = y_{ik} + s_i^{+*} \quad \forall i = 1, \ldots, s.
This score is calculated after the original formulation is solved and the optimal slacks are
known. To keep the score below one, Y has been changed to \bar{Y}, as Green et al. [Gree 97]
suggested. Although an input's maximum slack cannot be bigger than the input itself,
and the first term is bounded by m, output shortfalls can be bigger than the output
itself, and the second term could be bigger than s if Y were used (this happens if the target
produces more than double the present value). Finally, to make the score a real measure
of efficiency (a score of one = 100% efficiency), we subtract it from one:

0 \leq 1 - \frac{1}{m+s} \left( \sum_{i=1}^{m} \frac{s_i^{-*}}{x_{ik}} + \sum_{i=1}^{s} \frac{s_i^{+*}}{\bar{y}_{ik}} \right) \leq 1.
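Computed ex post from the optimal slacks, the bounded score is a one-liner. A sketch with hypothetical slack values (the numbers below are illustrative, not from the text):

```python
def bounded_additive_score(x, y, s_minus, s_plus):
    """Cooper et al. [Coop 99a] ex post score:
    1 - (1/(m+s)) * (sum s_i^-*/x_ik + sum s_i^+*/ybar_ik),
    where ybar_ik = y_ik + s_i^+*, which keeps the score in [0, 1]."""
    m, s = len(x), len(y)
    ybar = [yi + sp for yi, sp in zip(y, s_plus)]
    total = sum(sm / xi for sm, xi in zip(s_minus, x))
    total += sum(sp / yb for sp, yb in zip(s_plus, ybar))
    return 1.0 - total / (m + s)

# Hypothetical optimal slacks for a unit with inputs (1000, 1) and output (1,):
score = bounded_additive_score([1000.0, 1.0], [1.0], [500.0, 0.5], [0.0])
print(round(score, 4))  # 0.6667

# Zero slacks give exactly one, the fully efficient case:
print(bounded_additive_score([1.0], [1.0], [0.0], [0.0]))  # 1.0
```

Dividing the output slack by ybar rather than y is exactly what guarantees each output term stays below one.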
3.7 Constant Weighted Additive Model
Pastor changed the objective in the additive model by multiplying the input excesses and
output shortfalls by some constant nonnegative weights [Past 96]. The LP is given by:

\max_{\lambda, s_i^{\pm}} \ \sum_{i=1}^{m} \omega_i^{-} s_i^{-} + \sum_{i=1}^{s} \omega_i^{+} s_i^{+},  (3.16a)

s.t. \quad \sum_{j=1}^{n} \lambda_j x_{ij} - x_{ik} + s_i^{-} = 0 \quad i = 1, \ldots, m  (3.16b)

\sum_{j=1}^{n} \lambda_j y_{ij} - y_{ik} - s_i^{+} = 0 \quad i = 1, \ldots, s  (3.16c)

\sum_{j=1}^{n} \lambda_j = 1,  (3.16d)

\lambda_j \geq 0.  (3.16e)
He has presented theorems which can be used to guide weight choices in a very flexible
manner. For instance, one can assign weights to the slacks in the objective function
and leave the constraints unaltered or, equivalently, one may assign these weights to the
constraints while leaving the objective function unaltered. The weights are chosen case
by case, and the main intention is that such weights can put differently scaled
inputs/outputs on the same footing. In other words, the weights will mitigate the problem we
encountered in the above example of water and gold. The model is translation invariant,
but no weights exist that make the model units invariant [Love 95a, Past 99a, Coop 95].
This is, in particular, due to the theorem he proved, that in the additive model scaling an
input/output is equivalent to leaving the input/output unaltered and scaling the
corresponding slacks in the objective function. This tells us that there does not exist a
weighted additive model with constant weights that gives the same objective value if any
of the input/output variables are scaled.
To give a meaning to the objective, after finding the optimal slacks from the above
formulation, Pastor suggested calculating the following, which reflects the relative efficiency;
a score of 1.0 means fully efficient:

0 \leq 1 - \frac{1}{m+s} \left( \sum_{i=1}^{m} \frac{s_i^{-*}}{x_{ik} - \underline{x}_i} + \sum_{i=1}^{s} \frac{s_i^{+*}}{\overline{y}_i - y_{ik}} \right) \leq 1,

where \underline{x}_i is the minimum of all x_i and \overline{y}_i is the maximum of all y_i. It is not clear to us why
the weights in the weighted additive model are not replaced by \frac{1}{m+s} \cdot \frac{1}{x_{ik} - \underline{x}_i} and \frac{1}{m+s} \cdot \frac{1}{\overline{y}_i - y_{ik}},
respectively, in the first place. If this happens, then the model will become translation
respectively, in the first place. If this happens, then the model will become translation
invariant too and negative data would not be a problem. This objective is what Cooper
et al. suggested to be used in the RAM model to increase the discrimination power of
RAM [Coop 99a]. Later, Cooper et al. took up this suggestion and introduced the BAM
model, as we will discuss later [Coop 11].
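The range-normalized score at the end of this section can be sketched directly: dividing each slack by the unit's distance to the best observed value makes the water and gold slacks commensurable. The data and slack values below are hypothetical:

```python
def range_normalized_score(X, Y, k, s_minus, s_plus):
    """Pastor's ex post score for the weighted additive model: each slack is
    divided by the range x_ik - min_j x_ij (inputs) or max_j y_ij - y_ik
    (outputs). A zero range forces a zero slack, so such terms are skipped."""
    m, s = len(X[0]), len(Y[0])
    total = 0.0
    for i in range(m):
        rng = X[k][i] - min(row[i] for row in X)
        if rng > 0:
            total += s_minus[i] / rng
    for i in range(s):
        rng = max(row[i] for row in Y) - Y[k][i]
        if rng > 0:
            total += s_plus[i] / rng
    return 1.0 - total / (m + s)

X = [[1000.0, 1.0], [500.0, 0.5]]   # water (ml), gold (g)
Y = [[1.0], [1.0]]
# Hypothetical optimal slacks for DMU 0: both slack terms now count equally
# (1 + 1), even though 500 ml and 0.5 g differ by three orders of magnitude.
score = range_normalized_score(X, Y, 0, s_minus=[500.0, 0.5], s_plus=[0.0])
print(round(score, 4))  # 1 - 2/3, about 0.3333
```

Because the normalization uses only within-sample ranges, adding a constant to every unit's data leaves the score unchanged, which is the translation-invariance advantage noted above.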
3.8 Normalized Weighted Additive Model
Lovell and Pastor [Love 95b] retained the constraints of the extended additive model
[Char 87] but introduced an objective with variable weights, which is dimensionless
and also bears the desired translation invariance property:
\min_{\lambda, s_i^{\pm}} \ -\sum_{i=1}^{m} \frac{s_i^{-}}{\sigma_i^{-}} - \sum_{i=1}^{s} \frac{s_i^{+}}{\sigma_i^{+}},  (3.17a)

s.t. \quad \sum_{j=1}^{n} \lambda_j x_{ij} - x_{ik} + s_i^{-} = 0 \quad i = 1, \ldots, m  (3.17b)

\sum_{j=1}^{n} \lambda_j y_{ij} - y_{ik} - s_i^{+} = 0 \quad i = 1, \ldots, s  (3.17c)

\sum_{j=1}^{n} \lambda_j = 1  (3.17d)

\lambda_j \geq 0,  (3.17e)
where \sigma_i^{-} and \sigma_i^{+} are the sample standard deviations of input i (i = 1, \ldots, m) and
output i (i = 1, \ldots, s), respectively. It is rare, but worth mentioning, that if the sample
standard deviation of a variable is zero, then that variable can be completely removed from
the formulation, because a variable that is constant across every unit reflects no difference
between them. The model is proved to be both translation and units invariant. This would
be a perfect model if the score were bounded by unity.
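The units invariance claim can be checked empirically: rescaling any column rescales its standard deviation by the same factor, leaving the weighted objective unchanged. A sketch with scipy.optimize.linprog on hypothetical data (a sample standard deviation with ddof=1 is assumed; the thesis does not fix this convention):

```python
import numpy as np
from scipy.optimize import linprog

def normalized_additive(X, Y, k):
    """Lovell-Pastor model (3.17): additive model with slacks weighted by the
    reciprocal sample standard deviation of each input/output."""
    n, m = X.shape
    s = Y.shape[1]
    w = np.concatenate([1.0 / X.std(axis=0, ddof=1),
                        1.0 / Y.std(axis=0, ddof=1)])
    c = np.concatenate([np.zeros(n), -w])        # maximize weighted slack sum
    A_eq = np.zeros((m + s + 1, n + m + s))
    A_eq[:m, :n] = X.T
    A_eq[:m, n:n + m] = np.eye(m)
    A_eq[m:m + s, :n] = Y.T
    A_eq[m:m + s, n + m:] = -np.eye(s)
    A_eq[m + s, :n] = 1.0                        # convexity
    b_eq = np.concatenate([X[k], Y[k], [1.0]])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return -res.fun

X = np.array([[1000.0, 1.0], [500.0, 0.5], [700.0, 0.9]])  # hypothetical
Y = np.array([[1.0], [1.0], [2.0]])
a = normalized_additive(X, Y, 0)
b = normalized_additive(X * np.array([0.001, 1.0]), Y, 0)  # water in litres
print(abs(a - b) < 1e-6)  # True: the objective is units invariant
```

The same experiment run on the unweighted additive model of Section 3.5 would show the objective changing by roughly a factor of a thousand.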
3.9 Global Efficiency Measure (GEM)
The extended additive model would classify a unit as efficient if the objective function is
zero, and inefficient when it is greater than zero. It is units invariant, but the score does
not make relative sense. In an attempt to make the objective function (score) meaningful
relative to the others, Lovell et al. [Love 95a] retained the constraints but proposed a
new fractional objective, constructed to be bounded between zero and one. In their original
paper, they only considered an output-oriented case and omitted inputs, not only from the
objective but also from the constraints, claiming input constraints to be redundant. We
have built the following, which includes the inputs, based on the same methodology.
\min_{\lambda, s_i^{\pm}} \ \left[ 1 + \frac{1}{m} \sum_{i=1}^{m} \frac{s_i^{-}}{x_{ik}} + \frac{1}{s} \sum_{i=1}^{s} \frac{s_i^{+}}{y_{ik}} \right]^{-1},  (3.18a)

s.t. \quad \sum_{j=1}^{n} \lambda_j x_{ij} - x_{ik} + s_i^{-} = 0 \quad i = 1, \ldots, m  (3.18b)

\sum_{j=1}^{n} \lambda_j y_{ij} - y_{ik} - s_i^{+} = 0 \quad i = 1, \ldots, s  (3.18c)

\sum_{j=1}^{n} \lambda_j = 1  (3.18d)

\lambda_j \geq 0.  (3.18e)
Because an optimal solution for the extended model will be an optimal solution for GEM,
to avoid solving a nonlinear fractional program, in practice the GEM approach is first
to find the slacks by solving the extended model, which is a simple linear program, and then
to use the slacks to calculate a score following this formulation:

\left[ 1 + \frac{1}{m} \sum_{i=1}^{m} \frac{s_i^{-*}}{x_{ik}} + \frac{1}{s} \sum_{i=1}^{s} \frac{s_i^{+*}}{y_{ik}} \right]^{-1}.
The authors proved that this measure
• is greater than zero and bounded by one;
• equals one if, and only if, the unit is fully efficient;
• is strictly monotonic;
• is units invariant.
Our observation is that the lower bound of the efficiency measure (objective) is \frac{1}{1+\bar{\phi}}, where \bar{\phi} = \frac{\sum_{i=1}^{s}\phi_i}{s} and \phi_i is the scalar shortfall of each y_i. The efficiency measure here is not translation invariant and cannot handle negative data.
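The two-step GEM computation above can be sketched numerically. The following is a minimal illustration on a toy single-input, single-output data set of our own (three DMUs; the data and variable names are ours, not from [Love 95a]); scipy's linprog is one way to solve the extended additive LP, after which the optimal slacks are plugged into the GEM score.

```python
import numpy as np
from scipy.optimize import linprog

# Toy data: 3 DMUs, one input, one output; we score DMU k with (x_k, y_k) = (6, 3).
x = np.array([2.0, 4.0, 6.0])
y = np.array([2.0, 4.0, 3.0])
k, m, s = 2, 1, 1

# Step 1: extended additive LP -- maximise (1/m)*s_minus/x_k + (1/s)*s_plus/y_k.
# Decision vector: [lambda_1..lambda_3, s_minus, s_plus]; linprog minimises.
c = np.array([0, 0, 0, -1/(m*x[k]), -1/(s*y[k])])
A_eq = np.array([
    [*x, 1, 0],       # sum lambda_j x_j + s_minus = x_k
    [*y, 0, -1],      # sum lambda_j y_j - s_plus  = y_k
    [1, 1, 1, 0, 0],  # VRS convexity
])
b_eq = np.array([x[k], y[k], 1.0])
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)]*5)

# Step 2: plug the optimal slack sum into the GEM score.
gem_score = 1.0 / (1.0 - res.fun)   # res.fun = -(weighted slack sum)
```

For this data the projection of (6, 3) is the frontier point (4, 4), giving slacks of 2 and 1 and a GEM score of 0.6.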
3.10 Enhanced Russell Graph Efficiency Measure (enhanced GEM)
In the Russell graph measure, Fare et al. [Fare 85] averaged the individual input and output efficiencies. Prior to that, they had developed the Russell input and Russell output measures, which were the arithmetic mean of the positive inputs' contractions and the arithmetic mean of the positive outputs' augmentations. Pastor et al. built the ratio of those averages [Past 99b]. They assumed, however, that all input/output variables are strictly positive (more limiting than the Russell graph measure, which allows zero values). Needless to say, minimizing the ratio ensures the Russell input measure is minimized while the Russell output measure is maximized, which is the desired outcome. Their formulation is given by:
\min R = \frac{\frac{1}{m}\sum_{i=1}^{m}\theta_i}{\frac{1}{s}\sum_{i=1}^{s}\phi_i}, \qquad (3.19a)

s.t. \quad \sum_{j=1}^{n}\lambda_j \cdot x_{ij} \le \theta_i \cdot x_{ik} \qquad \forall i = 1, \dots, m \qquad (3.19b)

\sum_{j=1}^{n}\lambda_j \cdot y_{ij} \ge \phi_i \cdot y_{ik} \qquad \forall i = 1, \dots, s \qquad (3.19c)

\lambda_j \ge 0, \quad \theta_i \le 1, \quad \phi_i \ge 1. \qquad (3.19d)
This measure makes it easy to separate and interpret the average input and output efficiencies. In other words, to become a point on the frontier the unit should, on average, decrease its use of inputs by \frac{1}{m}\sum_{i=1}^{m}\theta_i and increase its outputs by \frac{1}{s}\sum_{i=1}^{s}\phi_i. The ratio thus shows how successful the DMU has been, on average, in transforming inputs to outputs.
The above formulation has several desirable properties. The measure is greater than zero and less than or equal to one, and a score of one means Pareto efficient (zero slacks). The objective is isotonic and units invariant but not translation invariant. For the proofs, please consult the appendix in [Past 99a]. From the computational aspect, the enhanced Russell measure is computed more easily than the original Russell graph measure. Nevertheless, both are nonlinear, although the formulation of the enhanced Russell measure can be linearized by some re-arrangement and a smart change of variables. With a transformation of variables using total slacks as:

\theta_i = \frac{x_{ik} - s_{ik}^-}{x_{ik}} = 1 - \frac{s_{ik}^-}{x_{ik}} \qquad i = 1, \dots, m

\phi_i = \frac{y_{ik} + s_{ik}^+}{y_{ik}} = 1 + \frac{s_{ik}^+}{y_{ik}} \qquad i = 1, \dots, s
we will have:

\min_{\lambda,\, s_i^{\pm}} \frac{1 - \frac{1}{m}\sum_{i=1}^{m}\frac{s_i^-}{x_{ik}}}{1 + \frac{1}{s}\sum_{i=1}^{s}\frac{s_i^+}{y_{ik}}}, \qquad (3.20a)

s.t. \quad \sum_{j=1}^{n}\lambda_j \cdot x_{ij} - x_{ik} + s_i^- = 0 \qquad i = 1, \dots, m \qquad (3.20b)

\sum_{j=1}^{n}\lambda_j \cdot y_{ij} - y_{ik} - s_i^+ = 0 \qquad i = 1, \dots, s \qquad (3.20c)

\lambda_j, s_i^{\pm} \ge 0. \qquad (3.20d)
This is a linear fractional program, and with the following change of variables (similar to Charnes and Cooper, 1962) it lends itself to a linear program:
\min_{\beta,\, t_{ik}^{\pm},\, \mu} \beta - \frac{1}{m}\sum_{i=1}^{m}\frac{t_{ik}^-}{x_{ik}}, \qquad (3.21a)

s.t. \quad \beta + \frac{1}{s}\sum_{i=1}^{s}\frac{t_{ik}^+}{y_{ik}} = 1 \qquad (3.21b)

-\beta x_{ik} + \sum_{j=1}^{n}\mu_j \cdot x_{ij} + t_{ik}^- = 0 \qquad i = 1, \dots, m \qquad (3.21c)

-\beta y_{ik} + \sum_{j=1}^{n}\mu_j \cdot y_{ij} - t_{ik}^+ = 0 \qquad i = 1, \dots, s \qquad (3.21d)

\mu_j, t_{ik}^{\pm}, \beta \ge 0. \qquad (3.21e)
where

\beta = \left( 1 + \frac{1}{s}\sum_{i=1}^{s}\frac{s_{ik}^+}{y_{ik}} \right)^{-1}

t_{ik}^- = \beta \cdot s_{ik}^- \qquad i = 1, \dots, m

t_{ik}^+ = \beta \cdot s_{ik}^+ \qquad i = 1, \dots, s

\mu_j = \beta \cdot \lambda_j \qquad j = 1, \dots, n
The objectives agree at the projections, and from an optimal solution to the above we can construct an optimal solution to the fractional program.
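The recovery of the fractional solution from the linear one can be sketched numerically. Below is a minimal example on a toy single-input, single-output data set of our own (not from [Past 99a]); following the formulation as printed, no convexity constraint is imposed. scipy's linprog solves the linearized program (3.21), after which dividing by \beta recovers the slacks and reproduces the fractional objective exactly.

```python
import numpy as np
from scipy.optimize import linprog

# Toy data: 3 DMUs, one input, one output; DMU k has (x_k, y_k) = (6, 3).
x = np.array([2.0, 4.0, 6.0])
y = np.array([2.0, 4.0, 3.0])
k, m, s = 2, 1, 1

# Linearized enhanced Russell LP (3.21): variables [mu_1..mu_3, t_minus, t_plus, beta].
c = np.array([0, 0, 0, -1/(m*x[k]), 0, 1])   # beta - (1/m) t_minus/x_k
A_eq = np.array([
    [0, 0, 0, 0, 1/(s*y[k]), 1],             # beta + (1/s) t_plus/y_k = 1
    [*x, 1, 0, -x[k]],                       # sum mu_j x_j + t_minus - beta x_k = 0
    [*y, 0, -1, -y[k]],                      # sum mu_j y_j - t_plus  - beta y_k = 0
])
b_eq = np.array([1.0, 0.0, 0.0])
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)]*6)

# Recover the fractional-program solution from the linear one.
beta = res.x[5]
s_minus, s_plus = res.x[3]/beta, res.x[4]/beta
rho = (1 - s_minus/(m*x[k])) / (1 + s_plus/(s*y[k]))   # equals the LP optimum
```

For this data the optimum value is 0.5; the recovered ratio \rho coincides with the LP objective at any optimal solution, which is exactly the point of the Charnes-Cooper transformation.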
3.11 Range Adjusted Model (RAM)
Cooper et al. wanted a model to generate efficiency scores that were: a) bounded by zero
and one, and b) not only 1 had to mean 100% efficient but also zero had to mean fully
inefficient. In addition, they required units and translation invariance and isotonic to
hold. They retained the constraints of the additive model and defined a new objective,
whose formulation is called RAM (Range Adjusted Model)[Coop 99a]. The general form∑mi=1 ω
−i · s−i +
∑si=1 ω
+i · s+
i measures inefficiency (when it is zero, it means that the unit
is efficient). The objective is invariant to an alternative optimum solution. To make the
measure more comprehensive, Cooper et al. changed the above by subtracting it from
one (see the equation below) to measure efficiency. To accommodate all the above, they
chose the “range” of input and outputs for the weights in the following form:
0 \le 1 - \frac{1}{m+s}\left[ \sum_{i=1}^{m}\frac{s_i^{-*}}{\bar{x}_i - \underline{x}_i} + \sum_{i=1}^{s}\frac{s_i^{+*}}{\bar{y}_i - \underline{y}_i} \right] \le 1,

where \underline{x}_i and \underline{y}_i are the minima of all the x_i and y_i, and \bar{x}_i and \bar{y}_i are the maxima, respectively. The measure becomes zero only in a case where every input and output of the unit is the worst possible and the target is the overall best, which is a very rare situation. In the unlikely event of a zero range, it is ignored and the corresponding constraints are omitted (they are redundant). The other property, as the authors claim, is the "ranking" potential of this model, because the weights are constant for every slack element.
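The range-adjusted computation can be sketched on a toy single-input, single-output data set of our own (three DMUs, VRS; the data are illustrative, not from [Coop 99a]): solve the additive model with range-weighted slacks in the objective, then subtract the optimal inefficiency from one.

```python
import numpy as np
from scipy.optimize import linprog

# Toy data: 3 DMUs, one input, one output; we score DMU k = (6, 3) under VRS.
x = np.array([2.0, 4.0, 6.0])
y = np.array([2.0, 4.0, 3.0])
k, m, s = 2, 1, 1

# Ranges over the whole sample.
R_in = x.max() - x.min()    # 4
R_out = y.max() - y.min()   # 2

# Maximise the range-weighted slacks; variables [lambda_1..3, s_minus, s_plus].
c = np.array([0, 0, 0, -1/((m+s)*R_in), -1/((m+s)*R_out)])
A_eq = np.array([
    [*x, 1, 0],        # sum lambda_j x_j + s_minus = x_k
    [*y, 0, -1],       # sum lambda_j y_j - s_plus  = y_k
    [1, 1, 1, 0, 0],   # VRS convexity
])
b_eq = np.array([x[k], y[k], 1.0])
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)]*5)

ram_score = 1.0 + res.fun   # res.fun = -(range-adjusted inefficiency)
```

Here DMU (6, 3) projects to (4, 4), and the RAM score works out to 0.5; note that the constant, sample-wide ranges in the denominators are what give RAM its claimed ranking potential.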
Steinmann et al. examined RAM and listed several limitations. They claim the model is misleading because it classifies large, inefficient units as less efficient than small, inefficient units; for the proof, refer to [Stei 01]. The reader should bear in mind that the ranges might require updating when new observations are introduced, since data-orientedness is the nature of DEA in general. The authors claim the formulation is fairly robust even if the maximum and minimum of elements are breached [Aida 98]. The relatively large denominators in this measure, compared to the others, lead to higher efficiency scores. To correct this, one suggestion is to replace \bar{x}_i with x_{ik} and \underline{y}_i with y_{ik} to make the denominators slightly smaller. The price is that the ranking capability of RAM is lost, since the divisor differs for every element. The large denominator also decreases the discriminating power of this measure, as detected by Aida et al. [Aida 98], where a large group of water suppliers in Japan scored above 98%. As we observed in GEM, this is because the full range between zero and one is not used. A decade later, Cooper et al. [Coop 11] addressed this in the BAM model, which we discuss next.
3.12 BAM: a bounded adjusted measure
As mentioned before, RAM has little discriminating power and is defined under VRS technology only. It is easy to show that under non-increasing returns to scale the RAM score could be negative, as Cooper et al. showed in [Coop 11]. The authors used the suggestion made in the original paper and modified the objective denominators to L_i^- = x_{ik} - \underline{x}_i and L_i^+ = \bar{y}_i - y_{ik} to make them smaller, which increases the discriminating power. BAM loses the ranking potential carried by RAM, and although BAM is isotonic, it is not strongly monotonic. To make the measure available to other technologies (CRS, NIRS, NDRS) they introduced bounds (extra constraints) to confine the score to positive values. The general bounds are \sum_{j=1}^{n}\lambda_j \cdot x_{ij} \ge \underline{x}_i and \sum_{j=1}^{n}\lambda_j \cdot y_{ij} \le \bar{y}_i, in addition to the usual bounds on the \lambda s for constant, non-increasing and non-decreasing returns to scale; for example, \sum_{j=1}^{n}\lambda_j \le 1 for NIRS. It can be shown that the BAM-VRS score never surpasses that of RAM. This model was later generalized by Pastor et al. in two attempts [Past 13a, Past 13b] to ensure free disposability and provisions for partial bounds, as well as projection onto the strongly efficient frontier under CRS technology.
3.13 Slack-based Measure
In his paper, Tone [Tone 01] suggested a model similar to GEM, as follows:
\min_{\lambda,\, s_i^{\pm}} \frac{1 - \frac{1}{m}\sum_{i=1}^{m}\frac{s_i^-}{x_{ik}}}{1 + \frac{1}{s}\sum_{i=1}^{s}\frac{s_i^+}{y_{ik}}}, \qquad (3.22a)

s.t. \quad \sum_{j=1}^{n}\lambda_j \cdot x_{ij} - x_{ik} + s_i^- = 0 \qquad i = 1, \dots, m \qquad (3.22b)

\sum_{j=1}^{n}\lambda_j \cdot y_{ij} - y_{ik} - s_i^+ = 0 \qquad i = 1, \dots, s \qquad (3.22c)

\lambda_j, s_i^{\pm} \ge 0. \qquad (3.22d)
His approach to linearizing it is what made it unique. By multiplying the numerator and denominator by t and rearranging variables, the above can be transformed into the following linear formulation:
\min \tau = t - \frac{1}{m}\sum_{i=1}^{m}\frac{S_i^-}{x_{ik}}, \qquad (3.23a)

s.t. \quad 1 = t + \frac{1}{s}\sum_{i=1}^{s}\frac{S_i^+}{y_{ik}} \qquad (3.23b)

\sum_{j=1}^{n}\Lambda_j \cdot x_{ij} - t \cdot x_{ik} + S_i^- = 0 \qquad i = 1, \dots, m \qquad (3.23c)

\sum_{j=1}^{n}\Lambda_j \cdot y_{ij} - t \cdot y_{ik} - S_i^+ = 0 \qquad i = 1, \dots, s \qquad (3.23d)

\Lambda_j, S_i^{\pm}, t \ge 0. \qquad (3.23e)
He also studies the dual and shows that SBM deals with profit rather than cost. Although the model is not translation invariant, Tone considered zeros in inputs and outputs. For zero inputs, the corresponding slack variable is omitted completely from the formulation. For zero outputs, if the unit does not have the facilities to produce the output, it is omitted; but if the unit has them and is not using them, then the zero should be replaced by a small number.
We attempted to use Tone's method to linearize the basic additive model we developed in [Siga 09], with no success. The reason is explained below, in two steps. In the first step we have transformed the original model, using \sum_{j=1}^{n}\sum_{i=1}^{r}\Delta_{ij} = \sum_{i=1}^{r}s_i^+ and \sum_{j=1}^{n}\sum_{i=1}^{q}\Phi_{ij} = \sum_{i=1}^{q}s_i^-, to the following:
\min \frac{1 - \frac{1}{m}\sum_{i=1}^{q}\frac{s_i^-}{x_{ik}} - \frac{1}{m}\sum_{i=q+1}^{m}\frac{s_i^-}{x_{ik}}}{1 + \frac{1}{s}\sum_{i=1}^{r}\frac{s_i^+}{y_{ik}} + \frac{1}{s}\sum_{i=r+1}^{s}\frac{s_i^+}{y_{ik}}}, \qquad (3.24a)

s.t. \quad \frac{\sum_{j=1}^{n}\lambda_j \cdot nx_{ij}}{\sum_{j=1}^{n}\lambda_j \cdot dx_{ij}} - x_{ik} + s_i^- = 0, \quad x_{ik} = nx_{ik}/dx_{ik} \qquad \forall i = 1, \dots, q \qquad (3.24b)

\sum_{j=1}^{n}\lambda_j \cdot x_{ij} - x_{ik} + s_i^- = 0 \qquad i = q+1, \dots, m \qquad (3.24c)

\sum_{j=1}^{n}\lambda_j \cdot y_{ij} - y_{ik} - s_i^+ = 0 \qquad \forall i = r+1, \dots, s \qquad (3.24d)

\frac{\sum_{j=1}^{n}\lambda_j \cdot ny_{ij}}{\sum_{j=1}^{n}\lambda_j \cdot dy_{ij}} - y_{ik} - s_i^+ = 0, \quad y_{ik} = ny_{ik}/dy_{ik} \qquad \forall i = 1, \dots, r \qquad (3.24e)

\sum_{j=1}^{n}\lambda_j = 1 \qquad (3.24f)

\lambda_j \ge 0. \qquad (3.24g)
Applying Tone's technique to the above will result in:

\min \tau = t - \frac{1}{m}\sum_{i=1}^{m}\frac{S_i^-}{x_{ik}}, \qquad (3.25a)

s.t. \quad 1 = t + \frac{1}{s}\sum_{i=1}^{s}\frac{S_i^+}{y_{ik}} \qquad (3.25b)

\sum_{j=1}^{n}\Lambda_j \cdot x_{ij} - t \cdot x_{ik} + S_i^- = 0 \qquad i = q+1, \dots, m \qquad (3.25c)

\sum_{j=1}^{n}\Lambda_j \cdot y_{ij} - t \cdot y_{ik} - S_i^+ = 0 \qquad i = r+1, \dots, s \qquad (3.25d)

\sum_{j=1}^{n}\Lambda_j = t \qquad (3.25e)

\sum_{j=1}^{n}\Lambda_j \cdot \sigma_{ij} + \sum_{j=1}^{n}\Phi_{ij} \cdot dx_{ij} = 0 \qquad \forall i = 1, \dots, q \qquad (3.25f)

\sum_{j=1}^{n}\Lambda_j \cdot w_{ij} - \sum_{j=1}^{n}\Delta_{ij} \cdot dy_{ij} = 0 \qquad \forall i = 1, \dots, r \qquad (3.25g)

\Lambda_j, S_i^{\pm}, t \ge 0. \qquad (3.25h)
Although (t^*, S_i^{\pm*}, \Lambda^*) gives us s_i^{\pm*} and \lambda_i^*, it does not guarantee that \Phi_{ij}^* equals s_i^{-*} \cdot \lambda_j^* \cdot t^*. The same is true for \Delta_{ij}^*.
3.14 Directional slack-based measure and distance function
Considering the production possibility set P(x) = \{Y : X \text{ can produce } Y\} and assuming weak disposability, the directional distance function is defined as \vec{D}(X, Y, g) = \sup\{\beta : Y + \beta g \in P(x)\}. Weak disposability for undesirable outputs is a must, since disposing of them should come at a cost [Cham 96]. Chung et al. [Chun 97] have shown that if we take g = Y, the Shephard [Shep 70] output distance function, D(X, Y) = \inf\{\theta : \frac{Y}{\theta} \in P(x)\}, becomes a special case of the directional distance function, D = \frac{1}{1 + \vec{D}}. \beta measures the technical inefficiency. The bundle g is arbitrary, and the intention is that with different choices of g one can define a model suitable to the application at hand. For example, g could point in the same direction as the good outputs and in the opposite direction for the undesirable outputs. Or, a g of (-x_k, y_k) will project unit k to a point on the frontier with inputs x^* = (1 - \beta) \cdot x_k and outputs y^* = (1 + \beta) \cdot y_k. In LP form, it is given by:
\max \beta, \qquad (3.26a)

s.t. \quad \sum_{j=1}^{n}\lambda_j \cdot x_{ij} + \beta g_{x_i} \le x_{ik} \qquad i = 1, \dots, m \qquad (3.26b)

\sum_{j=1}^{n}\lambda_j \cdot y_{ij} - \beta g_{y_i} \ge y_{ik} \qquad \forall i = 1, \dots, s \qquad (3.26c)

\sum_{j=1}^{n}\lambda_j = 1 \qquad (3.26d)

\lambda_j \ge 0. \qquad (3.26e)
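The directional model can be sketched numerically. The following minimal example uses a toy single-input, single-output data set of our own (not from the cited papers) and the direction g = (x_k, y_k), so that \beta is the common rate of input contraction and output expansion.

```python
import numpy as np
from scipy.optimize import linprog

# Toy data: 3 DMUs, one input, one output; direction g = (x_k, y_k) for DMU k = (6, 3).
x = np.array([2.0, 4.0, 6.0])
y = np.array([2.0, 4.0, 3.0])
k = 2
gx, gy = x[k], y[k]

# Variables [lambda_1..3, beta]; maximise beta.
c = np.array([0, 0, 0, -1.0])
A_ub = np.array([
    [*x, gx],          # sum lambda_j x_j + beta*gx <= x_k
    [*(-y), gy],       # sum lambda_j y_j - beta*gy >= y_k
])
b_ub = np.array([x[k], -y[k]])
A_eq = np.array([[1, 1, 1, 0.0]])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0], bounds=[(0, None)]*4)

beta = res.x[3]
x_target, y_target = (1 - beta)*x[k], (1 + beta)*y[k]
```

Here \beta^* = 1/3 and the projection of (6, 3) is the frontier point (4, 4): inputs shrink and outputs grow at the same rate 1/3 along the chosen direction.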
We emphasize that although the model tries to maximize the radial input contraction and output expansion along the preferred bundle simultaneously, it fails to detect all the sources of inefficiency, because: a) it depends on the choice of direction, and b) all the elements are forced to grow or contract at the same rate. This is clearly shown in a numerical example in Ray's book [Ray 00]. In an attempt to correct the above, Fukuyama and Weber [Fuku 09] proposed a directional slack-based inefficiency measure (DSBI), which gauges the slacks according to g_x and g_y, as shown below:
\max_{\lambda,\, s_i^{\pm}} \frac{1}{2}\left[ \frac{1}{m}\sum_{i=1}^{m}\frac{s_i^-}{g_{x_i}} + \frac{1}{s}\sum_{i=1}^{s}\frac{s_i^+}{g_{y_i}} \right], \qquad (3.27a)

s.t. \quad \sum_{j=1}^{n}\lambda_j \cdot x_{ij} - x_{ik} + s_i^- = 0 \qquad i = 1, \dots, m \qquad (3.27b)

\sum_{j=1}^{n}\lambda_j \cdot y_{ij} - y_{ik} - s_i^+ = 0 \qquad i = 1, \dots, s \qquad (3.27c)

\sum_{j=1}^{n}\lambda_j = 1 \qquad (3.27d)

\lambda_j \ge 0. \qquad (3.27e)
If the optimum slacks of the directional distance formulation (3.26) are called t_i^- and t_i^+ for inputs and outputs, respectively, the objective of DSBI can be written as

\beta^* + \frac{1}{2}\left[ \frac{1}{m}\sum_{i=1}^{m}\frac{t_i^-}{g_{x_i}} + \frac{1}{s}\sum_{i=1}^{s}\frac{t_i^+}{g_{y_i}} \right],

which shows that the new measure is at least as large as the directional distance function, the two being equal if no slack exists. The model is translation invariant for a fixed directional vector, is units invariant if g_x = x_k and g_y = y_k, and is monotonic and homogeneous of degree minus one.
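The DSBI can be computed with the same additive constraints as the extended additive model, only with direction-scaled slacks in the objective. Below is a minimal sketch on a toy single-input, single-output data set of our own (not from [Fuku 09]), with g = (x_k, y_k).

```python
import numpy as np
from scipy.optimize import linprog

# Toy data: 3 DMUs, one input, one output; DMU k = (6, 3), direction g = (x_k, y_k).
x = np.array([2.0, 4.0, 6.0])
y = np.array([2.0, 4.0, 3.0])
k, m, s = 2, 1, 1
gx, gy = x[k], y[k]

# Variables [lambda_1..3, s_minus, s_plus]; maximise the DSBI objective (3.27a).
c = np.array([0, 0, 0, -0.5/(m*gx), -0.5/(s*gy)])
A_eq = np.array([
    [*x, 1, 0],        # sum lambda_j x_j + s_minus = x_k
    [*y, 0, -1],       # sum lambda_j y_j - s_plus  = y_k
    [1, 1, 1, 0, 0],   # VRS convexity
])
b_eq = np.array([x[k], y[k], 1.0])
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)]*5)

dsbi = -res.fun
```

For this data the optimum is 1/3, achieved with slacks s^- = 2, s^+ = 1 at the frontier point (4, 4); since no residual slacks remain here, the value coincides with \beta^* of the directional distance model (3.26) for the same direction, illustrating the relation above.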
3.15 Graph Hyperbolic measure of efficiency
The hyperbolic graph efficiency extends the radial input and output measures by combining them. Fare et al. [Fare 85] (page 125) explain why this is called hyperbolic: "the model constrains the search for more efficient production planes to a hyperbolic path along which all inputs are reduced, and all outputs are increased, by the same proportion." Suppose the technology is defined as T = \{(x, y) : y \le f(x)\}; then the graph of the technology is G = \{(x, y) : y = f(x)\}, and any (x, y) \in G is technically efficient. The hyperbolic efficiency score of (x_k, y_k) will be \frac{1}{\delta} if (\frac{1}{\delta} x_k, \delta y_k) \in G. For the CRS technology, the nonlinear formulation is easily linearized; for the VRS technology, Ray [Ray 00] suggested the use of a first-order Taylor series approximation of f(\delta) = \frac{1}{\delta}, so that at an arbitrary point \delta_0 the Taylor series gives f(\delta) \approx f(\delta_0) + f'(\delta_0)(\delta - \delta_0) = \frac{2\delta_0 - \delta}{\delta_0^2}. Then, taking \delta_0 = 1, f(\delta) \approx 2 - \delta, and the resulting linear program is:
\max \delta, \qquad (3.28a)

s.t. \quad \sum_{j=1}^{n}\lambda_j \cdot x_{ij} + \delta x_{ik} \le 2 x_{ik} \qquad i = 1, \dots, m \qquad (3.28b)

\sum_{j=1}^{n}\lambda_j \cdot y_{ij} \ge \delta \cdot y_{ik} \qquad \forall i = 1, \dots, s \qquad (3.28c)

\sum_{j=1}^{n}\lambda_j = 1 \qquad (3.28d)

\lambda_j \ge 0. \qquad (3.28e)
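The Taylor-linearized program can be sketched on a toy single-input, single-output data set of our own (three DMUs under VRS; the numbers are illustrative, not from [Ray 00]); the approximate projection replaces the hyperbolic pair (x_k/\delta, \delta y_k) with ((2-\delta)x_k, \delta y_k).

```python
import numpy as np
from scipy.optimize import linprog

# Toy data: 3 DMUs, one input, one output; we evaluate DMU k = (6, 3).
x = np.array([2.0, 4.0, 6.0])
y = np.array([2.0, 4.0, 3.0])
k = 2

# Variables [lambda_1..3, delta]; maximise delta.
c = np.array([0, 0, 0, -1.0])
A_ub = np.array([
    [*x, x[k]],        # sum lambda_j x_j + delta*x_k <= 2 x_k
    [*(-y), y[k]],     # sum lambda_j y_j >= delta*y_k
])
b_ub = np.array([2*x[k], 0.0])
A_eq = np.array([[1, 1, 1, 0.0]])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0], bounds=[(0, None)]*4)

delta = res.x[3]
x_target, y_target = (2 - delta)*x[k], delta*y[k]
```

Here \delta^* = 4/3, so the linearized projection of (6, 3) is ((2 - 4/3) \cdot 6, \tfrac{4}{3} \cdot 3) = (4, 4).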
By construction, the observed unit and its efficient projection lie on a rectangular hyperbola. This is, of course, a limiting factor, which does not allow for the full efficiency projection. One possible improvement is using different rates for inputs and outputs, but this does not solve the problem. Another suggestion is to create a different objective function based on the constructs available here; for example, Portela and Thanassoulis [Port 02, Port 07] introduced the concept of the geometric distance function, \mathrm{GDF} = (\prod_i \phi_i)^{1/m} / (\prod_r \beta_r)^{1/s}.
3.16 Benefit function
Benefit functions come from consumer theory rather than production theory and are derived from personal preferences [Luen 92, Cham 96, Fare 00]. The benefit function is suited to maximizing the welfare of a group, because the benefits of the choices preferred by individuals can be aggregated meaningfully. For any g, x, y with g \ne 0, g \in \mathbb{R}^d_+, x \in X, y \in Y, let b(x, y, g) = \max\{\beta : x - \beta g \in X,\; U(x - \beta g) \ge y\}, and -\infty otherwise, where U is the utility function defining the outcome of any decision or choice in X. g is a reference vector defining the measure by which alternative bundles are compared. g is said to be a good bundle if u(x + \alpha g) > u(x), given that x + \alpha g \in X and \alpha \ge 0; the bundle is weakly good if u(x + \alpha g) \ge u(x). If we assume U is monotonic, then every g \ge 0 is weakly good. The benefit function has certain properties: (a) it is monotonic with respect to y; (b) b(g; x + \alpha g, y) = \alpha + b(g; x, y); (c) if g is weakly good, then b(g; x, y) \ge 0 implies U(x) \ge y.
In contrast with the distance function [Shep 70], d(x, y) = \max\{\gamma : U(\frac{x}{\gamma}) \ge y\}, which is used in individual consumer theory, the benefit function has use in group welfare relations. However, under appropriate assumptions, the duals of both functions give the expenditure (cost) function. Because the benefit function is easily transferable to LP form, and with minor modification can be applied to production theory, it is of interest to us. As a matter of fact, Chambers et al. [Cham 96] have extensively studied the relations of benefit and distance functions in consumer theory and modified them for use in production theory. The input distance function in production theory looks like d(x, y) = \max\{\gamma : \frac{x}{\gamma} \in L(y)\}, where L(y) = \{x : x \text{ can produce } y\} is a subsection of the production possibility set. We can argue that for the utility of \frac{x}{\gamma} to be greater than y, it is strictly required that \frac{x}{\gamma} \in L(y). Chambers et al. [Cham 96] have proposed the input directional distance function \vec{D}(x, y, g) = \sup\{\beta : x - \beta g \in L(y)\}, which is basically the same as the Luenberger benefit function [Luen 92] translated into production theory. Under weak input disposability, because x \in L(y) if and only if d(x, y) \ge 1, we can say \vec{D}(x, y, g) = \sup\{\beta : d(x - \beta g, y) \ge 1\}, which clearly shows the relation between the benefit and input distance functions. In the case of choosing g = x, it is proven that \vec{D}(x, y, x) = 1 - \frac{1}{d(x, y)} and d(x, y) = \frac{1}{\vec{D}(0, y, -x)}. Later, Fare and Grosskopf [Fare 00] extended the function to expand output and contract input simultaneously, so that the dual would be the profit function: \vec{D}(x, y, -g_x, g_y) = \sup\{\beta : x - \beta g_x \in L(y + \beta g_y)\}.
3.17 Range Directional Model and Inverse Range Directional Model
Portela et al. introduced these models initially to deal with negative data directly [Port 04]. They built on the directional distance function and, by defining a new range along with a pretreatment of the data, made it possible to deal with negative data and to set targets which are easier to achieve (closer to the DMU, in contrast with the farthest target implied by the largest slacks). RDM defines the direction of improvement as the path towards the super champion (ideal point), which is the same for all units and is the best of the best (it might not even exist). First, we look at their RDM model:
\max \beta_k, \qquad (3.29a)

s.t. \quad \sum_{j=1}^{n}\lambda_j \cdot x_{ij} \le x_{ik} - \beta_k R_{ik} \qquad i = 1, \dots, m \qquad (3.29b)

\sum_{j=1}^{n}\lambda_j \cdot y_{rj} \ge y_{rk} + \beta_k R_{rk} \qquad r = 1, \dots, s \qquad (3.29c)

\sum_{j=1}^{n}\lambda_j = 1 \qquad (3.29d)

R_{rk} = \max_j\{y_{rj}\} - y_{rk} \qquad r = 1, \dots, s \qquad (3.29e)

R_{ik} = x_{ik} - \min_j\{x_{ij}\} \qquad i = 1, \dots, m \qquad (3.29f)

\lambda_j \ge 0. \qquad (3.29g)
The above model is non-oriented: input contraction and output expansion are considered simultaneously. Setting the ranges R_{ik} or R_{rk} to zero would make the model output or input oriented, respectively. The range is an upper bound on the slack for each variable. The authors have proved that RDM is both translation and units invariant. \beta is an inefficiency score, but it does not encapsulate all sources of inefficiency, and some inputs or outputs might have nonzero slacks at the optimal value of \beta. 1 - \beta is considered the RDM efficiency score, which is bounded by one. However, the direction towards the production frontier is biased towards the factor with the highest potential for improvement (as in most slack-based measures).
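The RDM computation can be sketched numerically. Below is a minimal example on a toy single-input, single-output data set of our own (not from [Port 04]): compute the unit-specific ranges, then maximise \beta_k.

```python
import numpy as np
from scipy.optimize import linprog

# Toy data: 3 DMUs, one input, one output; we score DMU k = (6, 3).
x = np.array([2.0, 4.0, 6.0])
y = np.array([2.0, 4.0, 3.0])
k = 2

# Ranges relative to DMU k (distance to the ideal point).
R_in = x[k] - x.min()     # 4
R_out = y.max() - y[k]    # 1

# Variables [lambda_1..3, beta]; maximise beta_k.
c = np.array([0, 0, 0, -1.0])
A_ub = np.array([
    [*x, R_in],        # sum lambda_j x_j <= x_k - beta*R_in
    [*(-y), R_out],    # sum lambda_j y_j >= y_k + beta*R_out
])
b_ub = np.array([x[k], -y[k]])
A_eq = np.array([[1, 1, 1, 0.0]])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0], bounds=[(0, None)]*4)

beta = res.x[3]
rdm_score = 1 - beta
```

Here \beta^*_k = 0.6, so the RDM efficiency score is 0.4; the projection moves towards the ideal point (\min_j x_j, \max_j y_j) = (2, 4) and lands on the frontier at (3.6, 3.6).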
To make targets closer to the unit (giving priority of improvement to those factors of a unit which are closer to best practice), IRDM is suggested, which uses the inverse range. Whenever a range is zero, the division by zero is avoided and the inverse is replaced by zero, which is reasonable, since a zero range means the unit is already efficient on that front. IRDM is defined through the following LP:
\max \beta_k, \qquad (3.30a)

s.t. \quad \sum_{j=1}^{n}\lambda_j \cdot x_{ij} \le x_{ik} - \frac{\beta_k}{R_{ik}} \qquad i = 1, \dots, m \qquad (3.30b)

\sum_{j=1}^{n}\lambda_j \cdot y_{rj} \ge y_{rk} + \frac{\beta_k}{R_{rk}} \qquad r = 1, \dots, s \qquad (3.30c)

\sum_{j=1}^{n}\lambda_j = 1 \qquad (3.30d)

R_{rk} = \max_j\{y_{rj}\} - y_{rk} \qquad r = 1, \dots, s \qquad (3.30e)

R_{ik} = x_{ik} - \min_j\{x_{ij}\} \qquad i = 1, \dots, m \qquad (3.30f)

\lambda_j \ge 0. \qquad (3.30g)
The above is translation invariant but not units invariant. The authors suggested a pretreatment (normalization) of the data to achieve units invariance artificially: divide every output by the largest output, and likewise for every input. Note that the ranges need to be re-evaluated after this normalization stage. The IRDM efficiency score measures the distance from an observed point to a target point with reference to some ideal point. But contrary to RDM, this time the ideal point changes for every DMU, and as a result interpreting the efficiency score, or comparing scores or rankings, is not an option here; this model is used merely for target setting.
Asmild and Pastor [Asmi 10] extended the RDM (by adding a second phase) to account for Pareto efficiency: the projection by RDM might be onto the weakly efficient frontier, and benchmarks may have non-directional slacks. The second phase is a weighted additive model with the aim of detecting the existence of those slacks:
\max \sum_{i=1}^{m}\frac{\tau_{ik}^-}{R_{ik}} + \sum_{r=1}^{s}\frac{\tau_{rk}^+}{R_{rk}}, \qquad (3.31a)

s.t. \quad \sum_{j=1}^{n}\lambda_j \cdot x_{ij} + \tau_{ik}^- = x_{ik} - \beta_k^* \cdot R_{ik} \qquad i = 1, \dots, m \qquad (3.31b)

\sum_{j=1}^{n}\lambda_j \cdot y_{rj} - \tau_{rk}^+ = y_{rk} + \beta_k^* \cdot R_{rk} \qquad r = 1, \dots, s \qquad (3.31c)

\sum_{j=1}^{n}\lambda_j = 1 \qquad (3.31d)

R_{rk} = \max_j\{y_{rj}\} - y_{rk} \qquad r = 1, \dots, s \qquad (3.31e)

R_{ik} = x_{ik} - \min_j\{x_{ij}\} \qquad i = 1, \dots, m \qquad (3.31f)

\lambda_j, \tau_{ik}^-, \tau_{rk}^+ \ge 0. \qquad (3.31g)
The measure is then defined ex post facto as

1 - \frac{1}{m+s}\left( \sum_{i=1}^{m}\frac{R_{ik}^{-*}}{R_{ik}} + \sum_{r=1}^{s}\frac{R_{rk}^{+*}}{R_{rk}} \right) = (1 - \beta_k^*) - \frac{1}{m+s}\left( \sum_{i=1}^{m}\frac{\tau_{ik}^{-*}}{R_{ik}} + \sum_{r=1}^{s}\frac{\tau_{rk}^{+*}}{R_{rk}} \right),

where R_{ik}^{-*} = \beta_k^* R_{ik} + \tau_{ik}^{-*} and R_{rk}^{+*} = \beta_k^* R_{rk} + \tau_{rk}^{+*} are the total slacks from the two phases. In this way the contribution of every input and output to the measure is clear.
3.18 Modified Slack-based Measure
Sharp et al. built their MSBM model [Shar 07] using the SBM model by Tone [Tone 01] as the base and incorporating Portela's [Port 04] ranges, in order to enable the SBM to deal with naturally negative data. A naturally negative variable is a variable with a meaningful zero (like most undesirable outputs). The model overcomes two drawbacks of SBM with negative data: it is translation invariant and does not generate negative inefficiency scores. The model is based on the assumption that at least one positive input and one positive output exist. To avoid dividing by zero whenever the corresponding range is zero, the corresponding term is dropped from the objective. Here is what they suggest:
\min_{\lambda,\, s_i^{\pm}} \rho = \frac{1 - \frac{1}{m}\sum_{i=1}^{m}\frac{w_i s_i^-}{P_{i0}^-}}{1 + \frac{1}{s}\sum_{r=1}^{s}\frac{v_r s_r^+}{P_{r0}^+}}, \qquad (3.32a)

s.t. \quad \sum_{j=1}^{n}\lambda_j \cdot x_{ij} + s_i^- = x_{i0} \qquad i = 1, \dots, m \qquad (3.32b)

\sum_{j=1}^{n}\lambda_j \cdot y_{rj} - s_r^+ = y_{r0} \qquad r = 1, \dots, s \qquad (3.32c)

\sum_{j=1}^{n}\lambda_j = 1 \qquad (3.32d)

\sum_{i=1}^{m}w_i = 1 \qquad (3.32e)

\sum_{r=1}^{s}v_r = 1 \qquad (3.32f)

P_{i0}^- = x_{i0} - \min_j\{x_{ij}\} \qquad i = 1, \dots, m \qquad (3.32g)

P_{r0}^+ = \max_j\{y_{rj}\} - y_{r0} \qquad r = 1, \dots, s \qquad (3.32h)

\lambda_j, s_i^{\pm}, w_i, v_r \ge 0. \qquad (3.32i)
They proved that the measure \rho is between zero and one and that the model is both units and translation invariant. The model can be linearized in the same fashion as SBM:
\min \tau = t - \frac{1}{m}\sum_{i=1}^{m}\frac{w_i S_i^-}{P_{i0}^-} \qquad (3.33a)

s.t. \quad X\Lambda + S^- = t \cdot x_0 \qquad (3.33b)

Y\Lambda - S^+ = t \cdot y_0 \qquad (3.33c)

t + \frac{1}{s}\sum_{r=1}^{s}\frac{v_r S_r^+}{P_{r0}^+} = 1 \qquad (3.33d)

\sum_{i=1}^{m}w_i = 1 \qquad (3.33e)

\sum_{r=1}^{s}v_r = 1 \qquad (3.33f)

\Lambda, S^{\pm}, t \ge 0 \qquad (3.33g)
At the optimal solution, we have \rho^* = \tau^*, \lambda^* = \frac{\Lambda^*}{t^*}, s^{-*} = \frac{S^{-*}}{t^*} and s^{+*} = \frac{S^{+*}}{t^*}.
The MSBM score cannot be greater than the RDM score, as the authors show through an example. The model allows for slack weight alteration, depending upon strategic or managerial preferences.
3.19 Directional distance functions and slack-based measures of efficiency
Fare and Grosskopf [Fare 10a, Fare 10b] also worked on Tone's SBM model and proposed the following, which does not require any adjustment for zero components in inputs or outputs, while being translation and units invariant.
\alpha_0 = \max\; \beta_1 + \dots + \beta_m + \gamma_1 + \dots + \gamma_s, \qquad (3.34a)

s.t. \quad \sum_{j=1}^{n}\lambda_j \cdot x_{ij} \le x_{i0} - \beta_i I_i \qquad i = 1, \dots, m \qquad (3.34b)

\sum_{j=1}^{n}\lambda_j \cdot y_{rj} \ge y_{r0} + \gamma_r I_r \qquad r = 1, \dots, s \qquad (3.34c)

\sum_{j=1}^{n}\lambda_j = 1 \qquad (3.34d)

\lambda_j, \beta_i, \gamma_r \ge 0 \qquad \forall j, i, r. \qquad (3.34e)
I_i and I_r are directional vectors of one unit of measurement corresponding to each input and output. This choice of vectors is necessary to keep the units-invariance characteristic: if input two is measured in kilograms, I_2 is one kilogram, and if it changes to grams, then I_2 is one gram. Note that \alpha_0 = 0 if, and only if, all slacks are zero; it acts like an efficiency score of one in the SBM model.
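The model can be sketched on a toy single-input, single-output data set of our own (three DMUs under VRS; the numbers are illustrative, not from [Fare 10a]). With one unit of measurement per variable, I_i = I_r = 1 below; rescaling the data rescales the I vectors with it, which is exactly how units invariance is preserved.

```python
import numpy as np
from scipy.optimize import linprog

# Toy data: 3 DMUs, one input, one output; we evaluate DMU k = (6, 3).
x = np.array([2.0, 4.0, 6.0])
y = np.array([2.0, 4.0, 3.0])
k = 2
I_in = I_out = 1.0   # one unit of each variable's measurement scale

# Variables [lambda_1..3, beta, gamma]; maximise beta + gamma.
c = np.array([0, 0, 0, -1.0, -1.0])
A_ub = np.array([
    [*x, I_in, 0],       # sum lambda_j x_j <= x_k - beta*I_in
    [*(-y), 0, I_out],   # sum lambda_j y_j >= y_k + gamma*I_out
])
b_ub = np.array([x[k], -y[k]])
A_eq = np.array([[1, 1, 1, 0, 0.0]])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0], bounds=[(0, None)]*5)

alpha0 = -res.fun
```

Here \alpha_0 = 3 (for example \beta = 2, \gamma = 1 at the frontier point (4, 4)); a value of zero would indicate all slacks are zero, i.e. full efficiency.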
3.20 Universal model for ranking
Paterson's extended non-oriented model is aimed at measuring super efficiency and ranking the units [Pate 00]. This is not our focus at all; however, because the idea of how to treat our scores was formed while reading his work, we briefly mention it here. Two existing methods for ranking units in DEA are calculating the Malmquist index and measuring super efficiency [Ande 93]. For the latter, oriented models suffer some shortcomings, which is why Paterson proposed two non-oriented models to measure super efficiency properly. We skip the radial model he proposed, but the second one is an additive model used in the Andersen and Petersen procedure, i.e., each DMU is evaluated against the data set excluding the DMU itself. This implies the DMU under study might be inside or outside the convex hull of the others. Since this information is not available a priori, the following is solved under two different situations:

\phi_i, \phi_o \ge 0 for points inside the hull, and

\phi_i, \phi_o \le 0 for points outside the hull.

If the first has a feasible answer, the point is inside; if not, the second set of constraints is applied to (3.35), as presented below:
\max_{\lambda,\, \phi_i,\, \phi_o} \sum_{i=1}^{m}\frac{1}{\sigma_i}\phi_i \cdot x_{ik} + \sum_{o=1}^{s}\frac{1}{\sigma_o}\phi_o \cdot y_{ok}, \qquad (3.35a)

s.t. \quad \sum_{j=1}^{n-1}\lambda_j \cdot x_{ij} \le (1 - \phi_i) \cdot x_{ik} \qquad \forall i = 1, \dots, m \qquad (3.35b)

\sum_{j=1}^{n-1}\lambda_j \cdot y_{oj} \ge (1 + \phi_o) \cdot y_{ok} \qquad \forall o = 1, \dots, s \qquad (3.35c)

\sum_{j=1}^{n-1}\lambda_j = 1 \qquad (3.35d)

\lambda_j \ge 0. \qquad (3.35e)
The \sigma_i and \sigma_o are the standard deviations of every input and output over all n points. It is easy to show that the above formulation is units invariant. One criticism of this formulation is that the actual inputs and outputs are included in the objective, which brings the actual size of the unit into account and might give big units an advantage, something we prefer to avoid in DEA. The formulation is similar to the Russell graph measure; however, instead of the summation of scalars, the actual radial reduction of inputs and expansion of outputs is maximized. The novelty of this approach, from our point of view, is how Paterson extended it and normalized the scores. He added a worst possible DMU to the data set, a DMU having the maximum of every input and producing the minimum of every output. He calculated the super efficiency of this worst unit and then normalized the scores of the rest with respect to the worst player.
3.21 Remarks
We have reviewed about 20 of the existing models (almost all of those we have come across), yet none of them properly supports ratio variables. In addition, none of them with linear computational complexity has the desired characteristics of units and translation invariance while also providing an efficiency score between zero and one. However, the methods for transforming scores and for achieving normalization or units- and translation-invariance properties all represent valuable information, which we use when developing our own model. The methodology pertinent to this literature review follows in Chapter 5.
Chapter 4
Literature review of approximation
models
As mentioned in the first chapter, this chapter forms the second part of the literature review pertinent to this thesis; the methodology will follow in Chapter 6. The literature around the non-linearity problem that arises in the case of the BCC model with ratio variables on the side of orientation is, in fact, scarce. The only paper that discusses the issue directly, which was mentioned in Chapter 1, dates back to 2009 and was indeed the motivation for this thesis [Emro 09].
The LP formulation discussed in Chapter 2 is a tool to solve the DEA concept mathematically. The DEA concept remains the same in our case with ratio variables; however, the well-known tool, the LP, for estimating the boundaries of the PPS based on the observed sample cannot be employed. We recast this problem as one of finding an estimate of the true production frontier from a sample of n observations (DMUs). There is a vast and rich literature focusing on the goodness of the estimate of the frontier in DEA and the statistical inferences about it [Knei 03, Dyso 10].
We have reviewed the literature on estimating the true frontier in DEA, as you will find below. The authors all pursue a goal like ours: estimating the unobserved boundaries of the PPS based on an observed sample of DMUs. The sources of uncertainty studied in DEA are embedded in the nature of choosing, defining, judging, and measuring the variables. A very good study by Dyson and Shale details various uncertainties in real-world situations about the true efficient frontiers, and the methods to deal with them [Dyso 10]. Although, for reasons that will become apparent at the end, we produce a heuristic of our own, the literature has been helpful in shaping our thoughts and giving us insight into the techniques employed mainly by statisticians. This is relevant to our case, as we also deal with problems in which a few samples of the PPS are available and we would like to know more about the actual population.
4.1 Bootstrapping and DEA
Bootstrapping is a well-established re-sampling technique for approximating the distribution of a random variable in order to estimate a value such as the mean [Efro 79, Efro 82, Efro 94]. It entails three basic steps:

• construct the sample probability distribution \Theta, giving each observation a chance of 1/n;

• draw a random sample of size n with replacement and call this the bootstrap sample; and

• approximate the bootstrap distribution of the variable of interest induced by the random mechanism above, with \Theta fixed.
The difficult part of the bootstrap procedure is the actual calculation of the bootstrap distribution. Although in a few cases direct theoretical calculation is possible, Monte Carlo approximation is often used: repeated resamples of size n are drawn from the observed data with replacement, and the corresponding values of the variable of interest are recorded. The pattern of these values is an approximation of the actual bootstrap distribution. This is the method used in DEA. In our case the variable of interest is an estimate of the true frontier: we have a sample of the PPS, our observed DMUs, and would like to estimate the boundary of the PPS.
The application of bootstrapping in the context of Data Envelopment Analysis dates back to 1995 [Gsta 95] and was later developed by Simar and Wilson [Sima 98, Sima 00, Sima 99a, Sima 99c, Sima 99b]. They claim DEA models measure efficiency relative to a non-parametric maximum likelihood estimate of an unobserved true frontier, conditional on observed data resulting from an underlying and usually unknown data generating process. They argue that because efficiency is measured relative to an estimate of the true frontier, estimates of efficiency from DEA models are subject to uncertainty due to sampling variation. Others have extended and modified the Simar and Wilson bootstrapping approach, such as [Loth 99, Tzio 12, Ferr 97, Ferr 99], and the idea has been applied to real problems, as in [Alex 10, Sadj 10], to name a few. A full review of the methods is presented in [Sima 08].
The bootstrap algorithm has the following steps:

1. Calculate the original efficiency estimates using the LP presented in Chapter 2 and transform the observed input-output vectors as follows:

\hat{\theta}_i = \min_{\theta, \lambda}\left\{ \theta : y_i \le Y \cdot \lambda,\; \theta \cdot x_i \ge X \cdot \lambda,\; \sum_{i=1}^{n}\lambda_i = 1,\; \lambda_i \ge 0 \right\} \qquad (4.1)

\left( x_i^f, y_i \right) = \left( \hat{\theta}_i \cdot x_i,\; y_i \right) \qquad (4.2)
2. Resample independently with replacement n efficiency scores from the n original estimates \hat{\theta}_i. Let \delta_i^*, i = 1, \dots, n, denote the resampled efficiencies. In some methods noise is added to the resampled efficiencies, followed by a correction mechanism; a so-called smoothing procedure championed by Simar and Wilson increases the consistency of the bootstrap estimator (for a detailed discussion, please refer to [Sima 00]). Independent of the procedure, we will have n randomly selected efficiency scores with which to generate bootstrap pseudo-data.
3. Let the bootstrap pseudo-data be given by

\left( x_i^*, y_i^* \right) = \left( x_i^f / \delta_i^*,\; y_i \right). \qquad (4.3)
4. Estimate the bootstrap efficiencies using the pseudo-data and the linear program of step 1:

\theta_i^{*b} = \min_{\theta, \lambda}\left\{ \theta : y_i \le Y \cdot \lambda,\; \theta \cdot x_i^* \ge X^* \cdot \lambda,\; \sum_{i=1}^{n}\lambda_i = 1,\; \lambda_i \ge 0 \right\}. \qquad (4.4)

Again, there is a debate among scholars as to whether the efficiency estimate of the i-th DMU should be evaluated as the efficiency of the original input, or of the pseudo-input, relative to the boundary of the convex and free-disposal hull of the pseudo-observations. Lothgren claims the latter eliminates the complex smoothing procedure of Simar and Wilson [Loth 98].
5. Repeat steps 2-4 B times to create B bootstrap estimates for each DMU's efficiency.
These estimate the distribution of the efficiency score, its mean (which is
of interest), and the confidence intervals; the 95% confidence interval lies between
the 2.5th and 97.5th percentiles. B is usually chosen as 1000, following the
recommendation of Efron and Tibshirani [Efro 94].
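The five steps above can be sketched as a naive (unsmoothed) bootstrap in Python. The input-oriented VRS LP of (4.1) is solved with scipy.optimize.linprog; the function names and the small B in the example are illustrative choices, not from the text:

```python
import numpy as np
from scipy.optimize import linprog

def dea_vrs_input(X, Y, k):
    """Input-oriented VRS efficiency of DMU k, as in (4.1).
    X is (m, n) inputs, Y is (s, n) outputs; columns are DMUs."""
    m, n = X.shape
    s = Y.shape[0]
    c = np.r_[1.0, np.zeros(n)]                       # minimize theta
    # Rows: X.lambda - theta*x_k <= 0  and  -Y.lambda <= -y_k
    A_ub = np.vstack([np.c_[-X[:, [k]], X],
                      np.c_[np.zeros((s, 1)), -Y]])
    b_ub = np.r_[np.zeros(m), -Y[:, k]]
    A_eq = np.c_[np.zeros((1, 1)), np.ones((1, n))]   # sum(lambda) = 1 (VRS)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(None, None)] + [(0.0, None)] * n)
    return res.x[0]

def dea_bootstrap(X, Y, B=1000, seed=0):
    """Steps 1-5, without the smoothing of step 2 (naive resampling)."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    theta = np.array([dea_vrs_input(X, Y, k) for k in range(n)])  # step 1
    Xf = theta * X                                    # frontier inputs, (4.2)
    boot = np.empty((B, n))
    for b in range(B):
        delta = rng.choice(theta, size=n)             # step 2: resample scores
        Xstar = Xf / delta                            # step 3: pseudo-data, (4.3)
        boot[b] = [dea_vrs_input(Xstar, Y, k) for k in range(n)]  # step 4, (4.4)
    return theta, boot                                # step 5: B estimates per DMU
```

Percentiles of each column of `boot` then give the confidence interval described in step 5.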
There is another variation of bootstrapping used in DEA, called bootstrap with
sub-sampling. This method allows a smaller sample of size m < n to be drawn from the
original n observations for bootstrapping. As in the original bootstrap,
repeated samples are drawn uniformly, independently and with replacement [Knei 03].
Through an algorithm detailed in their work, the procedure creates a consistent estimate
similar to the smoothed version.
The main motivation to employ bootstrapping in Simar and Wilson's work was to account
for measurement error in inputs and outputs. They wanted to study the statistical
properties of the non-parametric frontier and were interested in confidence intervals.
In the presence of ratio variables, because of the non-linear formulation of the efficiency
estimates, we have no estimate of the true frontier to begin with. With knowledge
of the case under study, for instance bank branch performance, we can hypothesize
a sensible distribution for the efficiency scores, but without an operational mathematical
program we have no way of estimating original frontier inputs for each branch. The data
generating process assumed by most of these studies is based on a random deviation from
the input frontier for a given output; hence we need a sample of efficiency scores,
the original outputs, and the estimate of the efficient input level, which is not available
to us. Nevertheless, we use the idea and define a data generating process for the PPS based
on re-sampling with replacement from the observed DMUs. Because we are interested
in samples perceived to be closer to the true frontier, the sub-sampling idea is of
interest, and the next logical step is to study sampling techniques.
4.2 Sampling techniques
The Monte Carlo method was first publicly introduced in 1949 to approximate integrals that
were hard to evaluate [Metr 49]. It developed quickly as applications in physics, business,
computing, finance, and engineering adopted it. Every Monte Carlo calculation, such
as the one used in the bootstrapping seen in the previous section, requires repeated
sampling of random events that, in some way, represent or define the phenomenon of
interest. This repeated sampling is a way of simulating the behaviour of the phenomenon,
and the samples or simulation can then be used to approximate properties
of interest. For well-known distributions, standard sampling methods exist and
have been incorporated into most computer packages. For real-life problems, where the
variables representing the event do not fit the standard distributions, other techniques
have been developed to decrease the chance of misrepresenting the population.
Popular ones are importance sampling, rejection sampling (Stan Ulam and John von
Neumann), Metropolis sampling [Metr 53], Metropolis-Hastings sampling [Hast 70] and
Gibbs sampling [Gema 84]. Among these, rejection sampling is of practical interest to
us, as explained next.
Rejection sampling
The idea behind rejection sampling is to use standard techniques to draw samples
from an envelope distribution that lies over the required distribution. By
accepting some draws and rejecting others, the resulting sample mimics the population of
interest. We use this idea later, in Chapter 6, to reject samples that have a small chance
of being close to the efficient frontier.
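As a minimal sketch of the accept/reject idea (the triangular target density and the uniform envelope below are illustrative choices, not from the text):

```python
import numpy as np

def rejection_sample(target_pdf, envelope_sample, envelope_pdf, M, n, seed=0):
    """Draw n samples from target_pdf by sampling an envelope that
    satisfies target_pdf(x) <= M * envelope_pdf(x) everywhere, then
    accepting each draw x with probability target/(M*envelope)."""
    rng = np.random.default_rng(seed)
    out = []
    while len(out) < n:
        x = envelope_sample(rng)
        if rng.uniform() < target_pdf(x) / (M * envelope_pdf(x)):
            out.append(x)             # accept: keep the draw
    return np.array(out)              # rejected draws are simply discarded

# Example: triangular density f(x) = 2x on [0, 1] under a uniform envelope.
samples = rejection_sample(lambda x: 2.0 * x,
                           lambda rng: rng.uniform(0.0, 1.0),
                           lambda x: 1.0, M=2.0, n=2000)
```

The tighter the envelope (the smaller M can be made), the fewer draws are wasted on rejections.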
Markov Chain Monte Carlo Methods (MCMC)
The Monte Carlo simulation algorithm simulates independent random values from the
probability distribution of interest. In MCMC algorithms, by contrast, there is dependence
between simulated values: an MCMC algorithm explores the distribution of interest
through small random jumps.

Metropolis is one of the MCMC algorithms. It constructs a Markov
chain by proposing small probabilistic symmetric jumps centered on the current state of
the chain. These are either accepted or rejected according to a specified probability,
and in the case of rejection the next step in the chain equals the previous step [Metr 53].
Hastings later generalized the algorithm, so the jumps in the Metropolis-Hastings algorithm
do not have to be symmetric and the acceptance probability is adjusted
accordingly [Hast 70]. Gibbs sampling is a special case of the Metropolis-Hastings
approach that uses conditional probabilities for the steps.
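The Metropolis step described above can be sketched compactly; the Gaussian proposal width and the standard-normal target in the test are illustrative assumptions:

```python
import numpy as np

def metropolis(log_pdf, x0, n, step=1.0, seed=0):
    """Metropolis sampler with a symmetric Gaussian proposal: accept a
    jump with probability min(1, p(x')/p(x)); on rejection the chain
    repeats its current state [Metr 53]."""
    rng = np.random.default_rng(seed)
    chain = np.empty(n)
    x = x0
    for t in range(n):
        proposal = x + step * rng.normal()
        if np.log(rng.uniform()) < log_pdf(proposal) - log_pdf(x):
            x = proposal              # accept the symmetric jump
        chain[t] = x                  # rejected: previous state repeats
    return chain
```

Working with log densities avoids underflow; a Metropolis-Hastings variant would additionally divide by the (possibly asymmetric) proposal densities in the acceptance ratio.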
There are examples in the literature where the above-mentioned techniques are used
alongside DEA. For instance, Gibbs sampling has been used in stochastic frontier analysis
together with DEA to improve estimates of the efficiency [Tsio 03]. For a two output and
one input case under VRS assumptions, Gstach used DEA to generate a frontier estimate
and devised a DGP based on output proportions. The author used Metropolis-Hasting
method to statistically define the unobserved output targets given an input and output
mix [Gsta 03]. However, in none of these works was the Monte Carlo method used to obtain
the DEA efficiency estimates themselves, as we intend to do.
4.3 Summary
Although bootstrap and Monte Carlo techniques are both used to simulate a statistical
measure by repetitive sampling, they are different in many ways. The main difference is
that in the bootstrap a readily available sample of size n from the population exists,
and re-sampling is based on it, whereas in Monte Carlo we have no samples to begin with
and must construct a proper data generating mechanism to provide a sample
of size n that represents the population.
Monte Carlo is mostly used in designed experiments, where the production function,
the inputs and outputs, and the true efficiency scores are known. The goal is to compare
different models according to the efficiency scores they produce and how those scores
deviate from the true frontier. The general form of the production function is
Y = f(X) + error, where Y and X are random variables representing the output and input
vectors. The production function needs a priori information about the underlying
technology (CRS, VRS, ...). Errors consist of noise and technical inefficiency. Here x, a
realization of X, is drawn randomly and the corresponding y is derived according to the
formula. Banker et al. [Bank 93, Bank 87] used Monte Carlo to simulate decision making
units to compare DEA with some statistical measures; [Gong 92] does the same to compare
DEA and stochastic frontiers.
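The experimental data generating process sketched above can be written out as follows; the Cobb-Douglas frontier f(x) = x^0.5 and the half-normal inefficiency are illustrative choices, not taken from the cited studies:

```python
import numpy as np

def simulate_dmus(n, seed=0):
    """Monte Carlo DGP of the form Y = f(X) + error, with the error
    split into one-sided technical inefficiency and symmetric noise."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(1.0, 10.0, n)          # draw realizations of X
    u = np.abs(rng.normal(0.0, 0.3, n))    # technical inefficiency >= 0
    v = rng.normal(0.0, 0.05, n)           # symmetric measurement noise
    y = np.sqrt(x) * np.exp(v - u)         # observed output below frontier
    true_eff = np.exp(-u)                  # known true efficiency scores
    return x, y, true_eff
```

Because the true efficiencies are known by construction, any DEA model run on (x, y) can be scored against them, which is exactly how such comparison studies proceed.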
For our work, however, we are interested in the frontier, for which, as in Monte Carlo,
we have no samples to begin with, nor a direct mathematical formula to generate it.
We do, however, have a sample of the PPS (the observed DMUs), as in the bootstrap,
and we can sub-sample from it to explore the PPS. We can estimate the PPS partially
from those samples and then, via an LP, find close-to-frontier estimates. The details
appear in the methodology in Chapter 6.
Chapter 5
Methodology: Proposed Non-oriented Model
In Chapter 3 we reviewed the DEA literature on existing non-oriented models,
and in Chapter 1 we delved into the difficulties and encountered the ratio-variable
problems [Holl 03, Emro 09, Siga 09] caused by DEA's nonlinear formulation. Here, we
seek a model that can be transformed into a form amenable to solution by
linear programming or, at least, fractional linear programming when ratio variables
are mixed with non-ratio variables. This is accomplished in two parts. The first is
to redefine the DEA model with the use of ratios in mind, which avoids
incorrect calculation of the PPS. The second is to build a non-oriented model that
satisfies a list of desirable properties, which we compiled from the literature. More
importantly, our model should remain computationally feasible when the ideas from the
first part are applied to it.
5.1 Required adjustments to the basics of DEA
Let us first rework DEA by going back to the original concept of Farrell [Farr 57].
We break the DEA procedure into three parts: first, we define the production possibility
set (PPS) to include all feasible units, both the observed units and interpolations
of them; second, we identify the efficient units; and third, we measure the changes the
inefficient units need to make in order to become efficient.
5.1.1 Defining PPS
DEA is very data-oriented: the perceived potential of a business wholly
depends on the observed data. There are various assumptions/rules that allow
interpolation/extrapolation of possible units based on the observed points. This means
that no matter what the assumption is, once the sampled/observed data varies, the PPS
may change; as a result, DEA's sensitivity to the data at hand is significant.
A remedy is comprehensively sampled data covering good, average and weak
performers, based on educated guesses or the recommendation of management.
For example, in the banking industry, data from branches well spread over different
regions (provinces, urban, metropolitan, and rural) would mitigate this sensitivity.
It is worth noting that adding new data (a new unit) can only decrease the efficiency
scores. Popular assumptions, that is, the accepted rules by which the points of the PPS
are interpolated/extrapolated from the observed points, are listed below; the choice of
an appropriate model depends on domain knowledge of the industry.
5.1.2 Disposability
If a unit with input X and output Y is observed, then any unit worse than it,
with inputs larger than or equal to X and outputs smaller than or equal to Y, can be
realized. This is a fundamental assumption for forming the PPS from the available data.
Free (or strong) disposability does not impose any rules on "being smaller"
or "larger". Weak disposability defines a rule: for example, when (X, Y) is observed,
(X, αY) can be realized if 0 ≤ α ≤ 1 [Rolf 89]. A combination of weak and strong
disposability might be applied, depending on how the inputs/outputs are defined, the
nature of those inputs/outputs, and the degree to which various outputs are linked
together. For example, in the energy production sector, for the same input, harmful
emissions cannot be reduced without reducing the electricity actually generated or
introducing new technology.
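The free-disposability rule above amounts to a one-line dominance check; a sketch (the function name is ours):

```python
import numpy as np

def feasible_by_free_disposal(x, y, X_obs, Y_obs):
    """Under free (strong) disposability alone, (x, y) is feasible iff
    some observed DMU j uses no more input and produces no less output:
    x >= x_j and y <= y_j componentwise. Columns of X_obs/Y_obs are DMUs."""
    x, y = np.asarray(x), np.asarray(y)
    for xj, yj in zip(X_obs.T, Y_obs.T):
        if np.all(x >= xj) and np.all(y <= yj):
            return True
    return False
```

This is exactly the feasibility notion behind the free-disposal hull (FDH): the PPS is the union of such dominated regions, one per observed unit.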
5.1.3 Convexity
If any two units are attainable, any unit representing a weighted average of them is
feasible. The convexity assumption has its roots in economic theory. Although it sounds
practical, some have challenged the limitation that convexity imposes on attaining the PPS.
Deprins et al. [Depr 84] claim that it is not a good fit for the data at hand to generate
better performers that eventually reduce the efficiency of the observed points; they point
out that it is, instead, good for creating future projections [Tulk 93]. There are other
valid concerns about convexity in a number of industries, particularly production.
Cherchye et al. have documented this issue quite well [Cher 99].
The convexity axiom has some limitations. Convexity requires production
activities (which usually translates to inputs and outputs) to be divisible [Farr 59], and
this is not certain in every business. For example, some raw materials come in batches,
so 1.2 batches is infeasible, and you cannot have 1.5 workers. In general,
indivisibility is not a huge issue because variables are divisible to some extent and the
problem can be solved by approximation. The other issue is economies of scale, or
increasing marginal productivity. Say that increasing inputs twofold results in output
levels more than doubling; since that happens beyond a certain level, like X > 3.5T, the
weighted average of two units, one enjoying economies of scale (4T, 6Y) and one
not (2T, 2Y), might be something infeasible like (3T, 4Y). A simple remedy would
be to partition the data into sets with similar characteristics, like units with X < 3.5T
in one group and X > 3.5T in another. The same argument can be used to show that
diseconomies of scope, such as selling as a bundle, would cause a problem, but
treating the bundle as a product, rather than breaking it down, would at least partially
solve the problem.
The aforementioned problems with economies of scale, divisibility, and diseconomies
of scope [Bous 09] have little relevance to our problem of interest, so we use the
convexity assumption in our modeling. For research dealing with risk aversion, the
convexity assumption should be used with caution or avoided entirely. While generating
the PPS, we also need to consider customized rules/limitations based on practicality,
experts' opinion and so on.
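The convexity axiom can also be checked computationally: a candidate unit lies in the convex hull of the observed units iff a small feasibility LP has a solution. A sketch using scipy (the function name is ours):

```python
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(p, P):
    """Is point p a convex combination of the columns of P?
    Solve for lambda >= 0 with P @ lambda = p and sum(lambda) = 1."""
    d, n = P.shape
    A_eq = np.vstack([P, np.ones((1, n))])
    b_eq = np.r_[np.asarray(p, dtype=float), 1.0]
    res = linprog(np.zeros(n), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0.0, None)] * n)
    return res.status == 0            # status 0 = optimal, i.e. feasible
```

The all-zero objective makes this a pure feasibility problem; any feasible λ exhibits the candidate as a weighted average of observed units.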
5.1.4 Identifying the efficient units
Contrary to the general perception that identifying efficient units depends on how one
defines efficiency, we argue that the efficient units depend on how one defines the PPS in
a DEA model, and less on the type of efficiency. Basically, efficient units sit on the
frontier of the PPS. We might look only at specific parts of this frontier, or frontier
facets [Ali 93, Oles 03, Apar 07, Frei 99], depending on the type of efficiency/orientation.
All frontier units are radially efficient either in terms of input or output, whereas
slack-based efficient units are a subset of the radially efficient units and might not
include all of them. Frontier units do not change for a given PPS; the efficiency scores
of the inefficient ones might change, though, depending on how we measure the distance
(radial, slack, input/output orientation) to the efficient frontier. Next, we look at
the ways the distance/path to the frontier is defined.
5.1.5 Calculating the relative efficiency score
Relative efficiency scores are widely known as scalars that determine how much a unit
must decrease all its inputs or increase all its outputs to become technically efficient. As
explained in Chapter 2, a DMU is Pareto-efficient if it bears no slacks/shortfalls in any
of its inputs/outputs. For a Pareto-efficient unit, any reduction in any of the inputs, or
any increase in any of the outputs, would make the unit infeasible unless we, respectively,
increase some other inputs or decrease some other outputs. Note that the amount of
substitution and trade-off is not part of the decision-making process here. As a result,
radial efficiency scores do not reflect the possibility of input or output substitution, or
of removing slacks in general. The CCR model of Charnes et al. [Char 78], the BCC
model of Banker et al. [Bank 84] and even the FDH model [Depr 84, Tulk 93] calculate
radial efficiency. Cooper et al. and Tone later modified the BCC and CCR models by
adding a second stage that examines whether the target (a unit on the frontier) can be
further improved by eliminating slacks [Coop 04, Tone 99].
Radial efficiency has its own merits, such as good reference selection: close targets
require minimal changes for an inefficient unit to become efficient [Conc 03]. Since
the goal is to project the unit onto a part of the frontier that requires minimum change
in inputs/outputs, the unit is compared with the ones in its league; in other words, we
evaluate the unit in the best possible light. Radial efficiency is sometimes called the
Farrell-Debreu efficiency measure [Coop 99a].
Non-radial models first appeared in Fare and Lovell's work in 1978 [Fare 78]. Their
model was designed to reduce slacks in inputs or outputs, but not both, using an
individual scalar for every input or output instead of the same scalar for all, as in CCR.
They later combined the two and defined the Russell graph measure [Fare 85]. Then Charnes
et al. [Char 85] produced the additive model, with the flexibility to change both inputs
and outputs and the goal of eliminating slacks as much as possible. Others also worked on
non-radial efficiency measures, such as Zieschang [Zies 84], who created a hybrid
measure of Farrell and Russell's input efficiency, and Green et al. [Gree 97], who
suggested calculating efficiency as the ratio of current levels to the optimal ones and
adding them up; they, in fact, created a nonlinear additive model with a meaningful
bounded measure. Thanassoulis and Dyson [Than 92] made a hybrid of the Russell and
additive models, forcing the preferred variables to have zero slacks while the other
variables' slacks are reduced as much as possible. The score, however, does not convey an
operational meaning, the method is mainly used for target setting, and the selection of
weights is fully subjective. Ruggiero et al. [Rugg 98] came up with a weighted Russell
measure, in which performance depends on the right choice of weights; if the
relative weights are biased, distortions might be introduced. If there is one output,
ordinary least squares regression can be used to choose the weights; for multiple-output
cases, canonical regression analysis is the optimal method, and to avoid bias it is
recommended that the canonical regression be performed on the group of Farrell-efficient
units. Other models include the model for imprecise data by Cooper et al. [Coop 99b],
models for dealing with congestion [Fare 85, Coop 01a, Coop 01b, Broc 98, Cher 01],
and a model for environmental performance when undesirable outputs are
present by Zhou et al. [Zhou 07].
With the prior knowledge that radial models do not work at all and cannot be
linearized when ratio and non-ratio variables are mixed, we focus on non-radial models
to build the desired non-oriented model.
5.2 Building the right measure of Efficiency
When inputs and/or outputs are in the form of ratios, conventional linear programming
methods may fail to build the correct frontier and, as a result, the scores and projections
are distorted. Recall that the main concern about using ratios as input or output variables
in conventional DEA is that DEA estimates the PPS from the n available data points.
Conventional DEA identifies each DMUi by its production process (xi, yi) and works with
linear combinations of the (xi, yi). As long as xi and yi do not contain any ratios,
working with inputs and outputs translates exactly into working with DMUs. However,
merging inputs and outputs is not equivalent to merging DMUs
when ratios are involved. For example, the composite output Y_r is given by:

[Y_r = N/D]_{DMU_composite} = (λ_1%)·[Y_{r1} = N_1/D_1] + ... + (λ_n%)·[Y_{rn} = N_n/D_n],  (5.1)

which is not equal to the output of a composite DMU, Y*_r, in (5.2), given by:

[Y*_r = N/D]_{DMU_composite} = ((λ_1%)(N_1) + ... + (λ_n%)(N_n)) / ((λ_1%)(D_1) + ... + (λ_n%)(D_n)).  (5.2)
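A quick numeric check of the gap between (5.1) and (5.2), with illustrative numbers:

```python
import numpy as np

# Two DMUs with a ratio output y = N/D (numbers are illustrative).
N = np.array([1.0, 9.0])      # numerators
D = np.array([2.0, 10.0])     # denominators
lam = np.array([0.5, 0.5])    # convex weights, summing to 1

# (5.1): convex combination of the observed ratio outputs
Y_r = lam @ (N / D)                    # 0.5*0.5 + 0.5*0.9 = 0.70
# (5.2): ratio formed from the composite DMU's numerator and denominator
Y_r_star = (lam @ N) / (lam @ D)       # 5.0 / 6.0 = 0.8333...
```

The two quantities differ (0.70 vs. about 0.83), which is exactly why a linear combination of DMUs does not translate into a linear combination of their ratio outputs.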
Hollingsworth et al. [Holl 03] believed that the BCC formulation is the appropriate
model when ratios are involved because Y*_r = Y_r ⇒ Σ_i λ_i = 1. However, as we proved
in [Siga 09], for the radial DEA approach to be valid when using ratios, the BCC model
is a necessary condition, but not a sufficient one. Emrouznejad [Emro 09]
examined the problem and suggested two models. The first breaks ratios down into their
numerator and denominator parts and treats one part as input and the other as output.
The second is based on the idea of working with DMUs rather than their production
processes; we will use the latter to build our models. We do not go through the details
of Emrouznejad's models because they have been covered in [Siga 09], where we
added a second stage to Emrouznejad's second model to eliminate any remaining slacks
after the radial projection onto the frontier. Success with this case has inspired us to
build a model that seeks targets with zero slacks.
5.2.1 Proposed non-oriented model
This research is directed toward non-oriented measures of efficiency in situations where
the ratio form for some inputs and outputs is selected based on managerial judgment.
We want our model to capture all inefficiencies (zero slacks) and our measure
to incorporate all the identified inefficiencies into a single real number, which, by
itself, requires the whole process to be operationalized by linear programming.
As we saw in the literature review of Chapter 3, the development of non-oriented
models that aim to assess performance by decreasing inputs while increasing outputs has
been a topic of interest since 1992, with the most recent paper in 2010.
Among the scholars working on this topic, Pastor has contributed the most, in general. A
quick review is presented in Table 5.1. We restate the definitions to clarify
the types of improvements which non-oriented models suggest. There are two constructs
in every model: input inefficiency and output inefficiency. For each construct, the
potential for improvement is measured, and how it is measured depends on the technology
and methodology the analyst has in mind. It could be overall radial, scaling
up/down all the components of the input or output simultaneously; item-radial, scaling
up/down every component of the input/output while keeping the others constant and
aggregating them by summation or multiplication; or based on the distances from
the frontier, with the preference/importance of the distances decided by value
judgment and, again, aggregated by summation or multiplication.
The next step usually involves aggregating the two constructs of input efficiency and
output efficiency, which can be done simply by adding them or by taking their ratio.
At this stage, the concern is to define the measure in a way that satisfies certain
required/preferred properties [Coop 99a, Love 95a, Coop 11]. We have compiled a list of
properties found in the literature; hardly any existing model meets all the requirements.
Here, we list the desired properties, with brief descriptions, to identify what, for us,
has been important and considered while developing our model. The mathematical
properties that we want our model, and the resulting efficiency score Ω, to satisfy are:
1. 0 ≤ Ω ≤ 1;
2. Ω = 1 only if the unit is fully efficient and Ω = 0 means fully inefficient;
3. Ω is dimensionless: invariant to change of units;
4. Ω is not affected by change of origin or shifts in data: invariant to translation;
5. Ω is invariant to alternative optima;
Table 5.1: Non-oriented models and their properties

Year | Name                                                 | Score range | Meaning       | Type               | Units invariant | Translation invariant | Zero inputs/outputs | Computational degree
1985 | Russell Model                                        | 0-1         | No slack      | non-Radial         | Y               | N                     | Y/N                 | Non-linear fractional
2012 | Refined Russell Model                                | 0-1         | No slack      | non-Radial         | Y               | N                     | Y/Y                 | Non-linear multi-step
1982 | Multiplicative Model (log efficiency)                | 0-1         | log efficient | Log non-Radial     | N               | N                     | N/N                 | log-linear multi-step
1983 | Invariant Multiplicative Model                       | 0-1         | log efficient | Log non-Radial     | Y               | N                     | N/N                 | log-linear multi-step
1985 | Pareto efficient empirical production function model | -infinity-0 | No slack      | non-Radial         | N/Y             | Y/N                   | N/N                 | Linear
1995 | Global Efficiency measure                            | 0-1         | No slack      | non-Radial         | Y               | N                     | N/N                 | Non-linear/Linear
1999 | Range Adjusted Measure                               | 0-1         | No slack      | non-Radial         | Y               | Y                     | N/N                 | Linear
1994 | Constant weighted additive model                     | 0-1         | No slack      | non-Radial         | N               | N                     |                     | Linear
1995 | Normalized weighted Additive model                   | -infinity-0 | No slack      | non-Radial         | Y               | Y                     | Y/Y                 | Linear
1999 | Enhanced Russell Measure                             | 0-1         | No slack      | non-Radial         | Y               | N                     | N/N                 | fractional non-linear/Linear
1996 | Directional Distance Function                        |             | radial eff    | radial directional | Y               | N                     | N                   | Linear
1985 | Graph Hyperbolic measure                             | 0-1         | radial eff    | radial efficiency  | Y               | N                     | N/N                 | non-linear
2011 | Bounded Adjusted measure                             | 0-1         | No slack      | non-Radial         | Y               | N (Y in VRS case)     | N/N                 | Linear
2001 | Slack Based measure                                  | 0-1         | No slack      | non-Radial         | Y               | N                     | Y/Y                 | Linear
2004 | Range Directional Model                              | 0-1         | technical eff | non-Radial         | Y               | Y                     | Y/Y                 | Linear
2007 | Modified Slack based measure                         | 0-1         | No slack      | non-Radial         | Y               | Y                     | Y/Y                 | Linear
2010 | Slack free RDM                                       | 0-1         | No slack      | non-Radial         | Y               | Y                     | Y/Y                 | Linear
6. Ω is isotonic; and
7. Ω possesses discriminating power.
The first and second properties conform to common practice in other disciplines
as well as the main DEA literature, where 100% means fully efficient. The third property
is of great value and guarantees that the solution is not affected by a change in units of
measurement; we saw an example earlier of how the absence of this property can lead to
misjudgments. The fourth property is essential when dealing with zero and negative data:
it allows adding a constant to the data in order to transform the negative/zero elements
into positive values while remaining assured that Ω is unaffected. However, the reader
should bear in mind that there exist models which do not possess this property yet can
deal with negative or zero data in some other fashion. Property five maintains the
independence of the efficiency score from the reference set (benchmark): the route to
efficiency might differ, but the amount of overall inefficiency to address is not affected
by the choice. Property six says that for every inefficient unit, improvement in any
element, while holding the rest fixed, results in an improved efficiency score. We do not
seek this strict monotonicity for the efficient units, though, because Pareto efficiency,
by definition, means no further improvement is possible; moreover, weak monotonicity at
extreme points ensures robustness [Coop 99a]. The last property is desired so that the
model/measure makes use of the full range between zero and one. In many cases, the
density function of the score is sharply peaked in a narrow range, which means that the
majority of scores are compressed there, preventing proper discrimination. This can be
checked using kernel density estimation, as used in a very recent research paper [Chen 14].
A summary of the models and the extent to which they meet the desired characteristics is
given in Table 5.2.
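Discriminating power can be inspected as described, via a kernel density estimate of the scores; a sketch using scipy.stats.gaussian_kde, with made-up scores:

```python
import numpy as np
from scipy.stats import gaussian_kde

def score_density(scores, grid=None):
    """Kernel density estimate of efficiency scores on [0, 1]; a sharp
    peak near 1 signals that the measure discriminates poorly."""
    if grid is None:
        grid = np.linspace(0.0, 1.0, 101)
    return grid, gaussian_kde(np.asarray(scores))(grid)

# Illustrative scores compressed near 1: the density concentrates there.
scores = [0.93, 0.95, 0.96, 0.97, 0.98, 0.99, 0.99, 0.55]
grid, dens = score_density(scores)
```

A flat, spread-out density over [0, 1] would indicate the opposite: a measure that uses the full range and separates units well.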
The thought process we went through in the search for the proper model is
depicted in Figure 5.1. With prior knowledge that radial models will not work at all in
Table 5.2: Summary of models and desired properties (P = the model possesses the property, O = it does not; columns follow the model order RGM, RRM, MUM, IMUM, ADDM, GEM, RAM, CWADDM, NWADDM, ERM, SBM, DDF, DSBMI, GHM, BAM, RDM, IRDM, SFRDM, MSBM)

0 ≤ Ω ≤ 1:                             P P O O O P P P O P P P O P P P P P P
Ω = 1 iff fully efficient:             P P O O O P P P O P P O O O P O O P P
Ω = 0 iff fully inefficient:           O O O O O P O O O O O O O P O O P P
Ω is units invariant:                  P P O P O P P O P P P P O P P P P P P
Ω is translation invariant:            O O O O P O P O P O O O P O P P O P P
Ω is strongly isotonic:                O P P P P P O P P P P O P O O P P O P
Ω is invariant to alternative optima:  P P P P P P P P P P P P P P P P P P P
Ω possesses discriminating power:      P P P O O O O P O P P P O P P O O P P
Computational degree:                  O O O P P P O P P P O P P P P P P P P P

Abbreviations: Russell Graph Model = RGM, Refined Russell Model = RRM, Multiplicative Model (log efficiency) = MUM, Invariant Multiplicative Model = IMUM, Additive model = ADDM, Global Efficiency measure = GEM, Range Adjusted Measure = RAM, Constant weighted additive model = CWADDM, Normalized weighted Additive model = NWADDM, Enhanced Russell Measure = ERM, Slack-based measure = SBM, Directional Distance Function = DDF, Directional slack-based measure of inefficiency = DSBMI, Graph Hyperbolic measure = GHM, Bounded Adjusted measure = BAM, Range Directional Model = RDM, Inverse Range Directional Model = IRDM, Slack free RDM = SFRDM, Modified Slack-based measure = MSBM.
Figure 5.1: The blueprint to construct a non-oriented DEA model. (The figure maps the input-minimization and output-maximization constructs, each full radial, item radial, or non-radial additive/multiplicative, into an aggregation step by ratio, summation, or multiplication, with min or max.)
our case, we focus on non-radial models. We will deal with n DMUs in technology T,
with m inputs x ∈ R^m and s outputs y ∈ R^s. We assume all data is strictly positive
for now; we will relax this assumption later. The production possibility set (PPS) is
defined as {(x, y) | x can produce y}. The production function P : R^m → R^s is given by
P(x) = max{y | (x, y) ∈ PPS}, so P(x) is the maximum attainable output from resource
level x. In the context of relative evaluation, in practice we study a bundle like (x_0, y_0),
and we can define P(x_0) on a subset of the PPS and look for the maximum attainable output
from resource level x_0 that produces at least y_0. We also define the set
L(y) = {x | P(x) ≥ y}, the set of resources that can produce at least y. The set isoqL(y)
consists of x that cannot be reduced radially, i.e. at least one of the inputs is fully
efficient (zero slack), whereas the set effL(y) contains x which are fully efficient at
every input, with zero slacks. Clearly effL(y) ⊆ isoqL(y). These two sets are defined as
follows:

isoqL(y) = {x | x ∈ L(y), θx ∉ L(y) for θ ∈ [0, 1)},

effL(y) = {x | x ∈ L(y), and x′ ≤ x, x′ ≠ x ⇒ x′ ∉ L(y)}.
From here, we could define various functions to capture input inefficiency and output
inefficiency separately and aggregate them appropriately so as to satisfy the desired
attributes mentioned earlier. Then comes the important question of operationalizing
the idea, which enables us to compute the measure.
5.2.2 Model in the making
Going back to Figure 5.1, we go through every possible known model and evaluate
whether it is a good fit for our case, which essentially means investigating whether
ratios can mix with non-ratio (normal) variables, where the convexity axiom holds, in
the formulation without any distortion of the frontier. As mentioned in Chapter 1, with
reference to [Cook 14], by a ratio variable we mean a variable composed of two other
variables, specific to the DMU. Therefore a change of units/scaling, e.g. the various
forms of reporting grades in school subjects, is not seen as a ratio variable, and any
standard units-invariant DEA model would give valid results. We consider cases where
the convexity assumption is applicable and the components of the ratios are known. For
this to happen, we ask the following research questions:
• How can we formulate the convex hull of PPS?
• Can it be operationalized via linear programming?
• To what extent does it meet our preferred properties?
Individual Scalars
Let us start with individual scalars, as in the Russell measure [Fare 85], in which the
efficiency function is given by aggregating the following constructs:

Ω_I(x, y) = min { Σ_{i=1}^m θ_i | (θ_1·x_{1k}, ..., θ_m·x_{mk}) ∈ L(y), 0 < θ_i ≤ 1 },

Ω_O(x, y) = max { Σ_{i=1}^s φ_i | x ∈ L(y_{1k}·φ_1, ..., y_{sk}·φ_s), φ_i ≥ 1 }.
Before deciding on the measure formulation, we need to clarify what $(\theta_1 x_{1k}, \ldots, \theta_m x_{mk}) \in L(\phi_1 y_{1k}, \ldots, \phi_s y_{sk})$ means for the ratio variables. Given the observed data, we can construct the empirical PPS, which is the convex hull of the observed units. The collection of constraints that defines the empirical PPS is given by:
\begin{align*}
\frac{\sum_{j=1}^{n} \lambda_j \, nx_{ij}}{\sum_{j=1}^{n} \lambda_j \, dx_{ij}} &\le \theta_i x_{ik}, \quad x_{ik} = nx_{ik}/dx_{ik} && \forall i = 1, \ldots, q; \\
\sum_{j=1}^{n} \lambda_j \, x_{ij} &\le \theta_i x_{ik} && i = q+1, \ldots, m; \\
\sum_{j=1}^{n} \lambda_j \, y_{ij} &\ge \phi_i y_{ik} && \forall i = r+1, \ldots, s; \\
\frac{\sum_{j=1}^{n} \lambda_j \, ny_{ij}}{\sum_{j=1}^{n} \lambda_j \, dy_{ij}} &\ge \phi_i y_{ik}, \quad y_{ik} = ny_{ik}/dy_{ik} && \forall i = 1, \ldots, r; \\
\sum_{j=1}^{n} \lambda_j &= 1; \qquad \lambda_j \ge 0.
\end{align*}
We can define some possible measures such as:
\[
1)\ \frac{\Omega_I/m}{\Omega_O/s}, \qquad 2)\ \frac{\Omega_I}{m} + \frac{\Omega_O^{-1}}{s}, \qquad \text{and}\quad 3)\ \frac{\Omega_I + \Omega_O^{-1}}{m + s}.
\]
It is trivial but worth mentioning that in the process of aggregating constructs, the constraints of each construct should hold for the others as well, since aggregation is not done ex post facto but considers both constructs simultaneously. The above constraints can be linearized by the following change of variables:
\[
\theta_i \lambda_j = \omega_{ij}, \qquad \phi_i \lambda_j = \chi_{ij}.
\]
The first measure, although nonlinear, can theoretically be transformed into a linear program via fractional programming, in the same way as in [Past 99b]. The way we rewrite the formulation with the change of variables turns it into something similar to an enhanced Russell measure, so the ratio variables have been incorporated without adding to the complexity of the formulation, as seen below:
\begin{align}
\min\ \Omega = {} & \frac{\sum_{i=1}^{m} \theta_i / m}{\sum_{i=1}^{s} \phi_i / s}, \tag{5.3a} \\
& \sum_{j=1}^{n} \lambda_j \, nx_{ij} \le \sum_{j=1}^{n} \omega_{ij} \, dx_{ij} \cdot x_{ik} && \forall i = 1, \ldots, q \tag{5.3b} \\
& \sum_{j=1}^{n} \lambda_j \, x_{ij} \le \theta_i x_{ik} && i = q+1, \ldots, m \tag{5.3c} \\
& \sum_{j=1}^{n} \lambda_j \, y_{ij} \ge \phi_i y_{ik} && \forall i = r+1, \ldots, s \tag{5.3d} \\
& \sum_{j=1}^{n} \lambda_j \, ny_{ij} \ge y_{ik} \sum_{j=1}^{n} \chi_{ij} \, dy_{ij} && \forall i = 1, \ldots, r \tag{5.3e} \\
& \sum_{j=1}^{n} \omega_{ij} = \theta_i && \forall i = 1, \ldots, q \tag{5.3f} \\
& \sum_{j=1}^{n} \chi_{ij} = \phi_i && \forall i = 1, \ldots, r \tag{5.3g} \\
& \sum_{j=1}^{n} \lambda_j = 1 \tag{5.3h} \\
& \lambda_j \ge 0. \tag{5.3i}
\end{align}
Despite having successfully included ratio variables without distorting the PPS, the above model does not meet all the desired properties we listed before. Model (5.3) is similar to the enhanced Russell model we discussed in Chapter 3, in terms of complexity and in that the objective function is not linear. Let us transform it into something linear using the fractional linear programming technique. We introduce new variables as:
\[
\beta = \left( \frac{\sum_{i=1}^{s} \phi_i}{s} \right)^{-1}, \qquad t_i^- = \beta \theta_i, \qquad t_i^+ = \beta \phi_i;
\]
\[
\mu_j = \beta \lambda_j, \qquad \omega_{ij} \leftarrow \beta \omega_{ij}, \qquad \chi_{ij} \leftarrow \beta \chi_{ij},
\]
where, with a slight abuse of notation, $\omega_{ij}$ and $\chi_{ij}$ are rescaled by $\beta$. Using the above, any optimal solution to (5.4) gives an optimal solution to (5.3). The LP formulation is given by:
\begin{align}
\min\ & \frac{\sum_{i=1}^{m} t_i^-}{m} \tag{5.4a} \\
\text{s.t.}\ & \frac{\sum_{i=1}^{s} t_i^+}{s} = 1 \tag{5.4b} \\
& \sum_{j=1}^{n} \mu_j \, nx_{ij} \le \sum_{j=1}^{n} \omega_{ij} \, dx_{ij} \cdot x_{ik} && \forall i = 1, \ldots, q \tag{5.4c} \\
& \sum_{j=1}^{n} \mu_j \, x_{ij} \le t_i^- x_{ik} && i = q+1, \ldots, m \tag{5.4d} \\
& \sum_{j=1}^{n} \mu_j \, y_{ij} \ge t_i^+ y_{ik} && \forall i = r+1, \ldots, s \tag{5.4e} \\
& \sum_{j=1}^{n} \mu_j \, ny_{ij} \ge y_{ik} \sum_{j=1}^{n} \chi_{ij} \, dy_{ij} && \forall i = 1, \ldots, r \tag{5.4f} \\
& \sum_{j=1}^{n} \omega_{ij} = t_i^- && \forall i = 1, \ldots, q \tag{5.4g} \\
& \sum_{j=1}^{n} \chi_{ij} = t_i^+ && \forall i = 1, \ldots, r \tag{5.4h} \\
& \sum_{j=1}^{n} \mu_j = \beta \tag{5.4i} \\
& \mu_j,\ t_i^{\pm},\ \beta \ge 0. \tag{5.4j}
\end{align}
The above has been transformed into a linear program, of course at the cost of extra variables and constraints. It still does not meet the translation invariance property, so we go on to build a different measure.
To try the next form of aggregation, we select $\frac{\Omega_I + \Omega_O^{-1}}{m+s}$ as the measure, with the note that if any zeros exist in the inputs or outputs, they are omitted and the denominator should be the number of strictly positive variables. This model is less biased than the other choice, $\frac{\Omega_I}{m} + \frac{\Omega_O^{-1}}{s}$, which averages input efficiency and output efficiency separately and might assign more importance to the side with fewer variables, because fewer inputs or outputs make the denominator smaller. For this measure, we choose a different change of variables, as noted below. In addition, we multiply $y_{ik}$ by $\sum_{j=1}^{n} \lambda_j$, which does not change anything since at the optimal point this sum equals one. The transformations are given by:
\[
\phi_i^{-1} = z_i, \qquad \frac{\lambda_j}{z_i} = \lambda_j \phi_i = \chi_{ij}, \qquad \theta_i \lambda_j = \omega_{ij},
\]
\[
\sum_{j=1}^{n} \lambda_j \, y_{ij} \ge y_{ik} \, \frac{\sum_{j=1}^{n} \lambda_j}{z_i} \qquad \forall i = r+1, \ldots, s.
\]
With these new changes, the LP looks like the following:
\begin{align}
\min\ \Omega = {} & \frac{\sum_{i=1}^{m} \theta_i + \sum_{i=1}^{s} z_i}{m + s} \tag{5.5a} \\
& \sum_{j=1}^{n} \lambda_j \, nx_{ij} \le \sum_{j=1}^{n} \omega_{ij} \, dx_{ij} \cdot x_{ik} && \forall i = 1, \ldots, q \tag{5.5b} \\
& \sum_{j=1}^{n} \lambda_j \, x_{ij} \le \theta_i x_{ik} && i = q+1, \ldots, m \tag{5.5c} \\
& \sum_{j=1}^{n} \lambda_j \, y_{ij} \ge \sum_{j=1}^{n} \chi_{ij} \, y_{ik} && \forall i = r+1, \ldots, s \tag{5.5d} \\
& \sum_{j=1}^{n} \lambda_j \, ny_{ij} \ge y_{ik} \sum_{j=1}^{n} \chi_{ij} \, dy_{ij} && \forall i = 1, \ldots, r \tag{5.5e} \\
& \sum_{j=1}^{n} \omega_{ij} = \theta_i && \forall i = 1, \ldots, q \tag{5.5f} \\
& \sum_{j=1}^{n} \chi_{ij} = z_i && \forall i = 1, \ldots, r \tag{5.5g} \\
& \sum_{j=1}^{n} \lambda_j = 1 \tag{5.5h} \\
& \lambda_j \ge 0. \tag{5.5i}
\end{align}
In (5.5) we have successfully linearized the model in the presence of ratio variables. It is interesting to note that the technique used here can easily be applied to the normal Russell graph measure we described in Chapter 3. There are various ways to approximate the Russell graph measure, e.g. MIP [Coop 99a], but to our knowledge no linearization has been proposed to date that solves the Russell graph measure through an LP. Although we have made (5.5) computationally practical, the model is not translation invariant, and we continue our search, this time through additive models.
Slack based measures
Now, we turn our attention to the additive constructs, which would be:
\begin{align}
\Omega_I(x, y) &= \max\Big\{ \sum w_i^- s_i^- \;\Big|\; (x_{1k} - s_1^-, \ldots, x_{mk} - s_m^-) \in L(y),\ s_i^- \ge 0 \Big\} \tag{5.6} \\
\Omega_O(x, y) &= \max\Big\{ \sum w_i^+ s_i^+ \;\Big|\; x \in L(y_{1k} + s_1^+, \ldots, y_{sk} + s_s^+),\ s_i^+ \ge 0 \Big\} \tag{5.7}
\end{align}
The goal is to eliminate the maximum possible slack. Some of the possible choices are:
\[
1)\ \frac{a + \alpha\,\Omega_I}{b - \beta\,\Omega_O}, \qquad 2)\ \alpha\,\Omega_I - \beta\, b + \Omega_O, \qquad 3)\ \alpha\,\Omega_I\,\Omega_O, \qquad \text{and}\quad 4)\ \alpha\,\Omega_I + \beta\,\Omega_O.
\]
For the measure, we leave the first two choices out due to the imbalance between input and output slacks, and the third because of its nonlinear nature. We go with the general form $\alpha\,\Omega_I + \beta\,\Omega_O$. The weights are included in the definitions of $\Omega_I$ and $\Omega_O$ and, as can be seen in (5.6), are defined for each input and output element individually. The scalars $\alpha$ and $\beta$ are applied to the summation of weighted slacks as a whole. As we reviewed in the previous section, there is a variety of choices for the weights and scalars, which gives us the freedom to search for a measure that possesses the desired properties listed at the beginning of this section. Among the options available, after close investigation, model (5.8) fits the purpose. It is a hybrid measure inspired by the normalized weighted model of Lovell and Pastor [Love 95b] and the range-adjusted model of Cooper et al. [Coop 99a]. Here $\sigma_{x_i}$ and $\sigma_{y_i}$ are the standard deviations of the inputs and outputs, respectively. The motivation behind these weights is to make the model units invariant. The weights are constant and do not change for each DMU, and there is a claim in the literature that constant weights have the advantage of potentially ranking DMUs [Coop 99a]. By choosing the standard deviations as weights, we also achieve translation invariance, as we prove later.
\begin{align}
S_k = \max_{\lambda,\, s_i^{\pm}}\ & \sum_{i=1}^{m} \frac{s_i^-}{\sigma_{x_i}} + \sum_{i=1}^{s} \frac{s_i^+}{\sigma_{y_i}} \tag{5.8a} \\
\text{s.t.}\ & \frac{\sum_{j=1}^{n} \lambda_j \, nx_{ij}}{\sum_{j=1}^{n} \lambda_j \, dx_{ij}} - x_{ik} + s_i^- = 0, \quad x_{ik} = nx_{ik}/dx_{ik} && \forall i = 1, \ldots, q \tag{5.8b} \\
& \sum_{j=1}^{n} \lambda_j \, x_{ij} - x_{ik} + s_i^- = 0 && i = q+1, \ldots, m \tag{5.8c} \\
& \sum_{j=1}^{n} \lambda_j \, y_{ij} - y_{ik} - s_i^+ = 0 && \forall i = r+1, \ldots, s \tag{5.8d} \\
& \frac{\sum_{j=1}^{n} \lambda_j \, ny_{ij}}{\sum_{j=1}^{n} \lambda_j \, dy_{ij}} - y_{ik} - s_i^+ = 0, \quad y_{ik} = ny_{ik}/dy_{ik} && \forall i = 1, \ldots, r \tag{5.8e} \\
& \sum_{j=1}^{n} \lambda_j = 1 \tag{5.8f} \\
& \lambda_j \ge 0. \tag{5.8g}
\end{align}
Theorem. Changing $x$ to $ax + b$ and/or $y$ to $cy + d$ does not change the solution or the objective value of (5.8). If we assume that $S_C^*$ is the optimal inefficiency for DMU $C$, with inputs $X_C = [x_{1C}, \ldots, x_{mC}]$ and outputs $Y_C = [y_{1C}, \ldots, y_{sC}]$, the claim is that changing any $x_i$ to $a x_i + b$ and/or any $y_i$ to $c y_i + d$ results in the same inefficiency score $S_C^*$.
Proof: With no loss of generality, assume the non-ratio input $x_m$ and output $y_s$ have been transformed to $x_{m,\mathrm{new}} = a x_m + b$ and $y_{s,\mathrm{new}} = c y_s + d$. We find DMU $C$'s inefficiency score in this new setting. We know that the standard deviation is scaled but not affected by a shift in the data, so $\sigma_{x_m,\mathrm{new}} = a\,\sigma_{x_m}$ and $\sigma_{y_s,\mathrm{new}} = c\,\sigma_{y_s}$ hold. Through the following equations, we can also see that $s_{mC,\mathrm{new}}^- = a\, s_{mC}^-$:
\[
1.\quad \sum_{j=1}^{n} \lambda_j \, x_{mj,\mathrm{new}} = x_{mC,\mathrm{new}} - s_{mC,\mathrm{new}}^-.
\]
Substituting $x_{mj,\mathrm{new}}$ and $x_{mC,\mathrm{new}}$ with $a x_{mj} + b$ and $a x_{mC} + b$, we have:
\[
2.\quad \sum_{j=1}^{n} \lambda_j (a x_{mj} + b) = a x_{mC} + b - s_{mC,\mathrm{new}}^-
\quad\Longrightarrow\quad
a \sum_{j=1}^{n} \lambda_j x_{mj} + b = a x_{mC} + b - s_{mC,\mathrm{new}}^-.
\]
We know
\[
3.\quad \sum_{j=1}^{n} \lambda_j x_{mj} = x_{mC} - s_{mC}^-.
\]
Substituting this into the above gives:
\[
4.\quad a \left( x_{mC} - s_{mC}^- \right) + b = a x_{mC} + b - s_{mC,\mathrm{new}}^-,
\]
which reduces to
\[
5.\quad a\, s_{mC}^- = s_{mC,\mathrm{new}}^-.
\]
\begin{align}
S_{C,\mathrm{new}} = \max\ & \sum_{i=1}^{m-1} \frac{s_{iC}^-}{\sigma_{x_i}} + \frac{s_{mC,\mathrm{new}}^-}{a\,\sigma_{x_m}} + \sum_{i=1}^{s-1} \frac{s_{iC}^+}{\sigma_{y_i}} + \frac{s_{sC,\mathrm{new}}^+}{c\,\sigma_{y_s}} \tag{5.9a} \\
\text{s.t.}\ & \frac{\sum_{j=1}^{n} \lambda_j \, nx_{ij}}{\sum_{j=1}^{n} \lambda_j \, dx_{ij}} - x_{iC} + s_i^- = 0, \quad x_{iC} = nx_{iC}/dx_{iC} && \forall i = 1, \ldots, q \tag{5.9b} \\
& \sum_{j=1}^{n} \lambda_j \, x_{ij} - x_{iC} + s_i^- = 0 && i = q+1, \ldots, m-1 \tag{5.9c} \\
& \sum_{j=1}^{n} \lambda_j \, x_{mj,\mathrm{new}} - x_{mC,\mathrm{new}} + s_{mC,\mathrm{new}}^- = 0 \tag{5.9d} \\
& \sum_{j=1}^{n} \lambda_j \, y_{ij} - y_{iC} - s_i^+ = 0 && \forall i = r+1, \ldots, s-1 \tag{5.9e} \\
& \sum_{j=1}^{n} \lambda_j \, y_{sj,\mathrm{new}} - y_{sC,\mathrm{new}} - s_{sC,\mathrm{new}}^+ = 0 \tag{5.9f} \\
& \frac{\sum_{j=1}^{n} \lambda_j \, ny_{ij}}{\sum_{j=1}^{n} \lambda_j \, dy_{ij}} - y_{iC} - s_i^+ = 0, \quad y_{iC} = ny_{iC}/dy_{iC} && \forall i = 1, \ldots, r \tag{5.9g} \\
& \sum_{j=1}^{n} \lambda_j = 1 \tag{5.9h} \\
& \lambda_j \ge 0. \tag{5.9i}
\end{align}
In the same fashion, we can deduce that $s_{sC,\mathrm{new}}^+ = c\, s_{sC}^+$, and it is clear that (5.9) becomes the same as (5.8) for DMU $C$ (with $C$ in place of DMU $k$), so the optimal objectives have to be the same as well. This is a useful property and allows us to relax the strictly positive condition on inputs and outputs, which we mentioned earlier. For the ratio variables, it is easy to see that scaling has no effect whether it is applied to the ratio as a whole or to its components, because any scaling of the components can be represented by a scalar applied to the whole ratio, which is reflected in the standard deviation as well and, as above, results in the same optimum. When it comes to a shift in the origin for a ratio variable, the shift is strictly bound to the whole ratio rather than its components. The reason is that we seek the maximum slack of the ratio variable as a whole; the slacks for the denominator and the numerator are not considered separately. The components of the ratio variable are used merely to build the right PPS, and the shift from the origin should be applied to the whole PPS, in the sense that the original PPS can be obtained by a simple reverse transformation. A shift in the components of the ratio is not allowed because it would result in a completely different PPS with no clear formula to get us back to the original PPS.
Now, to turn the concept we have developed into a computational reality, we have to transform the above into an LP. We linearize it by introducing new variables as follows:
\[
\omega_{ij} = ny_{ij} - dy_{ij} \cdot y_{ik}, \qquad \lambda_j \, s_i^+ = \Delta_{ij},
\]
\[
\sigma_{ij} = nx_{ij} - dx_{ij} \cdot x_{ik}, \qquad \lambda_j \, s_i^- = \Phi_{ij}.
\]
The additive model then becomes:
\begin{align}
\max\ & \sum_{i=1}^{q} \frac{\sum_{j=1}^{n} \Phi_{ij}}{\sigma_{x_i}} + \sum_{i=q+1}^{m} \frac{s_i^-}{\sigma_{x_i}} + \sum_{i=1}^{r} \frac{\sum_{j=1}^{n} \Delta_{ij}}{\sigma_{y_i}} + \sum_{i=r+1}^{s} \frac{s_i^+}{\sigma_{y_i}} \tag{5.10a} \\
\text{s.t.}\ & \sum_{j=1}^{n} \lambda_j \, x_{ij} - x_{ik} + s_i^- = 0 && \forall i = q+1, \ldots, m \tag{5.10b} \\
& \sum_{j=1}^{n} \lambda_j \, \sigma_{ij} + \sum_{j=1}^{n} \Phi_{ij} \, dx_{ij} = 0 && \forall i = 1, \ldots, q \tag{5.10c} \\
& \sum_{j=1}^{n} \lambda_j \, y_{ij} - y_{ik} - s_i^+ = 0 && \forall i = r+1, \ldots, s \tag{5.10d} \\
& \sum_{j=1}^{n} \lambda_j \, \omega_{ij} - \sum_{j=1}^{n} \Delta_{ij} \, dy_{ij} = 0 && \forall i = 1, \ldots, r \tag{5.10e} \\
& \sum_{j=1}^{n} \lambda_j = 1 \tag{5.10f} \\
& \lambda_j \ge 0. \tag{5.10g}
\end{align}
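Model (5.10) is an ordinary LP and can be handed to any solver. The sketch below (our own toy data: one non-ratio input, one ratio output, two DMUs; variable names follow the text) assembles the $\Phi$/$\Delta$ linearization and solves it with scipy:

```python
# Toy instance of the linearized additive model (5.10):
# one non-ratio input x, one ratio output y = ny/dy, n = 2 DMUs; evaluate k = 1.
import numpy as np
from scipy.optimize import linprog

x  = np.array([2.0, 4.0])    # non-ratio input
ny = np.array([8.0, 6.0])    # ratio-output numerator
dy = np.array([2.0, 3.0])    # ratio-output denominator
y  = ny / dy                 # observed ratio output: [4, 2]
k  = 1                       # DMU under evaluation
sx, sy = np.std(x), np.std(y)
omega = ny - dy * y[k]       # omega_j = ny_j - dy_j * y_k

# variables: [lambda_1, lambda_2, s_minus, Delta_1, Delta_2]; maximize -> negate
c = [0.0, 0.0, -1.0 / sx, -1.0 / sy, -1.0 / sy]
A_eq = [
    [x[0], x[1], 1.0, 0.0, 0.0],               # (5.10b): sum(lambda*x) + s- = x_k
    [omega[0], omega[1], 0.0, -dy[0], -dy[1]], # (5.10e): sum(lambda*omega) = sum(Delta*dy)
    [1.0, 1.0, 0.0, 0.0, 0.0],                 # (5.10f): convexity
]
b_eq = [x[k], 0.0, 1.0]
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * 5)
print(round(-res.fun, 6))  # inefficiency of the second DMU; here 4.0 (s- = 2, s+ = 2)
```

The optimum picks the first DMU as benchmark: the input slack is 2 and the implied ratio-output slack is 2, each divided by a unit standard deviation in this toy data.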
In terms of our desired properties, this model readily satisfies properties 3, 4, 5, and 6 mentioned in Section 5.2.1. Properties 3 and 4 are proven above, and we prove 5 and 6 next. We still need to investigate property 7 and see whether we can amend the model to satisfy properties 1 and 2.
Theorem. Changing any $x_{ik}$ to $x'_{ik} < x_{ik}$ and/or any $y_{ik}$ to $y'_{ik} > y_{ik}$ decreases the objective value of (5.8), thus making the unit less inefficient (recall that, at this stage, our model measures inefficiency). Proof: Assume everything is kept the same except for $x_{ik}$, which is replaced by $x'_{ik} < x_{ik}$. There are two situations: $x'_{ik} \le x_{ik} - s_i^{-*}$ or $x_{ik} - s_i^{-*} < x'_{ik} < x_{ik}$. The first case reduces to $x'_{ik} = x_{ik} - s_i^{-*}$, because any further reduction would push the point out of the PPS; it is obvious that the corresponding slack becomes zero and the efficiency score improves. For the second scenario, assume the optimal objective for the improved point is $\Omega'^*$. Taking $s_i^- = s_i'^{-*} + x_{ik} - x'_{ik}$ and keeping the rest of the variables the same leads to a feasible solution of (5.8) with a larger objective value, so the optimal objective value is certainly larger than $\Omega'^*$.
Theorem. The objective value in (5.8) is invariant to alternative optima. Proof: The way the objective is constructed guarantees this property: an alternative solution arises from a different set of intensity variables $\lambda_j$, which are not included in the objective; moreover, the aggregate of slacks, which is what matters, is maximized and stays the same even if alternative solutions exist.
We can see that properties 1 and 2 are not satisfied by the model in its current form. In the literature, whenever the goal has been to eliminate all slacks, a number of techniques have been employed to bring the score into some meaningful range. Among them, we can refer to MIP, the extended additive model, the constant weighted additive model, the normalized weighted additive model, GEM, the enhanced Russell measure (which is like SBM), and RAM. In some cases, bringing the score into the zero-to-one range is done ex post facto, in other words, after the optimal slacks $s_i^{-*}$ and $s_i^{+*}$ are calculated. Here are some examples:
\begin{align*}
\Omega &= \sum_{i=1}^{m} \frac{s_{ik}^-}{x_{ik}} + \sum_{i=1}^{s} \frac{s_{ik}^+}{y_{ik}}, \\
\Omega &= 1 - \frac{1}{m+s} \left( \sum_{i=1}^{m} \frac{s_i^{-*}}{x_{ik}} + \sum_{i=1}^{s} \frac{s_i^{+*}}{y_{ik} + s_i^{+*}} \right), \\
\Omega &= 1 - \frac{1}{m+s} \left( \sum_{i=1}^{m} \frac{s_i^{-*}}{x_{ik} - \underline{x}_i} + \sum_{i=1}^{s} \frac{s_i^{+*}}{\overline{y}_i - y_{ik}} \right), \\
\Omega &= -\sum_{i=1}^{m} \frac{s_i^-}{\sigma_i^-} + \sum_{i=1}^{s} \frac{s_i^+}{\sigma_i^+}, \\
\Omega &= \left[ 1 + \frac{1}{m} \sum_{i=1}^{m} \frac{s_i^-}{x_{ik}} + \frac{1}{s} \sum_{i=1}^{s} \frac{s_i^+}{y_{ik}} \right]^{-1}, \\
\Omega &= \frac{1 - \frac{1}{m} \sum_{i=1}^{m} \frac{s_i^-}{x_{ik}}}{1 + \frac{1}{s} \sum_{i=1}^{s} \frac{s_i^+}{y_{ik}}}, \\
\Omega &= 1 - \frac{1}{m+s} \left( \sum_{i=1}^{m} \frac{s_i^{-*}}{\overline{x}_i - \underline{x}_i} + \sum_{i=1}^{s} \frac{s_i^{+*}}{\overline{y}_i - \underline{y}_i} \right), \text{ and} \\
\Omega &= \frac{1 - \frac{1}{m} \sum_{i=1}^{m} \frac{w_i \, s_i^-}{x_{i0} - \min_j x_{ij}}}{1 + \frac{1}{s} \sum_{r=1}^{s} \frac{v_r \, s_r^+}{\max_j y_{rj} - y_{r0}}}.
\end{align*}
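For concreteness, the sketch below evaluates two of the listed transformations, the SBM-style mapping $1 - \frac{1}{m+s}\big(\sum s_i^{-*}/x_{ik} + \sum s_i^{+*}/(y_{ik}+s_i^{+*})\big)$ and the reciprocal form $\big[1 + \frac{1}{m}\sum s_i^-/x_{ik} + \frac{1}{s}\sum s_i^+/y_{ik}\big]^{-1}$, on assumed optimal slacks; the numbers are illustrative only:

```python
# Two of the ex post facto score transformations above, applied to
# illustrative optimal slacks (toy numbers, not taken from the thesis).
import numpy as np

x  = np.array([4.0, 8.0])    # inputs of DMU k
y  = np.array([2.0])         # outputs of DMU k
sm = np.array([1.0, 2.0])    # optimal input slacks s-*
sp = np.array([2.0])         # optimal output slacks s+*
m, s = len(x), len(y)

# Omega = 1 - 1/(m+s) * (sum(s-*/x) + sum(s+*/(y + s+*)))
omega_a = 1 - (np.sum(sm / x) + np.sum(sp / (y + sp))) / (m + s)

# Omega = [1 + (1/m) sum(s-/x) + (1/s) sum(s+/y)]^(-1)
omega_b = 1 / (1 + np.mean(sm / x) + np.mean(sp / y))

print(round(omega_a, 4), round(omega_b, 4))  # 0.6667 0.4444
```

Both map zero slack to a score of one and grow smaller as slacks grow, which is the behaviour the text asks of a bounded efficiency score.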
5.2.3 Making sense of the inefficiency score
Although we could have used some of the above ex post facto transformations, they did not bear all the characteristics we wanted, so we decided to do something novel. To make the score meaningful and bounded, we propose the following ex post facto treatment. First, let us define a dummy DMU, DMU $D$, with the following input and output characteristics: $x_{iD} = \max_j x_{ij}$ for every $i = 1, \ldots, m$, and $y_{iD} = \min_j y_{ij}$ for every $i = 1, \ldots, s$, where $j = 1, \ldots, n$. Let us also assume that model (5.8) can be solved, as we show later, generating the objective $S_D^*$. We then normalize the inefficiency score for each DMU $k$ as $1 - \frac{S_k^*}{S_D^*}$. Since DMU $D$ is the worst DMU, the normalized score for each DMU is bounded by zero and one and reflects relative efficiency.
Another way to normalize is to choose the worst DMU, the one with the highest slack, from the existing $n$ DMUs; this makes the DMU with the largest slack attain a score of zero, and overall scores are lower in general, because the inefficiency of the worst DMU in the set is still smaller than, if not equal to, that of the dummy DMU. What we gain, however, is a slight increase in discrimination power. The reader should bear in mind that the scores are relative, hold for the data set under study, and should not be taken out of context. In both situations, any slight change in the data affects the scores; DEA is, in general, data oriented, and this is not specific to our case.
The case of heuristically adding DMUs to the PPS has been practiced for other purposes. Thanassoulis et al. [Than 12] proposed adding unobserved DMUs with a similar mix to the anchor DMUs to obtain better envelopment. The added DMUs reflect a combination of technical information and the decision-maker's value judgment. They have a complex procedure for doing this, and the result is an extended frontier that envelops the data better.
Our model accounts for individual variations in each input and output rather than attempting a uniform shrinkage or expansion. Units will therefore appear less efficient compared to radial models. Our model, however, does not satisfy the second part of property 2, scoring zero very rarely (except when we normalize scores using the worst performer). Another possible criticism is that dividing the inefficiency by a notably large number (the worst performer has the highest slack, and in the case of the dummy DMU the slacks can be very large, particularly if the variables are far apart) weakens the discrimination power of our measure and crowds the ex post facto inefficiencies into a narrow interval. This problem exists for most of the non-oriented models in the literature and is not specific to ours, but there is room for improvement here. To preserve the fairness of DEA, the principle that each DMU should be seen in its best light, and to increase the discrimination power, we suggest clustering the DMUs and designating a dummy weak point for each cluster. For example, when dealing with a large database of bank branches, clustering the data into three groups (small, medium, and large) and creating a dummy DMU for each group would result in better discrimination power.
Chapter 6
Methodology: Approximating the Frontier in the BCC Model
We demonstrated in Chapter 1 that mixing ratio variables with normal variables in conventional DEA distorts the frontier. By conventional DEA we mean known DEA models without any provisions for using ratio variables. We proposed a non-oriented model to deal with ratios in Chapter 5; however, in the specific case of the BCC model, where the ratio variables are on the side of the orientation, the problem has remained unsolved. In other words, we could not linearize the BCC model when we aim to reduce the inputs and only some of them are in ratio form, or when we focus on raising the output levels and only some, not all, of the outputs are in ratio form. In Chapter 4, we reviewed the limited literature on non-analytical techniques for attaining the production frontier in the context of DEA. As mentioned there, Emrouznejad et al. [Emro 09] clearly showed that the BCC model cannot be linearized when ratios exist on the orientation side. Let us remind ourselves why the output-oriented BCC model with output ratios cannot be linearized mathematically. The original BCC model, with consideration of the proper convexity, looks like:
\begin{align}
\max_{\lambda,\, \eta}\ & \eta \tag{6.1a} \\
\text{s.t.}\ & \sum_{j=1}^{n} \lambda_j \, x_{ij} \le x_{ik} && i = 1, \ldots, m \tag{6.1b} \\
& \sum_{j=1}^{n} \lambda_j \, ny_{ij} \ge y_{ik} \sum_{j=1}^{n} \eta \, \lambda_j \, dy_{ij}, \quad y_{ik} = ny_{ik}/dy_{ik} && i = 1, \ldots, r \tag{6.1c} \\
& \sum_{j=1}^{n} \lambda_j \, y_{ij} \ge \eta \, y_{ik} && i = r+1, \ldots, s \tag{6.1d} \\
& \sum_{j=1}^{n} \lambda_j = 1 \tag{6.1e} \\
& \lambda_j \ge 0. \tag{6.1f}
\end{align}
Even by substituting $\eta \cdot \lambda_j$ with $\gamma_j$ and adding the auxiliary constraints, there is still no guarantee that the solution of the LP (6.2) satisfies $\eta \cdot \lambda_j = \gamma_j$, as seen here:
\begin{align}
\max_{\lambda,\, \eta,\, \gamma}\ & \eta \tag{6.2a} \\
\text{s.t.}\ & \sum_{j=1}^{n} \lambda_j \, x_{ij} \le x_{ik} && i = 1, \ldots, m \tag{6.2b} \\
& \sum_{j=1}^{n} \lambda_j \, ny_{ij} \ge y_{ik} \sum_{j=1}^{n} \gamma_j \, dy_{ij}, \quad y_{ik} = ny_{ik}/dy_{ik} && i = 1, \ldots, r \tag{6.2c} \\
& \sum_{j=1}^{n} \lambda_j \, y_{ij} \ge \eta \, y_{ik} && i = r+1, \ldots, s \tag{6.2d} \\
& \sum_{j=1}^{n} \lambda_j = 1 \tag{6.2e} \\
& \sum_{j=1}^{n} \gamma_j = \eta \tag{6.2f} \\
& \lambda_j \ge \gamma_j \ge 0. \tag{6.2g}
\end{align}
In fact, we aim to find a solution for such cases with approximation methods and obtain close-to-optimal solutions. The method we have developed, discussed in this chapter, is a heuristic one, since the problem does not have a clear-cut solution. While going through different techniques, as discussed in Chapter 3, we were inspired by the Monte Carlo method to generate samples of the nonlinear frontier. The method is not exactly a Monte Carlo method, as a point on the frontier does not have a simple mathematical formulation. The rest of this chapter is organized as follows: we first look at approximation methods for generating parts of the PPS in theory, we then study the challenges in practice, and at the end we present our heuristic.
6.1 Partial Improvement: Approximation methods
The production possibility set consists of the observed DMUs plus any convex combination of them. We also know that the frontier consists of the DMUs on the desired edges (facets) of the PPS. Our goal is to create samples of the PPS and retain the best performers at each iteration. We hope that these local best performers will collectively lead us to the true frontier (or at least parts of it). In the Monte Carlo method, three basic things should exist: a) a mathematical formulation, b) a reasonable variation of each input, and c) an idea of an acceptable output. In our case, we are seeking the best performers, and we know what is acceptable as a best performer: it should use the minimum input to produce the maximum output. We also know how the players vary (convex combinations of DMUs); however, we do not have one formula that transforms the DMUs into a best performer. Rather, we have an LP that pinpoints the best performer for a specific DMU. This means that our best performer is not represented by a number or, better said, does not have a distribution/average and a confidence interval, which are usually the output of the Monte Carlo approach. This is why we call our method a pseudo Monte Carlo method. Theoretically, if we repeat this procedure a sufficient number of times for a specific DMU, the best performer (known as the target) will become stable, and we can conclude that it is a point on the true frontier.
6.1.1 How to generate the PPS progressively
As mentioned above, the main assumption behind the PPS is that any combination of DMUs is feasible unless stated otherwise. The exceptions are the cases with weight restrictions or multiplier restrictions in the envelopment form, as well as the nature of production, which may require convexity. Convexity limits the summation of the DMU weights to one. Each weight combination applied to a set of DMUs leads to a point in the PPS. Now that we know what a reasonable variation of the weights is, what we struggle with is that there are endless possibilities for these weights, within the limit and in conjunction with the others. If we had access to every possible weight set, we could generate the entire PPS, but the number of weight sets is infinite. The weight matrix is generated either by random weights or by scanning the space of weights. Each row (vector) provides us with a unique set of weights, which can generate a hypothetical DMU, or can be treated as one input in the Monte Carlo method.
We assume that the convexity axiom holds, so each weight should be less than or equal to one, while the summation of all weights should equal one. Even with this assumption, our job is not easy, as there are infinitely many real numbers between zero and one. One idea is to limit the choices, for example defining a 0.1 resolution so that the choices are 0.1, 0.2, ..., 1. Another way to deal with this problem is to generate random weights that are less than one and satisfy the convexity axiom. Theoretically, when we have a finite set of numbers for each weight, we should be able to produce all combinations under the convexity constraint with "for loops".
6.1.2 Challenges
The idea of generating the entire PPS works in theory, but in practice it is hardly doable, due to the challenges discussed here.
Figure 6.1: Average number of non-zero weights of size p vs. resolution
Sparsity
If we limit the choices between zero and one for each weight with a specific resolution, while imposing convexity on top of that, we end up with a matrix with many zero elements. This sparse matrix creates computational challenges for software programs. This is mainly because of the limited number of options to choose from (such as 5 nonzero choices when the resolution is 1/5) and the fact that, on average, the number of choices is cut in half at each selection, since the weights have to add up to one. So the speed of losing choices is $\frac{1}{2^n}$. As we will prove, the resolution should go to zero to avoid ending up with a sparse matrix, as shown in Figure 6.1.
Lemma: At each selection, the interval breaks in half on average. Proof: For each weight, we can select any of the choices within the interval, from minimum to maximum, with the same probability. As a result, on average, at each iteration, we cut the interval into two parts. Assuming that the interval between zero and one is divided by $p$, i.e. the resolution is $\frac{1}{p}$, the probability of each selection is $1/(p+1)$ and the length of the interval for the next selection will be $0, \frac{1}{p}, \frac{2}{p}, \ldots, \frac{p}{p}$; so, on average, the next interval will be
\[
\frac{1}{p+1} \cdot \frac{p(p+1)}{2p} = \frac{1}{2},
\]
and similarly, for an interval of length $a$, the length of the remaining interval after one coefficient selection will on average be $\frac{a}{2}$.
Theorem: The expected number of nonzero weights for a weight vector of size $w = p$ is $\frac{p^2}{2p-1}$.
Proof: Note that with resolution $\frac{1}{p}$, the maximum number of nonzero weights in any weight set is $p$; when $w > p$, the remaining $w - p$ weights are necessarily zero. We look at $w = p$, which gives the maximum number of nonzero weights. Assuming that $1/p$ is the defined resolution, we can treat the $[0, 1]$ interval as $p$ identical balls. The number of ways to have exactly $k$ nonzero weights is the number of ways to divide $p$ identical balls among $k$ distinct bins such that each bin has at least one ball (each bin represents a weight). Choosing which $k$ of the $p$ weights are nonzero and then partitioning the balls among them can be done in $\binom{p}{k} \binom{p-1}{k-1}$ ways.
The total number of ways that $p$ balls can be assigned to $p$ weights (empty weights allowed) can be thought of as allocating $2p$ balls into $p$ bins as before, and then taking one ball out of each bin to create the empty bins, i.e. the zero weights: $\binom{2p-1}{p-1}$.
Now that we have the probability of having $k$ nonzero weights, we can find the expected value as
\[
M = \frac{\sum_{k=1}^{p} k \binom{p}{k} \binom{p-1}{k-1}}{\binom{2p-1}{p-1}}.
\]
We have $k \binom{p}{k} = p \binom{p-1}{k-1}$, so
\[
\sum_{k=1}^{p} k \binom{p}{k} \binom{p-1}{k-1} = p \sum_{k=1}^{p} \binom{p-1}{k-1}^2 = p \binom{2p-2}{p-1}.
\]
Noting that $\binom{2p-1}{p-1} = \frac{2p-1}{p} \binom{2p-2}{p-1}$, the expected value is
\[
M = \frac{p \binom{2p-2}{p-1}}{\binom{2p-1}{p-1}} = \frac{p^2}{2p-1}.
\]
In general, the weight vector size does not necessarily equal $p$; in fact, it is itself a variable, for which the optimum number is not yet known. When the sampling method is chosen, the weight vector length must equal the sample size. Moreover, the resolution $\frac{1}{p}$ should have $p$ large enough to accommodate the desired sample size: $p$ needs to be at least equal to the sample size. We calculated that for a weight vector of size $w$, the average number of nonzero elements is given by:
\[
\frac{\sum_{k=1}^{w} k \binom{w}{k} \binom{p-1}{k-1}}{\binom{p+w-1}{w-1}}.
\]
Figure 6.2: Sparse matrix: average number of non-zero weights vs. resolution
However, independent of the weight vector size, to have all elements nonzero on average, the resolution needs to approach zero, as stated in the following theorem and seen in Figure 6.2.
Theorem: For all the weights to be nonzero on average, the resolution should go to zero. Proof: Using the above lemma, if the resolution is $\frac{1}{p}$, which means we have $p$ choices to start with, then for the $k$-th weight to be nonzero we should have $\frac{p}{2^{k-1}} \ge 1$, $k \le p$. We can rewrite this inequality as $\frac{1}{p} \le \frac{1}{2^{k-1}} = 2^{1-k}$. Since for $k \gg 1$ we can say $k - 1 \approx k$, this gives $k \le \log_2 p$, which implies that for the number of nonzero weights to grow, the resolution $\frac{1}{p}$ should go to zero.
Exponential number of iterations
From the previous part, we concluded that the smaller the resolution the better, as the sparse matrix shrinks. However, this leads to another challenge: the number of iterations grows exponentially as the resolution becomes smaller and the number of DMUs, $n$, increases, as can be seen in Figure 6.3. This imposes a computational burden on software packages. Small resolutions and the use of for loops are not a recipe for success.
The number of iterations is directly related to the number of weight sets. Finding the number of weight sets that total one is like assigning $p$ identical weights to $k$ DMUs, given that zero can also be assigned. As highlighted before, this is like having $2p$ identical weights assigned to $k$ DMUs with at least one for each, and then taking one weight out of each unit. This is a classic problem, and it can be done in $\binom{2p-1}{k-1}$ ways. If we look only at all-nonzero weight sets, the number equals $\binom{p-1}{k-1}$. For instance, with only 4 DMUs and 0.1 resolution, 969 weight sets are generated, with 84 all-positive sets. By changing the resolution to 0.05, the weight sets grow to 9,139, with 969 all-positive sets. Add one more DMU to the set, and the weight sets reach 82,251, with 3,876 all-positive sets. This is why, as we will see later, we have moved from a for-loops strategy to random selection.
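The all-positive counts quoted above can be reproduced by enumerating the compositions of $p$ into $k$ strictly positive parts, which number $\binom{p-1}{k-1}$; a brute-force check of the three quoted values (84, 969, and 3,876):

```python
# Count weight sets with all k weights strictly positive at resolution 1/p:
# compositions of p into k positive parts, i.e. C(p-1, k-1).
from itertools import product
from math import comb

def positive_compositions(p, k):
    # enumerate the first k-1 positive parts; the last part is the remainder (>= 1)
    return sum(1 for t in product(range(1, p), repeat=k - 1) if sum(t) <= p - 1)

assert positive_compositions(10, 4) == comb(9, 3) == 84     # 4 DMUs, resolution 0.1
assert positive_compositions(20, 4) == comb(19, 3) == 969   # 4 DMUs, resolution 0.05
assert positive_compositions(20, 5) == comb(19, 4) == 3876  # 5 DMUs, resolution 0.05
print("all-positive counts verified")
```

The brute-force counts and the binomial formula agree, and the rapid growth of these numbers is what makes exhaustive enumeration impractical.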
6.1.3 Pseudo Monte Carlo method
We have explained the Monte Carlo method in Chapter 4. Recall that once a relationship
between inputs and outputs is established, random values for each input are drawn from
their respective distributions and for each input value, an output value is calculated.
Based on these values, the most probable value for the output is identified. In this
research, we have defined inputs and outputs in a significantly different manner than is
done in a typical Monte Carlo simulation. Our method takes the idea of generating inputs
Figure 6.3: Number of weight sets vs. resolution for 4 to 8 DMUs: the number of iterations grows exponentially with smaller resolution when the number of DMUs increases.
(in our case, DMUs) by continuing to use the convexity rule rather than a distribution; it then uses not an explicit mathematical formula but an optimization procedure to establish the output benchmark for the DMU under evaluation. Last but not least, we do not focus on the most probable benchmark but on the best possible one. This is why our method is not exactly a Monte Carlo simulation, only similar to it. Another thing that distinguishes our model from a conventional simulation is the learning process: at each step, if there has been an improvement in the benchmark, we check the constructs of the benchmark and add them to our inputs (DMUs).
Now, we explain how we have made this idea work. When we have a finite resolution, we have limited options for the weights, but by using nested "for loops", although time consuming, we can produce all possible hypothetical DMUs with the options available. The fixed-resolution method resulted in a number of complexities, as explained in Section 6.1.2. Our solution was, instead of covering all possibilities, to draw weights randomly while meeting the convexity axiom. Choosing weights randomly, without restricting ourselves to a fixed resolution, means that we have unlimited options for weights between zero and one, in contrast to at most $p$ options with resolution $\frac{1}{p}$. However, we cannot exhaust all possible combinations because their number is infinite. When the number of DMUs is large, the sparse matrix issue still poses a challenge. To fix the problem, we limit the number of DMUs by choosing only a sample of them. The trick is that this procedure is repeated many times, until we visit most DMUs and produce enough hypothetical ones to build the frontier.
We believe that a sample of existing DMUs and a randomly generated weight set
under convexity axiom is all we need to generate a point that belongs to the PPS. In this
study, we are more interested in finding the best performer, or, at least, better performers,
that envelope the traditional frontier. So it makes sense to choose our building blocks
from traditionally identified DMUs and, in the process, keep the ones that perform better
and discard the rest. Keeping underperformers will only use some of our storage capacity without generating any valuable outcome.

Chapter 6. Methodology: Approximating the Frontier in the BCC Model
The procedure can be summarized in the following steps:
1. Run DEA and find the DMUs on the conventional frontier (here the BCC frontier,
treating ratios as normal variables);
2. Select randomly p DMUs from the data set;
3. Generate hypothetical DMUs out of p parents;
4. Select the ones that happen to be above the DEA conventional frontier, and add
them to the frontier set;
5. Discard all the other hypothetical ones; and
6. Go to step 2 if the convergence criteria have not yet been met.
In the following, we elaborate on the keep or discard rule (heuristic) and, of course, the
convergence criteria.
6.1.4 Keep or discard, an LP feasibility problem
In other sampling methods, like Gibbs sampling and Markov chain Monte Carlo in general, the effort is made to get samples that truly represent the population. They usually have a burn-in period, throwing away approximately the first 5000 iterations to make sure that the chain has reached its stationary state and the samples are truly from the population. For us, the population we are seeking and want to sample from is the relatively better performers, and we should have a way to eliminate the samples that do not represent our interest.
Because of the nonlinear nature of our problem, we cannot know in advance if a
hypothetical DMU is a high performer compared to the rest. For each point generated,
we keep it if it happens to be outside the enveloped space. If the new point is enveloped by the current frontier (which means it satisfies the constraints of the conventional BCC model), we leave it out, because a feasible point has already been enveloped and does not provide us with any potential improvement. It is worth noting that the points
below the conventional frontier (inside the conventional PPS but not on the frontier)
could convey some information if they are identified as benchmarks. Because of the nonlinear nature of our problem with the ratios involved, it is possible that the targets for some DMUs occur below the traditional frontier. In this research, however, we focus on the potential improvement and do not worry about possible underestimation of some units. Just to clarify: if the actual target for a specific DMU is below the traditional frontier but we measured its performance against a point on the frontier, then presumably we assessed it as less efficient than it actually is. This means that we have put pressure on the unit and expected it to improve more than it could.
From the standpoint of computational complexity, we need to check if the point is
feasible for conventional DEA. Establishing whether an LP model has a feasible solution is essentially as hard as actually finding the optimal LP solution; in the simplex method, the former takes, on average, within a factor of two as many operations as the latter. As we are concerned only with feasibility, we can solve the problem with a normal LP solver, setting the objective function to a constant. Checking feasibility is even harder for a mixed integer program, because if no feasible solution exists, then it is
necessary to go through the entire branch-and-bound procedure (or whatever algorithm
we use) to prove this. There are no shortcuts in general, unless we know something useful
about our model’s structure. For example, if we are solving some form of a transportation
problem, then we may be able to ensure feasibility by checking that the sources add up
to at least as great a number as the sum of the destinations.
Here, the intention is that as soon as the point is proved to be feasible in a conventional
LP, it is ignored and the next candidate is tested. Alternatively, we can check whether the new candidate is any better than the rest (and hence unlikely to be feasible in the conventional DEA) by checking whether at least one input is less than, and one output is greater than or equal to, the existing ones. This way we may save solving a few unnecessary LPs.
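The complementary screen, checking whether an existing DMU already dominates the candidate, can be sketched as follows (our own Python illustration, not the thesis code; helper names are hypothetical). If some existing DMU uses no more of every input while producing no less of every output, the candidate is certainly enveloped and the LP need not be solved.

```python
def weakly_dominates(a, b):
    """True if DMU a uses no more of any input and produces no less of
    any output than DMU b. Each DMU is a pair (inputs, outputs)."""
    (a_in, a_out), (b_in, b_out) = a, b
    return (all(x <= y for x, y in zip(a_in, b_in)) and
            all(x >= y for x, y in zip(a_out, b_out)))

def skip_lp(candidate, existing_dmus):
    """Cheap screen before the LP feasibility check: if any existing DMU
    weakly dominates the candidate, it is already enveloped -> discard."""
    return any(weakly_dominates(e, candidate) for e in existing_dmus)
```

Note that this check is only a sufficient condition for envelopment: a candidate that passes the screen may still be enveloped by a convex combination of DMUs, which is what the LP ultimately decides.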
6.1.5 Convergence
It is important to know how many times the procedure described in steps 1–5 of Section 6.1.3 should be repeated. Simply put, we need to know when to stop and decide that
the best benchmark for the DMU has been found. If there were a way to generate all points in the PPS, then pinpointing the best benchmark would be easy. But this approach
would be computationally infeasible and, as a result, deciding on when to stop becomes a
function of variables affecting how many data points are visited and how that affects the
benchmark. How many points are visited depends on the constructs: the quality of the
weight matrix, the sample size and the number of times we sample. Sample size could
be fixed or could be a random variable itself. Having many different factors has made it
a real challenge to create and agree upon rules for convergence.
We try to create as many data points as possible from data at hand, and we also need
to check if the benchmark improves or not. For the former, we know a weight vector
applied to a set of DMUs will generate one hypothetical unit. To generate as many
hypothetical DMUs as possible, there are two strategies to consider. One is to generate
as many hypothetical DMUs as possible from all DMUs, all at once, in other words,
having a long weight vector for each iteration (resulting in a huge matrix of weights
which will lead to a sparsity problem). The other is to have samples of DMUs drawn
from the pool of DMUs we have and use a smaller weight vector (size of the vector equals
the sample size).
There is another question to be answered: what sample size should we choose? We
will discuss this in the next section. However, no matter what sample size we choose, we
need to sample enough to decrease the chance of a DMU being missed out. How many
times we sample and the size of different weight sets we employ will eventually define
the number of iterations. In the following, we will discuss the sample size effect on the
frontier.
Effects of sample size on the nonlinear frontier
Having n DMUs and a fixed sample size k, there is a mathematical answer for how many non-identical samples can be drawn. This is the famous “n choose k”, denoted by $\binom{n}{k}$. This is the minimum number of samples to draw to ensure that all DMUs are visited. A simple calculation shows that this solution is not practical, in that even for a small problem involving 50 DMUs, over two million samples of size five are required to exhaust the options. To control that, we define the completeness ratio as the number of samples we use divided by $\binom{n}{k}$. It should be noted that sample sizes of k and n − k require the same number of repetitions to reach a completeness ratio of 100%.
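For concreteness, the completeness ratio can be computed directly from the binomial coefficient (a small Python helper of our own; `math.comb` gives n choose k):

```python
from math import comb

def completeness_ratio(num_samples, n, k):
    """Fraction of all distinct size-k samples of n DMUs actually drawn."""
    return num_samples / comb(n, k)

# The "two million" figure quoted above: 50 DMUs, samples of size five.
assert comb(50, 5) == 2_118_760
# Symmetry behind the k vs n - k remark: both need the same sample count.
assert comb(50, 5) == comb(50, 45)
```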
It is obvious that a sample size of k implies that we have included sample sizes 1, …, k − 1 in some way, because weight sets can include zeros. Therefore, between sample sizes k and n − k, the larger sample size is expected to generate a more accurate frontier, given that the quality of the weight vectors is comparable. To quickly demonstrate that, for
example in Chapter 4 where we had 100 DMUs, we tested sample sizes of 10 and 90. The
procedure was repeated 100 times using 500 weight vectors. The larger sample size took
just 1% more time, but found DMUs with higher relative performance. Surprisingly, the number of DMUs above the linear frontier was 36% lower, but their efficiency scores were higher on average. Figure 6.4 shows that more DMUs’ efficiency scores dropped when we re-evaluated them after introducing the unobserved DMUs generated with sample size 90. It is worth mentioning that for the same number of hypothetical DMUs generated (a fixed number of samples from the PPS), the effect of sample size on the nonlinear frontier was not significant, as seen in Figure 6.5. However, we caution the reader not to generalize this observation, because the initial sample size affects the quality of posterior samples and how well mixed they are (how thoroughly they explore the sample space).
[Figure: re-evaluated efficiency scores vs. DMUs, two series: sample size 10 and sample size 90.]

Figure 6.4: For the same completeness ratio, the larger sample size wins
[Figure: efficiency score vs. DMUs, series for sample sizes 3, 5, 7, 10 and 20 DMUs.]

Figure 6.5: The sample size effect is small if the number of hypothetical DMUs generated stays the same
[Figure: efficiency score vs. DMUs, series: 100 rep/500 ws, 300 rep/200 ws, 500 rep/200 ws.]

Figure 6.6: Increasing the number of unobserved DMUs, sample size 5
Effects of the number of iterations and weight vectors on the nonlinear frontier
Intuitively, increasing the pool of unobserved DMUs should make the approximation better, since the ones above the conventional frontier are chosen from a bigger pool (more choices) and may eventually contribute to the formation of the nonlinear frontier. However, as depicted in Figure 6.6, observations for a sample size of 5 show that although going from 50,000 to 60,000 makes a difference and increases the number of unobserved, better-performing DMUs, the gain from 60,000 to 100,000 has little impact, mostly because of saturation. This saturation phenomenon is what we will look for to claim that convergence has happened. It is interesting to mention that not only the size of the pool of unobserved DMUs matters, but also how it has been constructed. There are two ways to change the pool of unobserved DMUs (in terms of
[Figure: efficiency score vs. DMUs, series: 1000 rep/100 ws, 500 rep/200 ws, 250 rep/400 ws, 200 rep/500 ws.]

Figure 6.7: The same number of unobserved DMUs but different constructs, sample size 5
quantity): the number of weight vectors (matrix rows) and the number of samplings (repetitions). The question is: does it matter which we change? The observations in Figure 6.7 show that it does: for the same pool size, depending on the shares of weight sets and repetitions, the approximation of the nonlinear frontier can improve.
Ad-hoc rules for convergence
We have already seen how different factors contribute to the quality of the estimated nonlinear frontier and, of course, to the time needed to reach saturation. The complex nature of the interactions between these factors makes it difficult, perhaps impossible, to provide the analyst with a precise and robust rule.
There is no definite answer for how many weight vectors are sufficient or how many
times the sampling should be repeated or what sample size is the best. However, an
optimum balance between the above factors (sample size, weight sets and repetition)
is a sensible starting point, which can be improved via trial and error. The reality is that there is no closed-form mathematical formula to generate the magic number for each parameter before convergence, or to shorten the time to converge; thus, we rely on heuristics and ad hoc approaches.
Each case should be studied separately when the approximation is used to find the ultimate benchmark. Having said that, based on our experience, there are certain steps we advise the reader to take for every problem:
1. Start by looking at the data: exclude all DMUs that are less than 50% efficient in a conventional DEA model, or the bottom 25th percentile (whichever is greater). This eliminates the low-quality DMUs from entering the selection process, as they are less likely to contribute to the nonlinear frontier.
2. Assume a sample size of P% of the remaining DMUs, choosing P using judgment, depending on the number of DMUs. If the number of DMUs is very large, then clustering them before sampling is recommended, as explained in [Naja 05]. The clustering criteria are mostly based on environmental factors, such as population concentration in the area and income per capita, in order to create a peer group for the DMU under evaluation. The intention behind this suggestion is mainly practicality: if the sample size becomes very large, then the sparsity problem will occur and computation will be hard due to the length of the weight vector. We know that in DEA each unit is benchmarked against units in its peer group, so when we have a huge number of DMUs, grouping them makes sense and reduces the size of the pool and, consequently, the sample size.
3. Choose x weight vectors. There is no magic number here, but we started with 500.
4. After one full run, the unobserved DMUs outside the envelopment surface of the conventional efficient frontier are kept and added to the set.
5. DMUs on the conventional frontier are re-evaluated and the efficiency scores are calculated. The procedure is considered final when, after a number of consecutive runs, the efficiency scores have not dropped further, which means the nonlinear frontier has not changed. This number could be two or more, depending on how well mixed the sampling is; well mixed means samples are drawn equally from all parts of the space. A rule of thumb: without sampling, two consecutive runs with no change can indicate saturation; with sampling, and a sample size equal to P% of the sample space, at least P iterations with no change are a sign of convergence.
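The stopping rule in step 5 can be expressed as a simple no-improvement counter. The sketch below is our own Python illustration; `run_once` is a hypothetical callback that performs one full sampling run and returns the mean re-evaluated efficiency score of the conventional frontier units.

```python
def run_until_converged(run_once, patience, tol=1e-6, max_runs=1000):
    """Repeat full runs until the mean score has not dropped for
    `patience` consecutive runs (the saturation signal), then report
    how many runs were used."""
    best = float("inf")
    runs_without_change = 0
    for runs in range(1, max_runs + 1):
        score = run_once()
        if score < best - tol:          # frontier moved up: scores dropped
            best = score
            runs_without_change = 0
        else:                           # no further drop this run
            runs_without_change += 1
            if runs_without_change >= patience:
                return runs
    return max_runs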
6.2 Remarks
The problem we had was that the BCC model with ratio variables on the side of orientation became nonlinear, and there was no technique we knew of that would linearize it. If conventional DEA is employed, then a linear frontier is generated, but we know this is not the true one with the ratios in place.
The approximation method proposed here aims to find a better frontier compared to
previous models. We want to emphasize that this is not necessarily the absolute frontier
but it is a close approximation. This is why we advise the user to reformulate the problem
and try to avoid ratio variables, if possible. However, if that is not an option, then this
method should be used.
The technique suggested here is intended to build parts of, if not all, the true frontier. We advise users not to treat this method as a stand-alone solution to make decisions on, but rather to let it help the analyst probe further for possible improvement, using it alongside other methods and managerial judgment to come to a final decision.
Chapter 7
Realization, Case Study and Results
In this chapter, we test the models presented in Chapters 5 and 6 to better understand how they work in practice, and to compare their outcomes with those of more conventional methods. To achieve this,
it is necessary to build the theoretically proposed model, using computer software as
there is no ready-made code for our model. We have coded the models using MATLAB
by MathWorks. This has proven to be a very time-consuming and difficult task, not
directly contributing to the core research but nevertheless essential, as we need to show
how the developed theory can be used. Our realization of the model, although not yet a commercial product, has proved to work well and to be sufficiently fit for the purpose of testing. The code, however, could be incorporated into already available
DEA software to extend its capabilities in dealing with ratios. We briefly explain how we
transform the closed form linear programming formulation into a matrix form to make it
work with MATLAB. We also apply the code to a case study of about 130 bank branches,
extracted from one of the main Canadian banks in a certain region.
7.1 Realization of the non-oriented model using MATLAB
To refresh our memory, let us look back at our theoretical additive non-oriented linearized model, which appeared in Chapter 5 and is units and translation invariant:
$$
\begin{aligned}
\max\quad & \sum_{j=1}^{n}\sum_{i=1}^{q}\frac{\Phi_{ij}}{\sigma_{x_i}} \;+\; \sum_{i=q+1}^{m}\frac{s^{-}_{i}}{\sigma_{x_i}} \;+\; \sum_{j=1}^{n}\sum_{i=1}^{r}\frac{\Delta_{ij}}{\sigma_{y_i}} \;+\; \sum_{i=r+1}^{s}\frac{s^{+}_{i}}{\sigma_{y_i}} && (7.1a)\\
\text{s.t.}\quad & \sum_{j=1}^{n}\lambda_j\, x_{ij} - x_{ik} + s^{-}_{i} = 0 \qquad \forall i = q+1,\dots,m && (7.1b)\\
& \sum_{j=1}^{n}\lambda_j\, \sigma_{ij} + \sum_{j=1}^{n}\Phi_{ij}\, dx_{ij} = 0 \qquad \forall i = 1,\dots,q && (7.1c)\\
& \sum_{j=1}^{n}\lambda_j\, y_{ij} - y_{ik} - s^{+}_{i} = 0 \qquad \forall i = r+1,\dots,s && (7.1d)\\
& \sum_{j=1}^{n}\lambda_j\, \omega_{ij} - \sum_{j=1}^{n}\Delta_{ij}\, dy_{ij} = 0 \qquad \forall i = 1,\dots,r && (7.1e)\\
& \sum_{j=1}^{n}\lambda_j = 1 && (7.1f)\\
& \lambda_j,\ \Delta_{ij},\ \Phi_{ij} \ge 0 && (7.1g)\\
& \omega_{ij} = ny_{ij} - dy_{ij}\cdot y_{ik} && (7.1h)\\
& \sigma_{ij} = nx_{ij} - dx_{ij}\cdot x_{ik} && (7.1i)
\end{aligned}
$$
MATLAB works with matrices. The mission has been to transform the above into the form [Aeq][variables] = [Beq] with the objective [f][variables]. We start with the variables: there are n weights λj (one for each DMU), r × n transformed slacks Δij for the ratio outputs, s − r output slacks (shortfalls), q × n transformed slacks Φij for the ratio inputs and, finally, m − q input slack (waste) variables. In total, the number of variables equals the number of units, plus the number of normal (non-ratio) inputs and outputs, plus the number of ratio variables times the number of units. It is evident that each ratio variable adds n variables to the LP. This is not a concern
because the number of variables is not critical for the solvers, whereas the number of
constraints is.
Now each line of the constraints above needs to be written in matrix form to fit the [Aeq][variables] = [Beq] format. The non-ratio input constraint (7.1b) is re-written as

$$\big[\; [X]_{(m-q)\times n} \;,\; [0]_{(m-q)\times(nr + (s-r) + nq)} \;,\; I_{(m-q)\times(m-q)} \;\big]\,[\text{variables}] = x_k .$$

The ratio input constraint (7.1c) is expressed as

$$\big[\; [\sigma]_{q\times n} \;,\; [0]_{q\times(nr + (s-r))} \;,\; [dx]_{q\times(nq)} \;,\; [0]_{q\times(m-q)} \;\big]\,[\text{variables}] = [0]_{q\times 1},$$

where [dx] is a block-diagonal matrix composed of the denominators of every ratio input for all DMUs, with diagonal blocks of size n:

$$[dx] = \begin{bmatrix}
dx_{1,1} & \dots & dx_{1,n} & 0 & \dots & 0 & & 0 & \dots & 0 \\
0 & \dots & 0 & dx_{2,1} & \dots & dx_{2,n} & & 0 & \dots & 0 \\
\vdots & & & & & & \ddots & & & \vdots \\
0 & \dots & 0 & 0 & \dots & 0 & & dx_{q,1} & \dots & dx_{q,n}
\end{bmatrix}.$$
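The block-diagonal structure of [dx] can be assembled mechanically. The sketch below is a plain-Python illustration of the layout (our MATLAB code builds the same matrix with matrix operations; the function name is ours):

```python
def build_dx_block(dx, q, n):
    """dx[i][j] is the denominator of ratio input i for DMU j.
    Returns the q x (q*n) block-diagonal matrix described above:
    row i carries dx[i][0..n-1] in columns i*n .. i*n + n - 1."""
    rows = []
    for i in range(q):
        row = [0.0] * (q * n)
        for j in range(n):
            row[i * n + j] = dx[i][j]
        rows.append(row)
    return rows

# Two ratio inputs (q = 2), two DMUs (n = 2):
print(build_dx_block([[1.0, 2.0], [3.0, 4.0]], q=2, n=2))
# [[1.0, 2.0, 0.0, 0.0], [0.0, 0.0, 3.0, 4.0]]
```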
We skip the explanation of the output constraints, as they are rearranged in matrix format in the same fashion as the inputs. The objective function is constructed as [f][variables]. The coefficients for the slacks of the non-ratio variables are easy to produce: just a vector of inverse standard deviations, one for each input and output. For the ratio variables, because of the double summation in the objective, each inverse standard deviation is expanded into a vector of size n, and all these vectors come together to build [f]. The artificial low-performer DMU is added to the set from the beginning, so bear in mind that n above includes the low performer.
Upon testing the code, we realized that because of some nearly zero variables, the code
did not run as expected and MATLAB could not optimize the LP. To overcome this, we
have slightly changed the formulation to enable us to scale the inputs and outputs easily
so that MATLAB can handle the numbers. It is worth noting that our formulation is
units invariant so scaling up will not affect our actual results; it just helps the MATLAB
program. For ease of formulation in MATLAB, we also learned that normalizing the
inputs and outputs by the use of standard deviation from the beginning worked better
than normalizing at the end, in the objective function. Although on paper this means
a simple change of variable Snew = Sold/std, which does not affect the final solution, in
practice, the former worked better with MATLAB.
7.2 Case Study
The case we have chosen is derived from real data from one of the major Canadian banks.
For our purpose, we have selected all urban branches (132 in total) in one province of
Canada. It should be emphasized that the goal of this chapter is not to evaluate bank
branches but rather to show the merits of the newly developed models.
7.2.1 Bank branch data: choice of model, inputs and outputs
For our case study, let us assume that we are given a task of evaluating 130+ urban
bank branches in a certain region, in terms of resource allocation and profitability. The
model evaluates the way a branch converts its expenses into revenues through its six
revenue-generating streams. The information could assist the regional manager to spot
the best practices and pinpoint the weaknesses of low performers. It could, for example, help to allocate resources better in order to gain more in a certain line of business, such as home mortgages. This could be achieved by efforts in attracting more mortgages, recruiting better advisors, upgrading the online system, or by better managing/investing the funds/commitments already in place, and perhaps other things in real practice.
The output metrics are the return rates on six major lines of business: everyday
banking, mortgages, commercial loans, commercial deposits, wealth management, and
consumer lending. Rate of return is a simple notion and readily makes sense to everyone.
Such information makes decision-making easier for management. Because the resources
that each branch has are limited, they might be better off focusing on a few lines of
business they are good at to make more profit for the branch. We understand that
branches do not have the possibility of dropping out from any line of business, but upon
seeing the results, top management can decide on how to shift or add resources to better
deal with home mortgages, if that side of the business shows promise. The resources that
each branch has are: a combination of professional personnel and office staff (human capital), which also provides size information; the equipment (IT hardware and software); and the fixed assets, as they are highly correlated with the status of the branch's location (economically affluent or deprived areas). Also, because of the importance of the loan loss experience in the literature, and in light of the 2008 financial crisis, we included loan losses on the input side. This is to control for the reward a branch might get for a temporarily high return rate on loans on the basis of taking on high-risk clients. The input metrics are simply the expenses related to the items mentioned. A summary of the variables is listed in Table 7.1.
In the following section, we will go over a few figures and compare the results of our
model to the traditional DEA additive model. We use the same inputs/outputs with the
traditional model, without any special treatment regarding the ratio variables. We also
include the artificial “low performer” DMU in the set. The traditional DEA additive
model will not generate a score: it just reports back the slacks. To make the comparison
fair, we transform the slacks into a score between zero and one for the traditional model
as well. For this purpose, we have normalized the slacks using the slacks of the low performer, which obviously has the highest slacks; this results in a score of “zero” for that unit.
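A hedged sketch of this transformation (our own illustration; the thesis may weight the slacks differently): a unit's total normalized slack is divided by the low performer's total slack and subtracted from one, so the artificial low performer scores zero and a unit with no slack scores one.

```python
def slacks_to_score(unit_slacks, low_performer_slacks):
    """Map an additive-model slack vector to an efficiency-like score
    in [0, 1], using the artificial low performer as the yardstick."""
    return 1.0 - sum(unit_slacks) / sum(low_performer_slacks)

print(slacks_to_score([0.0, 0.0], [4.0, 6.0]))  # fully efficient -> 1.0
print(slacks_to_score([4.0, 6.0], [4.0, 6.0]))  # low performer  -> 0.0
```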
We would like to emphasize one more time that our goal here is just to show the
models’ merits rather than to solve a banking performance problem. The case study is
for illustration purposes only, which is why we did not get into the selection of the right weights or the imposition of bounds for certain variables. We did not consult with the
Table 7.1: Input and output variables (rev. = revenue, res. = resources)

INPUTS (expenses):    OUTPUTS (rate of return):
Personnel Expense     Non-interest Earnings
Equipment Expense     Consumer Deposits rev. / Consumer Deposits res.
Fixed Assets          Consumer Lending rev. / Consumer Lending res.
Loan Losses           Wealth Management rev. / Wealth Management res.
Cross Charges         Home Owner Mortgages rev. / Home Owner Mortgages res.
Other Expense         Commercial Deposits rev. / Commercial Deposits res.
                      Commercial Loans rev. / Commercial Loans res.
management of any branch to set lower and upper bounds for a specific input or output.
The important thing is that we keep everything in both models identical so that we can
focus on their differences, based on the proposed formulation of ratios and linearization
techniques. It is worth adding that we also controlled for the units and translation
invariance effects as the traditional additive model is not units invariant. We normalized
each input and output for the traditional additive model to eliminate the effect of units
of measurement.
7.2.2 Comparing the proposed model against the traditional additive model
To be able to compare the results, we also solved the same problem with a traditional
DEA package, EMS. As described above, the values were normalized before being fed
into the program because the additive model in its original form is not units invariant
and an input or output might be favored just depending on the units of measurement.
The raw results from the EMS package were transformed to an efficiency score between
zero and one, using the highest slack, which belongs to the artificial low performer. This artificial low performer is a DMU which has the lowest level of every output and the highest level of every input. The results are shown in Figure 7.1, and they match our expectations.
[Figure: efficiency scores vs. DMUs' ID for the traditional additive model and the new additive model, showing the gap in potential for improvement.]

Figure 7.1: Comparing efficiency scores against the traditional model
The traditional model fails to capture the right frontier and in several cases, misses the
improvement opportunity. The frontier defined by our model sits on top of the traditional
frontier and sets the bar higher than the traditional one. The artificial low performer, of
course, gets a zero score in both models.
In total, the difference between scores is about 0.93 in this example and the range is
from 0 to 33%. This means that for some DMUs the efficiency score calculated by our
Table 7.2: Missed potential on savings at the input side

Expenses                 Average reduction (million $)
Fixed Assets/Accruals    -0.50
Loan Loss Experience     -1.65
Employee Expense         -2.70
Equipment Expenses       -0.81
Other Losses             -1.34
Cross Charges            -1.80
Table 7.3: Missed opportunity for higher return on revenue

Business profitability   Average return rate jump
Everyday Banking         3.72%
Wealth Management        16.65%
Home Mortgages           6.19%
Consumer Lending         1.52%
Commercial Deposits      0.04%
Commercial Loans         3.48%
model is 33% lower than the efficiency score reported using the traditional DEA model.
This means there is scope for those units to do better that has been overlooked. The units could save 33% in expenses, generate 33% more, or a combination of both. The differences for all the units add up to 93%. Because the model collectively looks at input reduction and output augmentation, how the efforts are divided between the two is not readily readable from the final score. To see how this difference translates into better use of resources and improved performance in creating value from them (higher rates of return), a closer look at the inputs and outputs of the suggested targets is required to reveal the net difference between the two models.
We calculated the difference in every input and output of every target (projection) and summed them up for an overall view. In some cases, our model has suggested the use
of more of a certain input compared to what EMS suggested, and this is because our
model focuses on redistribution of resources, which might mean using more in a certain
line of business that the branch is good at and less in another. However, considering all
the inputs, overall, the targets found by our new model resulted in savings of an extra 153.4 million dollars. The details of the suggested further reductions in every input are in Table 7.2. On the output side, on average, the rates of return can improve by 5%; the details are listed in Table 7.3. A closer look at the seven highly referenced DMUs in
the new model shows that all of them are fully efficient in the traditional model and they
also include all four popular DMUs in the construction of benchmarks, in the traditional
model.
The PPS remains the same if the convexity and variable returns to scale assumptions
stay unchanged. This means that the choice of model (additive or BCC) does not alter
the PPS. This is indeed a fact that could assist us in solving the nonlinear case presented
in Chapter 6.
7.3 Case study, nonlinear BCC Model: approximation method
In the case of the BCC model, where the ratio variables existed on the side of orientation,
the case could not be linearized and we were left with no option except a heuristic one
to approximate the frontier. The algorithm randomly generated unobserved parts of the
PPS, using a linear combination of existing DMUs and recording them if they turned out
to be outside of the traditional PPS.
We tested this algorithm on our case study of 130 urban branches. We tried out the
algorithm with a sample size of 50 and 300 sampling repetitions (sample from the observed
DMUs). For each sample, we tried out 100 different random weight combinations. In 11
runs of the algorithm, we found 124 DMUs above the traditional frontier, and overall,
this took 2.5 minutes. We also tried the algorithm with a sample size of 20 and 200 random weight sets with the same sampling rate. Repeating the procedure four times added 9 more unobserved DMUs and took 52 seconds on the Juno server. In the end, we
decided that 133 unobserved DMUs is acceptable and stopped there. There is no golden
rule on when to stop; our decision was based on the quality of the unobserved DMUs
(outside the conventional envelopment form) and their spread (covering most parts of
the frontier, rather than clustering at one section). We need to stress that, depending on the results, one can run the algorithm only a few times, but there is no formula for the number of runs needed for an arbitrary number of DMUs. In our work, for example, in one run the program found 84 unobserved DMUs, whereas in another, with everything staying the same as in the first run, we got nothing new. This is random sampling, and it is not possible to predict the outcome of each trial. Having said that, on average, the number of required trials is not large. We added these unobserved DMUs to the
Figure 7.2: Drop/rise in the efficiency score of branches after adding unobserved DMUs
generated by the approximation method
original set and recalculated the efficiency scores using an output-oriented BCC model.
The average reduction in the score is 1.7%, ranging from zero to 11.5%. Therefore,
our approximation model created a benchmark which is above the traditional one and
pushes the branches to aim for higher performance. This is not the whole story, as there
could exist DMUs that are labeled efficient in the traditional model yet, in reality,
operate at 88% efficiency, as we can see in Figure 7.2. As seen in Figure 7.3, the
Figure 7.3: Efficiency score of the unobserved DMUs generated by the approximation
method
unobserved DMUs mostly achieve an efficiency score of one, which is expected. However,
a few have a score below one. Recall that these units are reported as inefficient in
comparison with the other unobserved DMUs we added later. If we add each unobserved
DMU individually to the original data, that unit could achieve full efficiency, but this is
not necessarily the case when other unobserved DMUs are added to the original data as
well. Although every unobserved DMU is slightly outside the traditional envelopment,
it may need to work harder to reach the approximated frontier, which is closer to the true
frontier. We know that the PPS for the BCC model and the additive model is the same;
the difference in results lies in how we measure the distance to the frontier. The targets
identified by our linearized additive model can also be added to the original data set for
approximating the nonlinear BCC frontier. In this case, there was no significant benefit
when we simply used the efficient units from our non-oriented model to judge the quality of
the random unobserved DMUs. Another option is to include the efficient units from the
linearized additive model in the approximation algorithm to generate unobserved DMUs.
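The recalculation step above relies on the standard output-oriented BCC envelopment LP: maximize the output expansion factor phi subject to a convex combination of the DMUs using no more input and at least phi times the output of the unit under evaluation. The sketch below is the textbook formulation written in Python with SciPy, under our own naming conventions (`bcc_output_efficiency`, columns as DMUs); it is not the MATLAB implementation used in the thesis.

```python
import numpy as np
from scipy.optimize import linprog

def bcc_output_efficiency(X, Y, j0):
    """Output-oriented BCC (VRS) expansion factor phi* for DMU j0.

    X: (m, n) inputs, Y: (s, n) outputs; columns index the n DMUs.
    Returns phi* >= 1; the efficiency score is 1/phi*."""
    m, n = X.shape
    s = Y.shape[0]
    # Variables: [phi, lambda_1..lambda_n]; linprog minimizes, so use -phi.
    c = np.concatenate(([-1.0], np.zeros(n)))
    # Input constraints: sum_j lambda_j * x_ij <= x_i,j0
    A_in = np.hstack([np.zeros((m, 1)), X])
    b_in = X[:, j0]
    # Output constraints: phi * y_r,j0 - sum_j lambda_j * y_rj <= 0
    A_out = np.hstack([Y[:, [j0]], -Y])
    b_out = np.zeros(s)
    A_ub = np.vstack([A_in, A_out])
    b_ub = np.concatenate([b_in, b_out])
    # VRS convexity constraint: sum_j lambda_j = 1
    A_eq = np.hstack([[[0.0]], np.ones((1, n))])
    b_eq = [1.0]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (n + 1), method="highs")
    assert res.success, res.message
    return res.x[0]
```

Solving this LP once per DMU against the augmented (original plus unobserved) data set yields the recalculated scores.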
Chapter 8
Recommendations and future work
This chapter offers a synopsis of the work in this thesis and the contributions made, in
addition to the conclusions drawn from the results of the case study in the previous
chapter. Opportunities for future work will also be described.
8.1 Contributions
We had a problem to address: using ratios as they are in the existing DEA models would
lead to distortions in the frontier and, consequently, misrepresent the opportunities
for improvement because of faulty targets. In this theoretical work, we have offered
solutions to this problem:
1. Developed two non-oriented models, similar to the Russell Graph Measure and the
Enhanced Russell Measure, modified to take in ratio variables;
2. Developed a new non-oriented model to deal with ratios, which satisfies almost all
of the desired properties we collected going through the models in the literature;
3. Reduced all the models to a linear case to make them work in practice; and
4. Proposed an approximation model to deal with the case of the BCC model with
ratios on the side of orientation, where the model could not be reduced to a linear
form.
Both developed models were tested on a small case study of 130 urban branches of a
major Canadian bank in one Canadian province, and the results demonstrated the superiority
of our models over other existing techniques.
8.2 Discussion of the results: proposed model
Upon closer inspection of the results from the previous chapter, we realized that in
some instances the MATLAB optimization package was unable to solve the LP to
the final optimal value (we can tell this from the value of the exit flag of the MATLAB
function). In those cases, MATLAB reported the best solution it could achieve but
did not guarantee that it was optimal; in some instances, for example, it was unable to
find the best weights. This is not a surprise, as optimization algorithms are not
perfect and cannot always avoid degeneracy and cycling, in which the algorithm does
not converge and becomes trapped in a loop [Gass 04]. This happened in only 7%
of the cases. The affected DMUs achieved reasonable scores (between 0.7 and 0.9), but the
program could not decide where on the frontier they should best target. This is not a
shortcoming of our model; it comes down to linear programming in general and the
optimization methods embedded in MATLAB, over which we have no control. To
avoid any doubts, we excluded those instances from both models and base the rest of
this discussion on the remaining 121 branches. We have also taken out the bad dummy
DMU to make sure the great improvement for the non-existing DMU does not inflate our
results.
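The exit-flag check described above has a direct analogue in other LP packages. As a hedged sketch (in Python rather than MATLAB), SciPy's `linprog` exposes `status` and `message` fields that play the role of MATLAB's `exitflag` and `output`; the wrapper name `solve_and_check` is our own.

```python
from scipy.optimize import linprog

def solve_and_check(c, A_ub, b_ub, A_eq=None, b_eq=None, bounds=None):
    """Solve an LP and warn when the solver does not certify optimality,
    analogous to inspecting MATLAB's exitflag after linprog."""
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    # status == 0 means an optimum was found; any other value (iteration
    # limit, infeasibility, unboundedness, numerical difficulty) means a
    # reported solution, if any, is not guaranteed to be optimal.
    if res.status != 0:
        print(f"warning: optimality not certified: {res.message}")
    return res

# Toy LP: maximize x (minimize -x) subject to x <= 5 and 0 <= x <= 10.
res = solve_and_check([-1.0], [[1.0]], [5.0], bounds=[(0, 10)])
```

Flagging these cases, as we did, lets them be excluded from the comparison rather than silently treated as optimal.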
8.2.1 Efficiency Scores
The efficiency scores reported are between zero and one, which makes comparison
more convenient. The average efficiency score obtained through the proposed model was
85%, which is fairly consistent with the scores generally obtained from DEA
branch analyses. Banking is an established and profitable business, and it is expected
that, on average, branches operate reasonably well and are about 85% efficient [Akhi 03],
[McNu 05], [Fuen 03]. The additive DEA model with no consideration for ratios reports
an average of 92% efficiency, which is an overestimation and does not provide good
discrimination. This is mainly because of the incorrect formulation caused by the ratio
variables.
Highly referenced DMUs in our proposed model were also marked as best performers
in the traditional DEA evaluations. Examining the characteristics of the highly referenced
DMUs could provide further insight into why they performed better than their
counterparts and would help management consider other factors pertinent to their better
performance and create guidelines for other units to emulate. In our dataset, we
focused on only one sector and region, and the specialty branches were excluded to keep
the data balanced and eliminate the risk of outliers. Our goal was to test the models
rather than to advise management in their decision-making.
8.2.2 Direction of improvement
DEA not only identifies the inefficient units but also proposes a path for improvement,
usually by setting a target, which is a point on the frontier made up of one or a
combination of efficient units. Our proposed model has, as expected, set the bar higher
and, as a result, has envisioned greater improvements for the inefficient units. Looking at
the absolute values of the targets, the added savings suggested by our model are shown in
Table 8.1. In the additive model and with the variable returns to scale assumption,
Table 8.1: Further savings on inputs (million $)

Fixed Assets/Accruals    -14.90
Loan Loss Experience     -13.36
Employee Expense         -66.95
Equipment Expenses       -10.21
Other Losses              -6.33
Cross Charges            -22.03
it is possible that the model advises acquiring more of some inputs if, by comparison,
it finds that more input would make the unit more efficient due to economies of scale. Of
course, the use of weights and cone ratios can help us achieve more attainable targets;
the tolerances and the lower and upper bounds for each variable need to be developed in
close collaboration with management. Our goal here was merely to test the models,
not to aid the decision-making.
8.3 Discussion of the results: approximation method
None of the branches were left without a direction of improvement in our approximation
method, so we base our discussion on the 130 branches with which the algorithm
began. The average efficiency score of the unobserved DMUs is 96%, with the majority
being one and a few identified as inefficient. We aimed to find unobserved DMUs that
could take the frontier to the next level and expected most of them to attain an efficiency
score of one. We need to bear in mind that the selection of the best hypothetical DMUs
in any round depends on the data we have accumulated until that point, and we cannot
predict future rounds. As a result, we may end up with a few unobserved DMUs
that turn out to be inefficient compared to the other unobserved DMUs we find. The
average efficiency score after the update of the PPS is 93%, which is high, but it shows
improvement over the 95% average obtained using the traditional model. When compared
with the proposed model's efficiency average of 87%, we can see the gap between our estimate
and the optimal reality. With more repetitions or alterations of the sample size and weight
sets, the results could be improved one more level. The average might sound discouraging,
but with a closer look, we can identify that some units have an 11% drop in their efficiency
score, which is equivalent to an opportunity for further improvement that was previously
masked by a false estimate of the frontier.
8.4 Recommendations, limitations and future directions
Our work involved the development of two very different models to enable ratio variables
in DEA and, more importantly, to make the models work computationally. The goal
has been to properly define the PPS and reduce the nonlinear form to a linear program,
as well as to propose a heuristic method to solve the nonlinear cases. Overall, we examined
the existing models and identified the desirable characteristics that we eventually wanted
our model to accommodate, while identifying the PPS correctly. In total, more than
20 models were thoroughly examined, and a methodology was developed for defining and
developing new models that could be used for creating other models as well. Our final
models were tested on a simple yet powerful case study with real data. The results
confirm the applicability of our models. We also found that our method of reducing the
nonlinear format to a linear format could be applicable to the popular Russell model,
and could make it more accessible to practitioners. Our models were able to find the
best practices and showed good discriminatory power.
We were not much concerned with the practicality of the targets. If they are found
unrealistic, it is not difficult to add a few weight restrictions as new constraints to the LP
in our proposed model to fix this issue. For the approximation model, an extra step can be
added after random number generation so that new points in the PPS are checked against
constraints on the variables.
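Such a filtering step could be as simple as a bounds check applied to each generated point. The function below is a hypothetical illustration; the bounds themselves are assumed to come from management or domain knowledge.

```python
def within_bounds(point, lower, upper):
    """Keep a generated PPS point only if every variable lies inside the
    supplied lower/upper bounds (e.g., limits agreed with management)."""
    return all(lo <= v <= hi for v, lo, hi in zip(point, lower, upper))

# Hypothetical usage: filter candidate points after random generation.
candidates = [[1.0, 2.0], [3.5, 2.0], [2.0, 2.9]]
feasible = [p for p in candidates
            if within_bounds(p, [0.0, 0.0], [3.0, 3.0])]
```

Inserting this check before a candidate is recorded keeps the approximated frontier within realistic operating ranges.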
We proposed some heuristic rules as a convergence criterion in our approximation
method. Heuristic rules are not set in stone, and there is always room to devise new
forms with quicker convergence or more deterministic procedures.
Due to the complexity of using ratio variables and the need for information about both
the numerator and the denominator, it is advisable that, whenever possible, normal
data or some proxy measures be used. To report back to management, a desirable
ratio format can be reconstructed (to some extent) from normal data and proxies. Our
method is to be used only if the data is fully available and the use of ratio variables will
result in better targets and insights.
In computer programming, overflow occurs when an arithmetic operation attempts to
create a numeric value that is too large to be stored in the available space. For instance,
when calculating the average of some numbers, as is done in many search algorithms, the
data values are first added up and then divided by the number of data points. This causes
errors or unexpected results if the sum (not necessarily the resulting mean) is too large.
Similarly, underflow can happen when the result of a calculation is smaller than the smallest
value defined in the package. For instance, a small number in the denominator might be
saved as zero in the computer and hence generate an error when the value of the fraction
is accessed. In such instances, the code does not perform as expected. We
faced an underflow issue, so we multiplied our numbers by factors of 100 and 1,000 and,
of course, at the end, rescaled them back to the same range as the actual data.
One must check the exit flags of the optimization algorithm in MATLAB to make
sure that the results are indeed optimal. There could be some cases, as we experienced,
where a final optimal solution to the objective function, though defined, could not be
attained. Often, an ad hoc remedy can be achieved by changing input/output variables
by one tenth of a percentage point. Overall, depending on the data at hand, customized
adjustments might be required.
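The averaging example can be reproduced in a few lines. The sketch below contrasts the sum-then-divide computation with a running mean that never forms the large intermediate sum; the rescaling-by-100-or-1,000 remedy we used for underflow works on the same principle of keeping intermediate values in range.

```python
import math

def mean_naive(xs):
    """Sum first, then divide: the intermediate sum can overflow to
    infinity even though the mean itself is representable."""
    return sum(xs) / len(xs)

def mean_running(xs):
    """Incremental mean: never forms the full sum, so it stays in range
    whenever the individual data values are representable."""
    m = 0.0
    for k, x in enumerate(xs, start=1):
        m += (x - m) / k
    return m

data = [1e308] * 4           # each value is representable as a double...
naive = mean_naive(data)     # ...but their sum overflows, giving inf
robust = mean_running(data)  # the running mean stays at 1e308
```

The same data thus yields a useless infinity from the naive formula and the correct mean from the incremental one.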
Several areas for future development arise from this work on embedding ratio variables
in DEA models, and the opportunities are mainly associated with implementation
techniques for the models developed. These further directions of research
include:
• Extending the computer code to adjust variable ranges automatically, and adding
various checkpoints at the appropriate steps to avoid non-convergence and impossible
solutions while using the MATLAB optimization tool.
• Providing solutions for the DMUs for which the optimization toolbox is unable to
find the optimum.
• Implementing the reduced model in different programming languages or software
packages that might have better capabilities, making it possible to integrate the
model into existing DEA software packages or offer it as an add-on to commercial
packages.
• There is always room and scope to create a better heuristic when it comes to
non-optimal and non-deterministic solutions such as our approximation models.
• Alongside the implementation routes suggested above, one other non-technical
aspect, which might revive the use of ratios in DEA and broaden the DEA market,
is to use non-ratio data and then rework the presentation to craft the results into
popular ratio forms.
Additive Model DEA model which has no orientation and measures efficiency
by maximizing both the input and output slacks
simultaneously.
BCC or VRS Model DEA model which assumes a variable returns to scale
relationship between inputs and outputs.
CCR or CRS Model DEA model which assumes a constant returns to scale
relationship between inputs and outputs.
Ray Unboundedness Scaling up or down of any realized DMU generates a
new feasible DMU.
CRS Constant Returns to Scale A measure where a proportionate increase in inputs
results in an identical proportionate increase in out-
puts.
VRS Variable Returns to scale A measure where a proportionate increase in inputs
does not result in an identical proportionate increase
in outputs.
Convexity An axiom requiring that the multipliers sum to one
when creating a linear combination of DMUs.
Input-Oriented Model DEA model whose objective is to minimize inputs
while keeping outputs constant.
DEA Data Envelopment Analysis A non-parametric, linear programming technique
used for measuring the relative efficiency of units,
considering multiple inputs and outputs simultaneously.
RA Ratio Analysis A technique that uses the ratio of a single output to
a single input and generates a relative efficiency score
by dividing the aforesaid ratio by the corresponding
“best performers” ratio, on this specific ratio defini-
tion.
DMU Decision Making Unit Term used to describe a unit under study such as
bank branch, hospital, firm, etc.
Free Disposability axiom An assumption that says if DMUi is feasible then any
DMU that is doing worse, producing less or consum-
ing more, can be realized too.
Efficient Frontier The facets and edges of the PPS, representing the
most efficient units.
Output-Oriented DEA model whose objective is to maximize outputs
while keeping inputs constant.
Reference Group Set of efficient units to which the inefficient unit has
been most directly compared when calculating its ef-
ficiency rating in DEA.
PPS Production Possibility Set Given the observed data, the set of all possible in-
put/output combinations that could exist.
Profitability Efficiency Model DEA Model that captures the business operations of
a bank branch using revenues ratios as outputs and
branch expenses as inputs.
Full Efficiency Full efficiency is attained by any DMU if and only if
none of its inputs or outputs can be improved without
worsening some of its other inputs or outputs.
Full Relative or Technical efficiency Full technical efficiency is attained by any DMU if and
only if, compared to other observed DMUs, none of its
inputs or outputs can be improved without worsening
some of its other inputs or outputs.
Technical change The relative efficiency of the entity when compared
to broader or newer peer groups.
Scale efficiency Scale efficiency represents the failure to achieve the
most productive scale size and is the difference between
the CRS and VRS models.
Input Slack factor Identifies how much one of the inputs can be reduced
without changing other inputs or outputs.
Input substitution factor Identifies the smallest value for one specific input
among the DMUs belonging to the PPS.
Output Slack factor Identifies how much one of the outputs can be in-
creased without changing other outputs or inputs.
Output substitution factor Identifies the largest value for one specific output
among the DMUs belonging to the PPS.
FDH Free Disposal Hull assumption adds to the observed
production data, the unobserved production points
with output levels equal to or lower than those of
some observed points and more of at least one input;
or with input levels equal to or higher than those of
some observed points and less of at least one output.
Partial m Frontier A method for forming the frontier that does not im-
pose convexity on the production set and allows for
noise (with zero expected values) and as a result is
less sensitive to outliers.
Quantile Frontier A continuous version of partial m method to form the
frontier and more robust to the presence of outliers.
Proposed Non-oriented Model A modified additive DEA model presented in this
work which imposes convexity on the DMUs when
ratio variables are involved and is units- and
translation-invariant. The efficiency score is between
zero and one.
LP Linear programming A method to achieve the best outcome (such as max-
imum output or minimum input) in a mathematical
model whose requirements are represented by linear
relationships.
Bootstrapping Bootstrapping is a re-sampling technique to approx-
imate the distribution of a random variable in order
to estimate a specific statistic of the population.
Monte Carlo A broad class of computational algorithms that rely
on repeated random sampling to obtain numerical re-
sults.
MCMC Markov Chain Monte Carlo Methods A class of algorithms for sampling from
a probability distribution based on constructing a
Markov chain that has the desired distribution as its
equilibrium distribution.
Sparse matrix In numerical analysis, a sparse matrix is a matrix in
which most of the elements are zero.
EMS Efficiency Measurement System A Data Envelopment Analysis (DEA) software
by Holger Scheel.
MATLAB (matrix laboratory) A multi-paradigm numerical computing environment
and fourth-generation programming language
developed by MathWorks.
References
[Aida 98] Aida, K., Cooper, W., Pastor, J., and Sueyoshi, T. “Evaluating water supply
services in Japan with RAM: a range adjusted measure of inefficiency”.
OMEGA: International Journal of Management Science, Vol. 26, pp. 207–
232, 1998.
[Akhi 03] Akhigbe, A. and McNulty, J. E. “The profit efficiency of small US commercial
banks”. Journal of Banking and Finance, Vol. 27, pp. 307–325, 2003.
[Alex 10] Alexander, W. R. J., Haug, A. A., and Jaforullah, M. “A two-stage double-
bootstrap data envelopment analysis of efficiency differences of New Zealand
secondary schools”. Journal of Productivity Analysis, Vol. 34, No. 2, pp. 99–
110, 2010.
[Ali 93] Ali, A. I. and Seiford, L. “Computational accuracy and infinitesimals in Data
Envelopment Analysis”. INFOR, Vol. 31, pp. 290–297, 1993.
[Ande 93] Andersen, P. and Petersen, N. C. “A Procedure for Ranking Efficient Units in
Data Envelopment Analysis”. Management Science, Vol. 39, pp. 1261–1264,
1993.
[Apar 07] Aparicio, J., Ruiz, J. L., and Sirvent, I. “Closest targets and minimum
distance to the Pareto-efficient frontier in DEA”. Journal of Productivity
Analysis, Vol. 28, pp. 209–218, 2007.
[Arag 05] Aragon, Y., Daouia, A., and Thomas-Agnan, C. “Nonparametric frontier
estimation: A conditional quantile-based approach”. Econometric Theory,
Vol. 21, No. 2, pp. 358–389, 2005.
[Asmi 10] Asmild, M. and Pastor, J. “Slack free MEA and RDM with comprehensive
efficiency measure”. OMEGA: International Journal of Management Science,
Vol. 38, pp. 475–483, 2010.
[Bank 84] Banker, R., Charnes, A., and Cooper, W. “Models for the estimation of tech-
nical and scale inefficiencies in Data Envelopment Analysis”. Management
Science, Vol. 30, No. 9, pp. 1078–1092, 1984.
[Bank 87] Banker, R., Charnes, A., Cooper, W., and Maindiratta, A. “A comparison
of data envelopment analysis and translog estimates of production frontiers
using simulated observations from a known technology”. Applications in
Modern Production Theory Inefficiency and Productivity, 1987.
[Bank 93] Banker, R., Gadh, V., and Gorr, W. “A Monte Carlo comparison of two
production frontier estimation methods: corrected ordinary least squares
and data envelopment analysis”. European Journal of Operational Research,
Vol. 67, 1993.
[Bous 09] Boussemart, J.-P. and Leleu, H. “Measuring potential gains from specializa-
tion under non-convex technologies”. IESEG School of Management Working
Papers, No. 2, 2009.
[Bowl 04] Bowlin, W. F. “Financial analysis of civil reserve air fleet participants us-
ing data envelopment analysis”. European Journal of Operational Research,
Vol. 154, pp. 691–709, 2004.
[Broc 98] Brockett, P., Cooper, W., Shin, H., and Wang, Y. “Inefficiency and Conges-
tion in Chinese Production Before and After the 1978 Economic Reforms”.
Socio-Economic Planning Sciences, Vol. 32, pp. 1–20, 1998.
[Caza 02] Cazals, C., Florens, J. P., and Simar, L. “Nonparametric frontier estimation:
a robust approach”. Journal of Econometrics, Vol. 106, pp. 1–25, 2002.
[Cham 96] Chambers, R. G., Chung, Y., and Fare, R. “Benefit and distance functions”.
Journal of Economic theory, Vol. 70, 1996.
[Char 78] Charnes, A., Cooper, W. W., and Rhodes, E. “Measuring the efficiency of
decision making units”. European Journal of Operational Research, Vol. 2,
pp. 429–444, 1978.
[Char 81] Charnes, A., Cooper, W. W., and Rhodes, E. “Evaluating program and
managerial efficiency: with an illustrative application to the Program Follow
Through experiment in U.S. public school education”. Management Science,
Vol. 27, pp. 668–697, 1981.
[Char 82] Charnes, A., Cooper, W., Seiford, L., and Stutz, J. “A multiplicative model
for efficiency analysis”. Socio-Economic Planning Sciences, Vol. 16, No. 5,
pp. 223–224, 1982.
[Char 83] Charnes, A., Cooper, W., Seiford, L., and Stutz, J. “Invariant multiplicative
efficiency and piecewise Cobb-Douglas envelopments”. Operations Research
Letters, Vol. 2, No. 3, pp. 101–103, 1983.
[Char 85] Charnes, A., Cooper, W., Golany, B., Seiford, L., and Stutz, J. “Foundations
of data envelopment analysis for Pareto-Koopmans efficient empirical
production functions”. Journal of Econometrics, Vol. 30, pp. 91–107, 1985.
[Char 87] Charnes, A., Cooper, W., Rousseau, J., and Semple, J. “Data Envelopment
Analysis and axiomatic notions of efficiency and reference sets”. Tech. Rep.,
Center for Cybernetic Studies, The University of Texas at Austin, 1987.
[Chen 02a] Chen, Y. and Ali, A. I. “Output-input ratio analysis and DEA frontier”.
European Journal of Operational Research, Vol. 142, pp. 476–479, 2002.
[Chen 02b] Chen, Y. and Ali, A. I. “Output-input ratio analysis and DEA frontier”.
European Journal of Operational Research, Vol. 142, pp. 476–479, 2002.
[Chen 07] Chen, W.-C. and McGinnis, L. F. “Reconciling ratio analysis and DEA as
performance assessment tools”. European Journal of Operational Research,
Vol. 178, pp. 277–291, 2007.
[Chen 14] Chen, K. and Kou, M. “Weighted Additive DEA models associated with
dataset standardization techniques”. Tech. Rep., Chinese Academy of sci-
ences, 2014.
[Cher 01] Cherchye, L., Kuosmanen, T., and Post, G. T. “Alternative Treatments of
Congestion in DEA: A rejoinder to Cooper, Gu, and Li”. European Journal
of Operational Research, Vol. 132, No. 1, pp. 75–80, 2001.
[Cher 99] Cherchye, L., Kuosmanen, T., and Post, T. “Why convexify ? An assessment
of convexity axioms in DEA”. Helsinki School of Economics and Business
Administration Working Papers, 1999.
[Chun 97] Chung, Y. H., Fare, R., and Grosskopf, S. “Productivity and undesirable out-
puts: A directional distance function approach”. Journal of Environmental
Management, Vol. 51, 1997.
[Conc 03] Conceicao, M., Portela, A. S., Borges, P. C., and Thanassoulis, E. “Finding
closest targets in non-oriented Data Envelopment Analysis models: the case
of convex and non-convex technologies”. Journal of Productivity Analysis,
Vol. 19, 2003.
[Cook 14] Cook, W. D., Tone, K., and Zhu, J. “Data Envelopment Analysis: Prior to
choosing a model”. OMEGA: International Journal of Management Science,
Vol. 44, pp. 1–4, 2014.
[Coop 01a] Cooper, W. W., Gu, B., and Li, S. “Comparison and evaluation of alternative
approaches to the Treatments of Congestion in DEA”. European Journal of
Operational Research, Vol. 132, No. 1, pp. 62–74, 2001.
[Coop 01b] Cooper, W. W., Gu, B., and Li, S. “Note: Alternative Treatments of Con-
gestion in DEA- a response to the Cherchye, Kuosmanen and Post critique”.
European Journal of Operational Research, Vol. 132, No. 1, pp. 81–87, 2001.
[Coop 04] Cooper, W. W., Seiford, L. M., and Zhu, J. Handbook on data envelopment
analysis. Kluwer Academic Publishers, 2004.
[Coop 11] Cooper, W. W., Pastor, J. T., Borras, F., and Pastor, A. D. “BAM: a
bounded adjusted measure of efficiency for use with bounded additive mod-
els”. Journal of Productivity Analysis, Vol. 35, 2011.
[Coop 95] Cooper, W. W. and Pastor, J. T. “Global Efficiency Measurement in DEA”.
Working paper, Depto Este Inv. Oper. Universidad Alicante, Alicante, Spain.,
1995.
[Coop 99a] Cooper, W., Park, K. S., and Pastor, J. “RAM: A Range Adjusted Measure
of Inefficiency for Use with Additive Models, and Relations to Other Models
and Measures in DEA”. Journal of Productivity Analysis, Vol. 11, pp. 5–42,
1999.
[Coop 99b] Cooper, W., Park, K. S., and Yu, G. “IDEA and AR-IDEA: Models for
dealing with imprecise data in DEA”. Management Science, Vol. 45, No. 4,
pp. 597–607, 1999.
[Cron 02] Cronje, J. J. L. “Data Envelopment Analysis as a measure for technical
efficiency measurement in banking - a research framework”. Southern African
Business Review, Vol. 6, No. 2, pp. 32–41, 2002.
[Daou 07] Daouia, A. and Simar, L. “Nonparametric Frontier estimation: A Multi-
variate Conditional Quantile Approach”. Journal of Econometrics, Vol. 140,
No. 2, pp. 375–400, 2007.
[Depr 84] Deprins, D., Simar, L., and Tulkens, H. “Measuring labor-efficiency in post
offices”. In: Marchand, M., Pestieau, P., and Tulkens, H., Eds., The perfor-
mance of public enterprises: concepts and measurement, Amsterdam, North-
Holland, 1984.
[Desp 07] Despic, O., Despic, M., and Paradi, J. C. “DEA-R: ratio-based comparative
efficiency model, its mathematical relation to DEA and its use in applica-
tions”. Journal of Productivity Analysis, Vol. 28, pp. 33–44, 2007.
[Dyso 10] Dyson, R. G. and Shale, E. “Data envelopment analysis, operational research
and uncertainty”. Journal of the Operational Research Society, Vol. 61, No. 1,
pp. 25–34, 2010.
[Efro 79] Efron, B. “Bootstrap methods: another look at the jackknife”. The Annals
of Statistics, pp. 1–26, 1979.
[Efro 82] Efron, B. The jackknife, the bootstrap and other resampling plans. Vol. 38,
SIAM, 1982.
[Efro 94] Efron, B. and Tibshirani, R. J. An introduction to the bootstrap. CRC press,
1994.
[Emro 08] Emrouznejad, A., Parker, B. R., and Tavares, G. “Evaluation of research
in efficiency and productivity: A survey and analysis of the first 30 years of
scholarly literature in DEA”. Socio-Economic Planning Sciences, 2008.
[Emro 09] Emrouznejad, A. and Amin, G. R. “DEA models for ratio data: Convexity
consideration”. Applied Mathematical Modeling, Vol. 33, No. 1, pp. 486–498,
2009.
[Fare 00] Fare, R. and Grosskopf, S. “Theory and application of directional distance
functions”. Journal of Productivity Analysis, Vol. 2, pp. 93–104, 2000.
[Fare 10a] Fare, R. and Grosskopf, S. “Directional distance functions and slacks-based
measures of efficiency”. European Journal of Operational Research, Vol. 200,
pp. 320–322, 2010.
[Fare 10b] Fare, R. and Grosskopf, S. “Directional distance functions and slacks-based
measures of efficiency: Some clarifications”. European Journal of Operational
Research, Vol. 206, p. 702, 2010.
[Fare 78] Fare, R. and Lovell, C. “Measuring the Technical Efficiency of Production”.
Journal of Economic Theory, Vol. 19, pp. 150–162, 1978.
[Fare 83a] Fare, R. and Grosskopf, S. “Measuring Congestion in Production”.
Zeitschrift fur Nationalokonomie, Vol. 43, pp. 257–271, 1983.
[Fare 83b] Fare, R., Lovell, C. A. K., and Zieschang, K. “Measuring the Technical
Efficiency of Multiple Output Production Technologies”. In: Eichhorn, W.,
Henn, R., Neumann, K., and Sheppard, R. W., Eds., Quantitative Studies
on Production and Prices, Springer, Wien, 1983.
[Fare 85] Fare, R., Grosskopf, S., and Lovell, C. The Measurement of Efficiency of
Production. Boston: Kluwer-Nijhoff Publishing, 1985.
[Farr 57] Farrell, M. “The measurement of productive efficiency”. Journal of Royal
Statistical Society, Series A, Vol. 120, No. 3, pp. 253–281, 1957.
[Farr 59] Farrell, M. “Convexity assumption in theory of competitive markets”. Jour-
nal of Political Economy, Vol. 67, 1959.
[Fero 03] Feroz, E. H., Kim, S., and Raab, R. L. “Financial statement analysis: A
Data Envelopment Analysis approach”. Journal of the Operational Research
Society, Vol. 54, pp. 48–58, 2003.
[Ferr 97] Ferrier, G. D. and Hirschberg, J. G. “Bootstrapping confidence intervals
for linear programming efficiency scores: With an illustration using Italian
banking data”. Journal of Productivity Analysis, Vol. 8, No. 1, pp. 19–33,
1997.
[Ferr 99] Ferrier, G. D. and Hirschberg, J. G. “Can we bootstrap DEA scores?”.
Journal of Productivity Analysis, Vol. 11, No. 1, pp. 81–92, 1999.
[Frei 99] Frei, F. X. and Harker, P. T. “Projections onto efficient frontiers: Theoret-
ical and conceptual extensions to DEA”. Journal of Productivity Analysis,
Vol. 11, No. 3, pp. 275–300, 1999.
[Fuen 03] Fuentes, R. and Vergara, M. “Explaining Bank Efficiency: Bank Size or
Ownership Structure?”. Proceedings of the VIII Meeting of the Research
Network of Central Banks of the Americas, 2003.
[Fuku 09] Fukuyama, H. and Weber, W. “A directional slack based measure of technical
inefficiency”. Socio-Economic Planning Sciences, Vol. 43, pp. 274–287, 2009.
[Gall 03] Gallagher, T. J. and Andrew, J. D. Financial Management. Freeload Press
Ltd., 2003.
[Gass 04] Gass, S. I. and Vinjamuri, S. “Cycling in linear programming problems”.
Computers and Operations Research, Vol. 31, pp. 303–311, 2004.
[Gema 84] Geman, S. and Geman, D. “Stochastic relaxation, Gibbs distributions, and
the Bayesian restoration of images”. Pattern Analysis and Machine Intelli-
gence, IEEE Transactions on, No. 6, pp. 721–741, 1984.
[Gong 92] Gong, B. and Sickles, R. “Finite sample evidence on the performance of
stochastic frontiers and data envelopment analysis using panel data”. Journal
of Econometrics, Vol. 51, 1992.
[Gonz 07] Gonzalez-Bravo, M. I. “Prior-Ratio-Analysis procedure to improve data en-
velopment analysis for performance measurement”. Journal of the Opera-
tional Research Society, Vol. 58, pp. 1214–1222, 2007.
[Gree 97] Green, R. H., Cook, W., and Doyle, J. “A Note on the Additive Data
Envelopment Analysis Model”. The Journal of the Operational Research
Society, Vol. 48, No. 4, pp. 446–448, 1997.
[Grif 99] Grifell-Tatje, E. and Lovell, C. “Profits and productivity”. Management
Science, Vol. 45, No. 9, pp. 1177–1193, 1999.
[Gsta 03] Gstach, D. “A Statistical Framework for Estimating Output-Specific Effi-
ciencies”. 2003.
[Gsta 95] Gstach, D. “Comparing structural efficiency of unbalanced subsamples: A
resampling adaptation of data envelopment analysis”. Empirical Economics,
Vol. 20, No. 3, pp. 531–542, 1995.
[Hast 70] Hastings, W. K. “Monte Carlo sampling methods using Markov chains and
their applications”. Biometrika, Vol. 57, No. 1, pp. 97–109, 1970.
[Holl 03] Hollingsworth, B. and Smith, P. “Use of ratios in data envelopment analysis”.
Applied Economics Letters, Vol. 10, pp. 733–735, 2003.
[Holl 06] Hollo, D. and Nagy, M. “Bank Efficiency in the Enlarged European Union”.
MNB working papers, 2006.
[Knei 03] Kneip, A., Simar, L., and Wilson, P. W. “Asymptotics for DEA estimators
in nonparametric frontier models”. Tech. Rep., Discussion paper, 2003.
[Kriv 08] Krivonozhko, V. E., Utkin, O. B., Safin, M. M., and Lychev, A. V. “On
comparison of the Ratio Analysis and the DEA approach in financial area”.
Conference on the Uses of Frontier Efficiency Methodologies for Performance
Measurement in the Financial Services Sector, 2008.
[Levk 12] Levkoff, S. B., Russell, R. R., and Schworm, W. “Boundary problems with
the Russell graph measure of technical efficiency: a refinement”. Journal of
Productivity Analysis, Vol. 37, No. 3, pp. 239–248, 2012.
[Lewi 97] Lewin, A. Y. and Seiford, L. M. “Extending the frontiers of Data Envelop-
ment Analysis”. Annals of Operations Research, Vol. 73, No. 0, pp. 1–11,
1997.
[Loth 98] Lothgren, M. “How to bootstrap DEA estimators: a Monte Carlo compari-
son”. WP in Economics and Finance, No. 223, 1998.
[Loth 99] Lothgren, M. and Tambour, M. “Bootstrapping the data envelopment anal-
ysis Malmquist productivity index”. Applied Economics, Vol. 31, No. 4,
pp. 417–425, 1999.
[Love 95a] Lovell, C. K., Pastor, J. T., and Turner, J. A. “Measuring macroeconomic
performance in the OECD: A comparison of European and non-European
countries”. European Journal of Operational Research, Vol. 87, No. 3,
pp. 507–518, 1995.
[Love 95b] Lovell, C. K. and Pastor, J. “Units Invariant and Translation invariant DEA
models”. Operations Research Letters, Vol. 18, 1995.
[Luen 92] Luenberger, D. “Benefit functions and duality”. Journal of Mathematical
Economics, Vol. 21, 1992.
[McNu 05] McNulty, J. “Profit Efficiency Sources and Differences among Small and
Large U.S. Commercial Banks”. Journal of Economics and Finance, 2005.
[Metr 49] Metropolis, N. and Ulam, S. “The Monte Carlo method”. Journal of the
American Statistical Association, Vol. 44, No. 247, pp. 335–341, 1949.
[Metr 53] Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H., and
Teller, E. “Equation of state calculations by fast computing machines”. The
Journal of Chemical Physics, Vol. 21, No. 6, pp. 1087–1092, 1953.
[Naja 05] Najadat, H., Nygard, K. E., and Schesvold, D. “Clustering-Based Method
for Data Envelopment Analysis”. In: MSV, pp. 255–264, 2005.
[Oles 03] Olesen, O. and Petersen, N. “Identification and use of efficient faces and facets
in DEA”. Journal of Productivity Analysis, Vol. 20, pp. 323–360, 2003.
[Park 00] Park, B., Simar, L., and Weiner, C. “The FDH estimator for productivity
efficiency scores”. Econometric Theory, Vol. 16, pp. 855–877, 2000.
[Past 13a] Pastor, J., Aparicio, J., Monge, J., and Pastor, D. “Modeling CRS bounded
additive DEA models and characterizing their Pareto-efficient points”. Jour-
nal of Productivity Analysis, Vol. 40, No. 3, pp. 285–292, 2013.
[Past 96] Pastor, J. “Chapter 3 Translation invariance in data envelopment analysis:
A generalization”. Annals of Operations Research, Vol. 66, No. 2, pp. 91–102,
1996.
[Past 99a] Pastor, J. T., Ruiz, J. L., and Sirvent, I. “An enhanced DEA Russell graph
efficiency measure”. European Journal of Operational Research, Vol. 115,
pp. 596–607, 1999.
[Pate 00] Paterson, I. “New Models for Data Envelopment Analysis Measuring Effi-
ciency Outwith the VRS Frontier”. 2000.
[Port 02] Portela, A. S. and Thanassoulis, E. “Profit efficiency in DEA”. Darmstadt
Discussion Papers in Economics, Aston Business School, 2002.
[Port 04] Portela, M. C. A. S., Thanassoulis, E., and Simpson, G. “Negative Data in
DEA, A directional distance approach applied to bank branches”. Journal
of the Operational Research Society, Vol. 55, pp. 1111–1121, 2004.
[Port 07] Portela, M. and Thanassoulis, E. “Developing a decomposable measure of
profit efficiency using DEA”. Journal of the Operational Research Society,
Vol. 58, No. 4, pp. 481–490, 2007.
[Ray 00] Ray, S. C. Data Envelopment Analysis: Theory and Techniques for Eco-
nomics and Operations Research. Kluwer Academic Publishers, 2000.
[RGDy 01] Dyson, R. G., Allen, R., Camanho, A., Podinovski, V., Sarrico, C., and Shale,
E. “Pitfalls and protocols in DEA”. European Journal of Operational Re-
search, Vol. 132, No. 2, pp. 245–259, 2001.
[Rolf 89] Fare, R., Grosskopf, S., Lovell, C., and Pasurka, C. “Multilateral Produc-
tivity Comparisons When Some Outputs Are Undesirable: A Nonparametric
Approach”. Review of Economics and Statistics, Vol. 71, pp. 90–98, 1989.
[Rugg 98] Ruggiero, J. and Bretschneider, S. “The weighted Russell measure of tech-
nical efficiency”. European Journal of Operational Research, Vol. 108, No. 2,
pp. 438–451, 1998.
[Sadj 10] Sadjadi, S. and Omrani, H. “A bootstrapped robust data envelopment analy-
sis model for efficiency estimating of telecommunication companies in Iran”.
Telecommunications Policy, Vol. 34, No. 4, pp. 221–232, 2010.
[Shar 07] Sharp, J. A., Meng, W., and Liu, W. “A Modified Slacks-based measure
model for data envelopment analysis with natural negative outputs and
inputs”. The Journal of the Operational Research Society, Vol. 58, 2007.
[Shep 70] Shephard, R. W. Theory of Cost and Production Functions. Princeton Uni-
versity Press, 1970.
[Siga 09] Sigaroudi, S. Incorporating Ratios in DEA: An Application to Real Data.
Master’s thesis, University of Toronto, 2009.
[Sima 00] Simar, L. and Wilson, P. W. “A general methodology for bootstrapping in
non-parametric frontier models”. Journal of Applied Statistics, Vol. 27, No. 6,
pp. 779–802, 2000.
[Sima 08] Simar, L. and Wilson, P. W. “Statistical inference in nonparametric fron-
tier models: recent developments and perspectives”. In: Fried, H., Lovell,
C. A. K., and Schmidt, S. S., Eds., The Measurement of Productive Efficiency,
Oxford University Press, pp. 421–521, 2008.
[Sima 98] Simar, L. and Wilson, P. W. “Sensitivity analysis of efficiency scores: How to
bootstrap in nonparametric frontier models”. Management Science, Vol. 44,
No. 1, pp. 49–61, 1998.
[Sima 99a] Simar, L. and Wilson, P. W. “Estimating and bootstrapping Malmquist
indices”. European Journal of Operational Research, Vol. 115, No. 3, pp. 459–
471, 1999.
[Sima 99b] Simar, L. and Wilson, P. W. “Of course we can bootstrap DEA scores!
But does it mean anything? Logic trumps wishful thinking”. Journal of
Productivity Analysis, Vol. 11, No. 1, pp. 93–97, 1999.
[Sima 99c] Simar, L. and Wilson, P. W. “Some problems with the Ferrier/Hirschberg
bootstrap idea”. Journal of Productivity Analysis, Vol. 11, No. 1, pp. 67–80,
1999.
[Sowl 04] Sowlati, T. and Paradi, J. C. “Establishing the “practical frontier” in data
envelopment analysis”. Omega, The International Journal of Management
Science, Vol. 32, pp. 261–272, 2004.
[Stei 01] Steinmann, L. and Zweifel, P. “The Range Adjusted Measure (RAM) in
DEA: Comment”. Journal of Productivity Analysis, Vol. 15, No. 2, pp. 139–
144, 2001.
[Than 12] Thanassoulis, E., Kortelainen, M., and Allen, R. “Improving envelopment
in data envelopment analysis under variables returns to scale”. European
Journal of Operational Research, Vol. 218, 2012.
[Than 92] Thanassoulis, E. and Dyson, R. “Estimating preferred target input-output
levels using data envelopment analysis”. European Journal of Operational
Research, Vol. 56, 1992.
[Than 96] Thanassoulis, E., Boussofiane, A., and Dyson, R. “A Comparison of Data
Envelopment Analysis and Ratio Analysis as Tools for Performance Assess-
ment”. Omega, International Journal of Management Science, Vol. 24, No. 3,
pp. 229–244, 1996.
[Tone 01] Tone, K. “A slacks-based measure of efficiency in data envelopment analysis”.
European Journal of Operational Research, Vol. 130, 2001.
[Tone 99] Tone, K. “An Extension of the Two-Phase Process in the CCR Model”. 1999.
[Tsio 03] Tsionas, E. G. “Combining DEA and stochastic frontier models: An empir-
ical Bayes approach”. European Journal of Operational Research, Vol. 147,
No. 3, pp. 499–510, 2003.
[Tulk 93] Tulkens, H. “On FDH Analysis: Some Methodological Issues and Applica-
tions to Retail Banking, Courts and Urban Transit”. Journal of Productivity
Analysis, Vol. 4, No. 1–2, pp. 183–210, 1993.
[Tzio 12] Tziogkidis, P. “The Simar and Wilson bootstrap DEA approach: a cri-
tique”. Tech. Rep., Cardiff University, Cardiff Business School, Economics
Section, 2012.
[Wu 05] Wu, D., Liang, L., Huang, Z., and Li, S. X. “Aggregated Ratio Analysis in
DEA”. International Journal of Information Technology and Decision Mak-
ing, Vol. 4, No. 3, pp. 369–384, 2005.
[Yang 07] Yang, H. and Pollitt, M. “Distinguishing Weak and Strong Disposability
among Undesirable Outputs in DEA: The Example of the Environmental
Efficiency of Chinese Coal-Fired Power Plants”. Electricity Policy Research,
2007.
[Zhou 07] Zhou, P., Poh, K. L., and Ang, B. W. “A non-radial DEA approach to
measuring environmental performance”. European Journal of Operational
Vol. 178, No. 1, pp. 1–9, 2007.
[Zies 84] Zieschang, K. “An Extended Farrell Efficiency Measure”. Journal of Eco-
nomic Theory, Vol. 33, pp. 387–396, 1984.