# Data envelopment analysis

Data envelopment analysis (DEA) is a nonparametric method in operations research and economics for the estimation of production frontiers.[1] It is used to empirically measure productive efficiency of decision making units (DMUs). Although DEA has a strong link to production theory in economics, the tool is also used for benchmarking in operations management, where a set of measures is selected to benchmark the performance of manufacturing and service operations. In benchmarking, the efficient DMUs, as defined by DEA, may not necessarily form a “production frontier”, but rather lead to a “best-practice frontier” (Charnes A., W. W. Cooper and E. Rhodes (1978)).[2]

In contrast to parametric methods that require the ex-ante specification of a production or cost function, non-parametric approaches compare feasible input and output combinations based on the available data only.[3] DEA, one of the most commonly used non-parametric methods, owes its name to its enveloping property of the dataset's efficient DMUs: the empirically observed, most efficient DMUs constitute the production frontier against which all DMUs are compared. DEA's popularity stems from its relative lack of assumptions, its ability to benchmark multi-dimensional inputs and outputs, and its computational ease, since it can be expressed as a linear program despite aiming to calculate efficiency ratios.[4]

## History

Building on the ideas of Farrell (1957),[5] the seminal work "Measuring the efficiency of decision making units" by Charnes, Cooper & Rhodes (1978)[1] applied linear programming to estimate an empirical production technology frontier for the first time. In Germany, the procedure had been used earlier to estimate the marginal productivity of R&D and other factors of production. Since then, a large number of books and journal articles have been written on DEA or on applying DEA to various sets of problems.

Starting from the CCR model by Charnes, Cooper and Rhodes,[6] many extensions to DEA have been proposed in the literature. They range from adapting implicit model assumptions such as input and output orientation, distinguishing technical and allocative efficiency,[7] adding limited disposability[8] of inputs/outputs or varying returns-to-scale[9] to techniques that utilize DEA results and extend them for more sophisticated analyses, such as stochastic DEA[10] or cross-efficiency analysis.[11]

## Techniques

In a one-input, one-output scenario, efficiency is merely the ratio of the output produced to the input used, and comparing several entities/DMUs based on it is trivial. However, when more inputs or outputs are added, the efficiency computation becomes more complex. Charnes, Cooper, and Rhodes (1978)[1] in their basic DEA model (CCR) define the objective function to find the efficiency ${\displaystyle (\theta _{j})}$  of ${\displaystyle DMU_{j}}$  as:

${\displaystyle \max \quad \theta _{j}={\frac {\sum \limits _{m=1}^{M}y_{m}^{j}u_{m}^{j}}{\sum \limits _{n=1}^{N}x_{n}^{j}v_{n}^{j}}},}$

where the ${\displaystyle M}$  known outputs ${\displaystyle y_{1}^{j},...,y_{M}^{j}}$  of ${\displaystyle DMU_{j}}$  are multiplied by their respective weights ${\displaystyle u_{1}^{j},...,u_{M}^{j}}$  and divided by the ${\displaystyle N}$  inputs ${\displaystyle x_{1}^{j},...,x_{N}^{j}}$  multiplied by their respective weights ${\displaystyle v_{1}^{j},...,v_{N}^{j}}$ .
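As a minimal sketch of this ratio for a fixed set of weights, the function below computes the weighted output sum divided by the weighted input sum. The function name and the sample numbers are purely illustrative and do not come from the cited literature.

```python
def efficiency_ratio(outputs, inputs, u, v):
    """Weighted sum of outputs divided by weighted sum of inputs.

    outputs, u: the M outputs of one DMU and their weights
    inputs, v:  the N inputs of the same DMU and their weights
    """
    if len(outputs) != len(u) or len(inputs) != len(v):
        raise ValueError("each output and each input needs exactly one weight")
    weighted_output = sum(y * w for y, w in zip(outputs, u))
    weighted_input = sum(x * w for x, w in zip(inputs, v))
    return weighted_output / weighted_input


if __name__ == "__main__":
    # Hypothetical DMU with two outputs and two inputs: (4*1 + 2*2) / (3*2 + 5*1)
    print(efficiency_ratio([4, 2], [3, 5], u=[1, 2], v=[2, 1]))
```

DEA does not take the weights as given: it searches for the weights that maximize this ratio for the DMU under evaluation.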

The efficiency score ${\displaystyle \theta _{j}}$  is maximized under the constraint that, when those weights are applied to each ${\displaystyle DMU_{k},\quad k=1,...,K}$ , no efficiency score exceeds one:

${\displaystyle {\frac {\sum \limits _{m=1}^{M}y_{m}^{k}u_{m}^{j}}{\sum \limits _{n=1}^{N}x_{n}^{k}v_{n}^{j}}}\leq 1\qquad k=1,...,K,}$

and all inputs, outputs and weights have to be non-negative. To allow for linear optimization, one typically constrains either the weighted sum of outputs or the weighted sum of inputs of ${\displaystyle DMU_{j}}$  to equal a fixed value (typically 1).
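As a sketch of this normalization, assume the weighted sum of the inputs of ${\displaystyle DMU_{j}}$  is fixed to 1 (fixing the weighted output sum is the alternative). Multiplying each ratio constraint by its positive denominator then turns the fractional program into a linear program in the same symbols:

${\displaystyle \max \quad \theta _{j}=\sum \limits _{m=1}^{M}y_{m}^{j}u_{m}^{j}}$

subject to

${\displaystyle \sum \limits _{n=1}^{N}x_{n}^{j}v_{n}^{j}=1,}$

${\displaystyle \sum \limits _{m=1}^{M}y_{m}^{k}u_{m}^{j}-\sum \limits _{n=1}^{N}x_{n}^{k}v_{n}^{j}\leq 0\qquad k=1,...,K,}$

${\displaystyle u_{m}^{j}\geq 0,\quad v_{n}^{j}\geq 0.}$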

Because this optimization problem's dimensionality is equal to the sum of its inputs and outputs, it is crucial to select the smallest number of inputs/outputs that collectively and accurately capture the process one attempts to characterize. Because the production frontier envelopment is done empirically, several guidelines exist on the minimum number of DMUs required for good discriminatory power of the analysis, given homogeneity of the sample. This minimum number of DMUs varies between twice the sum of inputs and outputs (${\displaystyle 2(M+N)}$ ) and twice the product of inputs and outputs (${\displaystyle 2MN}$ ).
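The two guidelines are simple to evaluate for a given model size. The helper below is an illustrative sketch (the function name is an assumption) that returns both values so they can be compared with the number of DMUs available.

```python
def min_dmu_guidelines(num_outputs, num_inputs):
    """Return the two rule-of-thumb minimum DMU counts: (2(M+N), 2MN)."""
    twice_sum = 2 * (num_outputs + num_inputs)
    twice_product = 2 * num_outputs * num_inputs
    return twice_sum, twice_product


if __name__ == "__main__":
    # M = 3 outputs and N = 4 inputs
    print(min_dmu_guidelines(3, 4))  # (14, 24)
```

With few variables the sum-based guideline is the stricter one, while for larger models the product-based guideline grows faster.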

Some advantages of the DEA approach are:

• no need to explicitly specify a mathematical form for the production function
• capable of handling multiple inputs and outputs
• capable of being used with any input-output measurement, although ordinal variables remain tricky
• the sources of inefficiency can be analysed and quantified for every evaluated unit
• the dual of the optimization problem identifies the other DMUs against which each DMU is evaluated
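The last point can be made concrete with the dual (envelopment) form of the model. The following is a sketch of the standard input-oriented formulation, assuming the weighted input sum was fixed to 1 in the linearized problem; the intensity variables ${\displaystyle \lambda _{k}}$  are notation introduced here for illustration:

${\displaystyle \min \quad \theta _{j}}$

subject to

${\displaystyle \sum \limits _{k=1}^{K}\lambda _{k}x_{n}^{k}\leq \theta _{j}x_{n}^{j}\qquad n=1,...,N,}$

${\displaystyle \sum \limits _{k=1}^{K}\lambda _{k}y_{m}^{k}\geq y_{m}^{j}\qquad m=1,...,M,}$

${\displaystyle \lambda _{k}\geq 0\qquad k=1,...,K.}$

The DMUs with ${\displaystyle \lambda _{k}>0}$  at the optimum form the reference set against which ${\displaystyle DMU_{j}}$  is evaluated.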

Some of the disadvantages of DEA are:

• results are sensitive to the selection of inputs and outputs
• high efficiency values can be obtained by being truly efficient or having a niche combination of inputs/outputs
• the number of efficient firms on the frontier increases with the number of input and output variables
• a DMU's efficiency scores may be obtained by using non-unique combinations of weights on the input and/or output factors

## Example

Assume that we have the following data:

• Unit 1 produces 100 items per day, and its inputs are 10 dollars for materials and 2 labour-hours
• Unit 2 produces 80 items per day, and its inputs are 8 dollars for materials and 4 labour-hours
• Unit 3 produces 120 items per day, and its inputs are 12 dollars for materials and 1.5 labour-hours

To calculate the efficiency of unit 1, we define the objective function (OF) as

• ${\displaystyle MaxEfficiency:(100u_{1})/(10v_{1}+2v_{2})}$

which is subject to (ST) the efficiency of every unit (efficiency cannot be larger than 1):

• Efficiency of unit 1: ${\displaystyle (100u_{1})/(10v_{1}+2v_{2})\leq 1}$
• Efficiency of unit 2: ${\displaystyle (80u_{1})/(8v_{1}+4v_{2})\leq 1}$
• Efficiency of unit 3: ${\displaystyle (120u_{1})/(12v_{1}+1.5v_{2})\leq 1}$

and non-negativity:

• ${\displaystyle u,v\geq 0}$

A fraction with decision variables in the numerator and denominator is nonlinear. Since we are using a linear programming technique, we need to linearize the formulation, such that the denominator of the objective function is constant (in this case 1), then maximize the numerator.

The new formulation would be:

• OF
• ${\displaystyle MaxEfficiency:100u_{1}}$
• ST
• Efficiency of unit 1: ${\displaystyle 100u_{1}-(10v_{1}+2v_{2})\leq 0}$
• Efficiency of unit 2: ${\displaystyle 80u_{1}-(8v_{1}+4v_{2})\leq 0}$
• Efficiency of unit 3: ${\displaystyle 120u_{1}-(12v_{1}+1.5v_{2})\leq 0}$
• Denominator of nonlinear OF: ${\displaystyle 10v_{1}+2v_{2}=1}$
• Non-negativity: ${\displaystyle u,v\geq 0}$
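The linearized formulation is small enough to solve by brute force. The sketch below is not a production LP solver: it enumerates the vertices of the feasible region in exact rational arithmetic and keeps the best one. The names `UNITS`, `det3`, `solve3` and `ccr_efficiency` are illustrative assumptions; in practice a linear programming library would normally be used.

```python
from fractions import Fraction
from itertools import combinations

# Data from the example: unit -> (output, materials input, labour input).
UNITS = {
    1: (100, 10, 2),
    2: (80, 8, 4),
    3: (120, 12, Fraction(3, 2)),
}


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(rows, rhs):
    """Solve a 3x3 linear system by Cramer's rule; return None if singular."""
    d = det3(rows)
    if d == 0:
        return None
    solution = []
    for col in range(3):
        replaced = [list(r) for r in rows]
        for r in range(3):
            replaced[r][col] = rhs[r]
        solution.append(Fraction(det3(replaced)) / d)
    return solution


def ccr_efficiency(target, units=UNITS):
    """Maximise y_j * u1 over (u1, v1, v2) in the linearised model."""
    y_j, x1_j, x2_j = units[target]
    # Inequalities row . (u1, v1, v2) <= 0: one per unit, then non-negativity.
    ineqs = [(y, -x1, -x2) for (y, x1, x2) in units.values()]
    ineqs += [(-1, 0, 0), (0, -1, 0), (0, 0, -1)]
    equality = (0, x1_j, x2_j)  # weighted inputs of the target equal 1
    best = None
    # A vertex makes the equality and two of the inequalities tight.
    for a, b in combinations(ineqs, 2):
        point = solve3([equality, a, b], [1, 0, 0])
        if point is None:
            continue
        feasible = all(
            sum(c * p for c, p in zip(row, point)) <= 0 for row in ineqs
        )
        if feasible and (best is None or y_j * point[0] > best):
            best = y_j * point[0]
    return best


if __name__ == "__main__":
    for unit in UNITS:
        print(f"Unit {unit}: efficiency = {float(ccr_efficiency(unit))}")
```

With these data all three units obtain an efficiency of 1.0: every unit has the same ratio of materials to output, and the model is free to place zero weight on labour-hours (${\displaystyle v_{2}=0}$ ).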

## Extensions

A desire to improve upon DEA, by reducing its disadvantages or strengthening its advantages, has been a major cause of many discoveries in the recent literature. Currently, the most often used DEA-based method to obtain unique efficiency rankings is called cross-efficiency. Originally developed by Sexton et al. in 1986,[11] it has found widespread application ever since Doyle and Green's 1994 publication.[12] Cross-efficiency is based on the original DEA results, but implements a secondary objective where each DMU peer-appraises all other DMUs with its own factor weights. The average of these peer-appraisal scores is then used to calculate a DMU's cross-efficiency score. This approach avoids DEA's disadvantages of having multiple efficient DMUs and potentially non-unique weights.[13] Another approach to remedy some of DEA's drawbacks is stochastic DEA,[10] which synthesizes DEA and stochastic frontier analysis (SFA).[14]
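The averaging step of cross-efficiency can be sketched with the data of the example above. The factor weights below are hypothetical: each set is one admissible optimal choice for its unit, picked by hand rather than by the secondary objective, which this sketch does not implement. Averaging over peers only, with self-appraisal excluded, is also an assumption.

```python
from fractions import Fraction

# Example data: unit -> (output, (materials input, labour input)).
DATA = {
    1: (100, (10, 2)),
    2: (80, (8, 4)),
    3: (120, (12, Fraction(3, 2))),
}

# Hypothetical factor weights (u1, (v1, v2)) used by each unit as a rater.
WEIGHTS = {
    1: (Fraction(1, 100), (Fraction(1, 10), 0)),
    2: (Fraction(1, 80), (Fraction(1, 8), 0)),
    3: (Fraction(1, 120), (0, Fraction(2, 3))),
}


def appraisal(rater, rated):
    """Efficiency of `rated` computed with the weights of `rater`."""
    u, v = WEIGHTS[rater]
    y, x = DATA[rated]
    return Fraction(u * y) / sum(vi * xi for vi, xi in zip(v, x))


def cross_efficiency(unit):
    """Average appraisal of `unit` by all other units (self excluded)."""
    peers = [k for k in DATA if k != unit]
    return sum(appraisal(k, unit) for k in peers) / len(peers)


if __name__ == "__main__":
    for unit in DATA:
        print(f"Unit {unit}: cross-efficiency = {float(cross_efficiency(unit))}")
```

Under these assumed weights the scores are 0.8125, 0.625 and 1.0 for units 1, 2 and 3, so the three units are ranked even though each can reach an efficiency of 1 with its own weights.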

## Notes

1. ^ a b c Charnes, A.; Cooper, W. W.; Rhodes, E. (1978). "Measuring the Efficiency of Decision Making Units". European Journal of Operational Research. 2 (6): 429–444.
2. ^ For more details and discussions, see Chapter 8 in Sickles, R., & Zelenyuk, V. (2019). Measurement of Productivity and Efficiency: Theory and Practice. Cambridge: Cambridge University Press. doi:10.1017/9781139565981 https://assets.cambridge.org/97811070/36161/frontmatter/9781107036161_frontmatter.pdf
3. ^ Cooper, William W.; Seiford, Lawrence M.; Tone, Kaoru (2007). Data Envelopment Analysis: A Comprehensive Text with Models, Applications, References and DEA-Solver Software (2 ed.). Springer US. ISBN 978-0-387-45281-4.
4. ^ Cooper, William W.; Seiford, Lawrence M.; Zhu, Joe, eds. (2011). Handbook on Data Envelopment Analysis. International Series in Operations Research & Management Science (2 ed.). Springer US. ISBN 978-1-4419-6150-1.
5. ^ Farrell, M. J. (1957). "The Measurement of Productive Efficiency". Journal of the Royal Statistical Society. Series A (General). 120 (3): 253–290. doi:10.2307/2343100. ISSN 0035-9238. JSTOR 2343100.
6. ^ Charnes, A.; Cooper, W. W.; Rhodes, E. (1978). "Measuring the Efficiency of Decision Making Units". European Journal of Operational Research. 2 (6): 429–444.
7. ^ Fried, Harold O.; Lovell, C. A. Knox; Schmidt, Shelton S. (2008-02-04). The Measurement of Productive Efficiency and Productivity Growth. Oxford University Press. ISBN 978-0-19-804050-7.
8. ^ Cooper, William W.; Seiford, Lawrence; Zhu, Joe (2000). "A unified additive model approach for evaluating inefficiency and congestion with associated measures in DEA". Socio-Economic Planning Sciences. 34 (1): 1–25. doi:10.1016/S0038-0121(99)00010-5.
9. ^ Banker, R. D.; Charnes, A.; Cooper, W. W. (1984-09-01). "Some Models for Estimating Technical and Scale Inefficiencies in Data Envelopment Analysis". Management Science. 30 (9): 1078–1092. doi:10.1287/mnsc.30.9.1078. ISSN 0025-1909.
10. ^ a b Olesen, Ole B.; Petersen, Niels Christian (2016-05-16). "Stochastic Data Envelopment Analysis—A review". European Journal of Operational Research. 251 (1): 2–21. doi:10.1016/j.ejor.2015.07.058. ISSN 0377-2217.
11. ^ a b Sexton, Thomas R. (1986). "Data envelopment analysis: Critique and extension". New Directions for Program Evaluation. 1986 (32): 73–105. doi:10.1002/ev.1441.
12. ^ Doyle, John; Green, Rodney (1994-05-01). "Efficiency and Cross-efficiency in DEA: Derivations, Meanings and Uses". Journal of the Operational Research Society. 45 (5): 567–578. doi:10.1057/jors.1994.84. ISSN 0160-5682. S2CID 122161456.
13. ^ Dyson, R. G.; Allen, R.; Camanho, A. S.; Podinovski, V. V.; Sarrico, C. S.; Shale, E. A. (2001-07-16). "Pitfalls and protocols in DEA". European Journal of Operational Research. Data Envelopment Analysis. 132 (2): 245–259. doi:10.1016/S0377-2217(00)00149-1.
14. ^ Olesen, Ole B.; Petersen, Niels Christian (2016). "Stochastic Data Envelopment Analysis—A review". European Journal of Operational Research. 251 (1): 2–21. doi:10.1016/j.ejor.2015.07.058.