Data envelopment analysis (DEA) is a data-oriented procedure for evaluating the relative performance of a set of homogeneous decision making units (DMUs) with multiple incommensurate inputs and outputs. Performance measurement with tools such as DEA requires constructing an empirical production technology set. In this analysis, DMUs are partitioned into two groups: efficient and inefficient. Inefficient DMUs are projected onto the efficient frontier in such a way that their inputs are reduced and their outputs are increased. In this sense, finding a projection point at the shortest distance is important, and it is one of the most frequently studied subjects in the field of DEA. In this paper, a two-step procedure is proposed to determine the closest projection point on the efficient frontier. The reference point is constructed so that it lies on a strong defining hyperplane of the DEA technology set. As we will show, low computational effort and the guarantee of obtaining a projection point on the strong efficient frontier are the two important advantages of the proposed model. To show the applicability of the proposed approach, a real case on 28 international airlines is given.
1. Introduction
Data envelopment analysis (DEA) is a powerful knowledge-based analytical method for evaluating the relative performance of a set of homogeneous decision making units (DMUs) that consume multiple incommensurate inputs to produce multiple outputs. DEA was initiated in 1978 by Charnes, Cooper and Rhodes and extended by Banker, Charnes and Cooper [1]. In the last three decades, DEA has been applied to a wide range of organizational units. See, for instance, Emrouznejad et al. [2,3].
The main result of DEA as a performance evaluation tool is a partition of the DMUs into two groups: efficient and inefficient. Inefficient DMUs have to reduce their inputs and simultaneously increase their outputs to reach the efficient frontier of the production technology set. That is, they have to give up part of their resources while raising the level of their output production. From a rational standpoint, DMUs prefer to give up as few resources as possible and to raise outputs as little as possible in order to reach the frontier. In this sense, determining an efficient projection point with minimal changes in inputs and outputs is an important and interesting subject in the field of DEA, and it has attracted considerable attention among researchers in the last decade.
In what follows, we briefly review some of the studies on efficiency measurement with the closest reference point:
The first study on finding the distance to a reverse convex subset in a normed vector space is due to Briec and Lemaire [4]. At the same time, Frei and Harker [5] extended the DEA methodology in two substantive ways. First, they developed a method to determine the least-norm projection from an inefficient DMU to the efficient frontier in both the input and output space simultaneously. Second, they introduced the notion of the "observable" frontier and its subsequent projection.
Another work that considered the shortest distance was proposed by González and Álvarez [6]. They studied the problem of efficiency improvement and how to identify appropriate benchmarks for inefficient firms to imitate. They argued that the most relevant benchmark is the closest reference firm on the efficient subset of the isoquant.
Lozano and Villa [7] took a different view of the shortest distance. They advocated determining a sequence of targets, each within an appropriate, short distance of the preceding one. Their approach has two interesting features: (a) the sequence of targets ends on the efficient frontier and (b) the final efficient target is generally closer to the original unit than the one-step projection.
Amirteimoori and Kordrostami [8] proposed a Euclidean distance-based efficiency measure to evaluate the relative efficiency of a set of homogeneous DMUs. They defined an alternative Euclidean distance-based efficiency measure and showed that the reference point on the efficient frontier has the shortest distance to the original point. They applied their approach to a real case on gas companies. Aparicio et al. [9] used the full dimensional efficient facets to propose an alternative Russell output measure of technical efficiency.
Aparicio and Pastor [10] used two simple examples to show a drawback of the approach proposed by Amirteimoori and Kordrostami [8]. They showed that, in some cases, the reference point obtained from the work of Amirteimoori and Kordrostami [8] is not in the technology set. To overcome this drawback, they slightly modified the model introduced by Aparicio et al. [11].
In another study, Aparicio and Pastor [12] showed that the least distance measures based on Hölder norms satisfy neither weak nor strong monotonicity on the strongly efficient frontier. They then provided a solution for output-oriented models that ensures strong monotonicity on the strongly efficient frontier.
An et al. [13] proposed a non-oriented DEA approach based on the enhanced Russell measure [14] for measuring the environmental efficiency of DMUs. Their approach also provides the closest target for the DMU under evaluation, so that it can become efficient with less effort.
Aparicio et al. [15] showed that the existing approaches for determining the least distance without explicitly identifying the frontier structure for graph measures do not work for oriented models. They then proposed a methodology for handling these situations satisfactorily. Razipour et al. [16] applied the closest reference target problem to find the closest targets for bank branches in Iran.
All the above studies show that the determination of the closest efficient targets in the production possibility set is an important subject in the field of DEA that has attracted considerable interest in the recent literature. Despite this, only a few studies analyze the implications of using closest targets for technical inefficiency measurement.
In this paper, a two-step procedure is proposed to determine the closest projection point on the efficient frontier. The reference point is constructed so that it lies on a strong defining hyperplane of the DEA technology set. To do this, we first construct a strong defining hyperplane of the production set corresponding to each inefficient DMU, and then the DMU is projected onto this hyperplane in the direction of the gradient vector.
The remainder of this paper is organized as follows. The required preliminaries are given in the next section. Our proposed approach appears in Section 3. To illustrate the applicability of the approach, a real case on 28 international airlines from Asia-Australia, Europe and North America is given in Section 4. The paper ends with conclusions.
2. Preliminaries
Suppose there are n units DMUj, j=1,...,n, and each DMUj uses m inputs to produce s outputs. Specifically, DMUj uses inputs xj=(x1j,...,xmj)⩾0 to produce the outputs yj=(y1j,...,ysj)⩾0. The technology set T is defined as the set of all feasible input–output combinations as follows:
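Written out in the usual DEA notation, the set takes the form below. This is the standard textbook statement, offered as a sketch rather than as the authors' exact display.

```latex
T = \left\{ (x, y) \in \mathbb{R}^{m}_{\ge 0} \times \mathbb{R}^{s}_{\ge 0} \;:\; x \text{ can produce } y \right\}
```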
By accepting axioms such as constant returns to scale (CRS), convexity, inclusion, free disposability of inputs and outputs, and minimal extrapolation, Tc is constructed as follows (Charnes et al. [17]):
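Under these axioms the CRS technology set has the well-known form below, where the intensity weights λj scale the observed units. The display is the standard statement from the DEA literature and is given as a sketch, with λj assumed to be the same weights that appear as λ∗j later in this section.

```latex
T_c = \Big\{ (x, y) \;:\; x \ge \sum_{j=1}^{n} \lambda_j x_j,\;\;
                          y \le \sum_{j=1}^{n} \lambda_j y_j,\;\;
                          \lambda_j \ge 0,\; j = 1, \dots, n \Big\}
```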
In the same way, if we drop the CRS assumption, the variable returns to scale (VRS) technology set Tv is constructed as follows (Banker et al. [1]):
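In the standard formulation, the only change relative to the CRS set is the convexity constraint on the intensity weights. The display below is a sketch of that textbook form; the symbol λj for the weights is an assumption carried over from the usual notation.

```latex
T_v = \Big\{ (x, y) \;:\; x \ge \sum_{j=1}^{n} \lambda_j x_j,\;\;
                          y \le \sum_{j=1}^{n} \lambda_j y_j,\;\;
                          \sum_{j=1}^{n} \lambda_j = 1,\;\;
                          \lambda_j \ge 0,\; j = 1, \dots, n \Big\}
```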
The input-oriented envelopment form of the CCR model for evaluating the efficiency of DMUo is as follows (Charnes et al. [17]):
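A common way of writing this model, with explicit input slacks s−i and output slacks s+r, is sketched below. The exact objective of the authors' model (4) is not recoverable from the text; the plain objective min θ is an assumption, and some presentations add a non-Archimedean penalty on the slacks.

```latex
\begin{aligned}
\min\ \ & \theta \\
\text{s.t.}\ \ & \sum_{j=1}^{n} \lambda_j x_{ij} + s_i^{-} = \theta\, x_{io}, && i = 1, \dots, m, \\
& \sum_{j=1}^{n} \lambda_j y_{rj} - s_r^{+} = y_{ro}, && r = 1, \dots, s, \\
& \lambda_j \ge 0,\ s_i^{-} \ge 0,\ s_r^{+} \ge 0 && \text{for all } j,\, i,\, r.
\end{aligned}
```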
The first m constraints in model (4) guarantee that the inputs of the new target unit do not exceed the inputs of DMUo, and the next s constraints guarantee that the outputs of the new target unit are not less than the outputs of DMUo. In the above model, DMUo is said to be efficient if and only if, in all optimal solutions, θ∗=1 and all slack variables are equal to zero. If we remove the slack variables from model (4) and rewrite the constraints in inequality form, the dual formulation of model (4) (known as the multiplier form) is expressed as follows:
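The multiplier form of the input-oriented CCR model is standard and is sketched below, with output weights ur and input weights vi matching the symbols used in the efficiency condition that follows. It is given as the textbook statement, not as a transcription of the authors' display.

```latex
\begin{aligned}
\max\ \ & \sum_{r=1}^{s} u_r\, y_{ro} \\
\text{s.t.}\ \ & \sum_{i=1}^{m} v_i\, x_{io} = 1, \\
& \sum_{r=1}^{s} u_r\, y_{rj} - \sum_{i=1}^{m} v_i\, x_{ij} \le 0, && j = 1, \dots, n, \\
& u_r \ge 0,\ v_i \ge 0 && \text{for all } r,\, i.
\end{aligned}
```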
In model (5), DMUo is said to be strong efficient if the optimal value of the objective function is equal to 1 and there exists at least one optimal solution (u∗1,u∗2,…,u∗s,v∗1,v∗2,…,v∗m) with u∗r>0 and v∗i>0 for all i and r.
Definition 2.1: Let H = {(x, y) | u^T(y − ȳ) − v^T(x − x̄) = 0} ∩ Tc be a supporting hyperplane of Tc passing through a specific point (x̄, ȳ) in Tc. Then H is called a strong defining hyperplane if and only if (u, v) > 0.
Definition 2.2 (Pareto-Koopmans efficiency): A DMU is said to be strong efficient if and only if it is not possible to improve any input or output without worsening some other input or output.
For an inefficient DMUo, the reference set consists of all DMUs with λ∗j>0, where λ∗j is an optimal solution to model (4). It is easy to show that all DMUs in the reference set of DMUo are located on a unique supporting surface.
Definition 2.3 (Reference supporting surface): For a DMUo, an efficient surface of Tc is called a reference supporting surface if it contains the reference units of DMUo.
Based on the structure of Tc or Tv, different strategies can be considered to project an inefficient point onto the efficient frontier. The CCR model (4) evaluates radial efficiency and does not take the output shortfalls and input excesses into consideration. However, the existence of nonzero slacks leads to incorrect estimation of efficiencies. The additive model deals directly with input excesses and output shortfalls to project an inefficient DMU onto the strong efficient frontier. The mathematical formulation of the additive model is as follows:
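The additive model under constant returns to scale is commonly written as below. This is a sketch of the standard form; the absence of a convexity constraint is an assumption based on the use of Tc in this section.

```latex
\begin{aligned}
\max\ \ & \sum_{i=1}^{m} s_i^{-} + \sum_{r=1}^{s} s_r^{+} \\
\text{s.t.}\ \ & \sum_{j=1}^{n} \lambda_j x_{ij} + s_i^{-} = x_{io}, && i = 1, \dots, m, \\
& \sum_{j=1}^{n} \lambda_j y_{rj} - s_r^{+} = y_{ro}, && r = 1, \dots, s, \\
& \lambda_j \ge 0,\ s_i^{-} \ge 0,\ s_r^{+} \ge 0 && \text{for all } j,\, i,\, r.
\end{aligned}
```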
In this model, DMUo is said to be efficient if and only if the optimal objective value is equal to zero. The dual formulation of the additive model (6) is as follows:
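Taking the linear programming dual of the standard CRS additive model gives the multiplier problem sketched below. The lower bounds of one on the weights come from the unit objective coefficients of the slacks. The display assumes the additive model has no convexity constraint and is not a transcription of the authors' model (7).

```latex
\begin{aligned}
\min\ \ & \sum_{i=1}^{m} v_i\, x_{io} - \sum_{r=1}^{s} u_r\, y_{ro} \\
\text{s.t.}\ \ & \sum_{i=1}^{m} v_i\, x_{ij} - \sum_{r=1}^{s} u_r\, y_{rj} \ge 0, && j = 1, \dots, n, \\
& u_r \ge 1,\ v_i \ge 1 && \text{for all } r,\, i.
\end{aligned}
```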
The second constraint in (7) guarantees the positivity of the weights. So, if DMUo is found to be inefficient, model (7) projects it onto a strong defining hyperplane. Clearly, the projection point of an inefficient DMUo is not necessarily the closest point on the frontier. It is therefore of interest to find the frontier point closest to the DMUo under evaluation.
Tone [18] augmented the additive model (6) by introducing an efficiency measure that is invariant to the units of the data. The slack-based measure (SBM) of efficiency introduced by Tone [18] is as follows:
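The SBM model is usually stated as the fractional program below, with the same slacks s−i and s+r as in the additive model. This is a sketch of the standard form under constant returns to scale, not the authors' exact display.

```latex
\begin{aligned}
\rho_o^{*} = \min\ \ & \frac{1 - \dfrac{1}{m} \sum_{i=1}^{m} \dfrac{s_i^{-}}{x_{io}}}
                           {1 + \dfrac{1}{s} \sum_{r=1}^{s} \dfrac{s_r^{+}}{y_{ro}}} \\
\text{s.t.}\ \ & \sum_{j=1}^{n} \lambda_j x_{ij} + s_i^{-} = x_{io}, && i = 1, \dots, m, \\
& \sum_{j=1}^{n} \lambda_j y_{rj} - s_r^{+} = y_{ro}, && r = 1, \dots, s, \\
& \lambda_j \ge 0,\ s_i^{-} \ge 0,\ s_r^{+} \ge 0 && \text{for all } j,\, i,\, r.
\end{aligned}
```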
In model (8), we assume that xio>0 for all i. If xio=0, we can remove the term s−i/xio from the summation. Furthermore, it can easily be seen that 0⩽ρo⩽1.
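As a small illustration of how the slack-based ratio behaves, the sketch below evaluates it for a given set of slacks. It does not solve the optimization problem itself. The function name `sbm_ratio` and the numbers are hypothetical, terms with a zero input are dropped as described above, and all outputs are assumed to be strictly positive.

```python
def sbm_ratio(x_o, y_o, s_minus, s_plus):
    """Slack-based ratio for given slacks (illustrative helper, not the SBM program)."""
    m, s = len(x_o), len(y_o)
    # Input terms with x_io = 0 are removed from the summation.
    input_term = sum(sm / x for sm, x in zip(s_minus, x_o) if x > 0)
    # Outputs are assumed strictly positive here.
    output_term = sum(sp / y for sp, y in zip(s_plus, y_o))
    return (1 - input_term / m) / (1 + output_term / s)


# Hypothetical unit with two inputs and one output.
rho = sbm_ratio([4.0, 2.0], [1.0], [1.0, 0.5], [0.25])
rho_no_slack = sbm_ratio([4.0, 2.0], [1.0], [0.0, 0.0], [0.0])
```

With zero slacks the ratio equals one, and any positive slack pushes it below one, which is the behavior the efficiency definition relies on.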
Definition 2.4: DMUo is efficient if and only if ρ∗o=1.
Theorem 2.1: If DMUA dominates DMUB so that xA⩽xB and yA⩾yB then we have ρ∗B⩽ρ∗A.
Proof. See Tone [18].
Aparicio et al. [11] proposed a general two-step procedure to find the minimum distance to the Pareto-efficient frontier. In the first step, efficient and inefficient DMUs are identified by one of the classical radial models. Let E be the set of all extreme efficient units. In the second step, the following multiplier model (MADD) is proposed for the members of E:
Here s+ro and s−io are slack variables and M is a large positive number. In this model, DMUo is said to be efficient if and only if the optimal slacks in (MADD) are all zero. Model (9) is a mixed integer linear programming problem and, if DMUo is inefficient, its efficient projection is the closest point on the frontier.
3. Closest reference point
As stated before, in traditional DEA models the reference point of an inefficient unit is calculated in an input, output, or joint orientation. The reference points obtained in such orientations are in general not the closest reference points, and we are interested in determining a reference point on the efficient frontier at minimal distance. In this section, an alternative shortest distance method is developed using gradient vectors.
Suppose the set of all DMUs is partitioned into four sets E, NE, F and NF, where E is the set of all strong efficient DMUs, NE is the set of inefficient DMUs whose reference points belong to E, F is the set of weak efficient DMUs, and NF is the set of inefficient DMUs whose reference points belong to F. A procedure to partition DMUs into the four sets E, NE, F and NF will be given later.
We are interested in finding a projection point on the efficient frontier at minimal distance. We first employ formulations (4) and (5) to determine the efficient and inefficient DMUs. Strong efficient DMUs belong to E and all remaining DMUs belong to E′. The weak efficient DMUs belong to F, and the inefficient DMUs whose projection points lie on the weak supporting frontier belong to NF. Therefore NE=E′−(F∪NF). Formulation (7) is then used to determine the closest reference supporting surface of DMUo∈E′ as follows:
where (U∗,V∗) is an optimal solution to model (7). Consider the gradient vector (U∗,−V∗). We now solve the following linear programming problem:
Suppose DMUo:(Xo,Yo)∈NE. We move from (Xo,Yo) in the direction (U∗,−V∗), where t is the step size. At optimality, t∗ is the maximum step size, and the projected point is calculated as
Now suppose DMUo:(Xo,Yo)∈F∪NF. In this case, we solve the following quadratic programming formulation:
Here zi and wr are, respectively, the i-th input and the r-th output coordinates of the projected point, and (U∗j,V∗j), j=1,...,n, is an optimal solution to model (7) when DMUj is under evaluation. The first constraint guarantees that the new projection point is located on the efficient frontier, and the remaining n−1 constraints guarantee the feasibility of the new projection point.
Theorem 3.1: The projection point obtained from the above mentioned procedure is the closest reference point on the strong efficient hyperplane.
Proof. Case 1: Let DMUo∈NE. Because the gradient vector is orthogonal to the reference hyperplane, moving from (Xo,Yo) in the gradient direction leads to the orthogonal image of (Xo,Yo) on the reference hyperplane, which is the point of the hyperplane at the shortest distance.
Case 2: Let DMUo∈F∪NF. Because Σ_{i=1}^{m} v∗io zi − Σ_{r=1}^{s} u∗ro wr = 0 is the strong reference supporting surface of DMUo, and the objective function of formulation (12) is the squared distance function, the result follows.
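The geometric idea behind Case 1 can be sketched in a few lines of code. The sketch assumes the reference hyperplane has the CRS form u·y − v·x = 0 and that the orthogonal foot of the unit stays inside the technology set, so the step size has a closed form. The paper's model (10) obtains t∗ by linear programming instead, so this is an illustration of the orthogonal projection, not the authors' procedure. The function name and the data are hypothetical.

```python
def project_along_gradient(x_o, y_o, u, v):
    """Orthogonal projection of (x_o, y_o) onto the hyperplane u.y - v.x = 0.

    Inputs move by -t*v and outputs by +t*u, i.e. along the gradient of u.y - v.x.
    """
    # How far the unit is below the hyperplane, measured by v.x - u.y.
    gap = sum(vi * xi for vi, xi in zip(v, x_o)) - sum(ur * yr for ur, yr in zip(u, y_o))
    norm_sq = sum(vi * vi for vi in v) + sum(ur * ur for ur in u)
    t = gap / norm_sq
    x_star = [xi - t * vi for xi, vi in zip(x_o, v)]
    y_star = [yr + t * ur for yr, ur in zip(y_o, u)]
    return t, x_star, y_star


# Hypothetical unit with two inputs and one output, and weights (u, v) = (5, 1, 1).
t, x_star, y_star = project_along_gradient([6.0, 4.0], [1.0], [5.0], [1.0, 1.0])
# Residual of the hyperplane equation at the projected point (should vanish).
residual = 5.0 * y_star[0] - (x_star[0] + x_star[1])
```

Because the displacement is a multiple of the hyperplane's normal vector, no other point of the hyperplane is closer to the original unit in the Euclidean norm.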
Let (X∗o,Y∗o) be the projection point of (Xo,Yo) obtained with the proposed approach, and let (U∗o,V∗o) be the multiplier of the closest reference supporting surface. Inspired by the efficiency index of Tone [18], we define the inefficiency index ρo as follows:
Proposition 3.1: DMUo∈E if and only if ρo=1.
Proof. Let DMUo∈E. Then x∗o=xo and y∗o=yo, so clearly ρo=1. Conversely, if ρo=1, then
So,
On the other hand, because x∗o⩽xo and y∗o⩾yo, we always have
The above equality holds only if x∗o=xo and y∗o=yo, which means that DMUo∈E.
Note that throughout the above discussion the underlying technology set is the constant returns to scale technology set. The procedure can easily be extended to the variable returns to scale technology set.
At the end of this section, a simple example is used to illustrate the proposed approach. Suppose we have eight DMUs with two inputs and one output. We employ formulations (4) and (5) to determine the efficient and inefficient DMUs. The data set, the efficiency scores and the optimal weights obtained from models (4) and (5) are given in Table 1. As columns 5–11 of Table 1 show, the four DMUs D1, D2, D3 and D4 are strong efficient and hence E={D1,D2,D3,D4} and E′={D5,D6,D7,D8}. As columns 5–8 of Table 1 show, the efficiency scores of D5 and D6 are one but their slack variables are not zero, so F={D5,D6}. Moreover, the projection point of D8 is located on the weak supporting surface and hence NF={D8}. Finally, NE={D7} (NE=E′−F−NF). The Farrell cut of the production possibility set is shown in Figure 1.
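The classification used in this example can be sketched as a small decision rule on the radial score and the total slack of each unit. The rule below assumes that a projection on the weak supporting surface shows up as nonzero slacks at the radial projection, which is one common reading and not necessarily the authors' exact test. The function name, the tolerance and the sample values are hypothetical and are not taken from Table 1.

```python
def classify(theta, slack_sum, tol=1e-9):
    """Assign a unit to E, F, NE or NF from its radial score and total slack."""
    score_is_one = abs(theta - 1.0) <= tol
    slacks_are_zero = slack_sum <= tol
    if score_is_one and slacks_are_zero:
        return "E"   # strong efficient
    if score_is_one:
        return "F"   # weak efficient: score one, nonzero slacks
    if slacks_are_zero:
        return "NE"  # inefficient, projected onto the strong frontier
    return "NF"      # inefficient, projected onto the weak frontier


# Invented (theta, slack_sum) pairs for four hypothetical units.
units = {"A": (1.0, 0.0), "B": (1.0, 0.7), "C": (0.8, 0.0), "D": (0.6, 1.2)}
groups = {name: classify(theta, slack) for name, (theta, slack) in units.items()}
```

Units classified as NE would then be handled by model (10), and those in F or NF by model (12).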
Now, we use model (7) to determine the closest reference supporting surface. The results are given in columns 3–5 of Table 2. Suppose DMUo∈NE={D7}. We have solved model (10) with (U*, V*) = (5, 1, 1). The optimal value t* is shown in column six of Table 2.
The projection point of D7 is obtained as (4.895, 2.895, 1.525). Now, consider DMU8∈F∪NF. We solve model (12) as follows:
The projection point of D8 is obtained as (6.96, 1.11, 1.31).
The projection points of the inefficient DMUs are obtained in a similar manner and the results are shown in Table 3. In each row, three different values are given: the original data, the results of the CCR model and the results of our proposed model. The last column shows the Euclidean distance from the original point to the new projection point, computed with the distance formula d = √(Σ_{i=1}^{m}(zi − xio)² + Σ_{r=1}^{s}(wr − yro)²). As the results show, for all four DMUs the distance obtained from our proposed approach is strictly less than that of the CCR model.
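The distance used for this comparison is straightforward to compute. The following minimal sketch implements the Euclidean distance between an observed unit and a projection point; the function name and the sample numbers are hypothetical and do not come from Table 3.

```python
import math


def projection_distance(x_o, y_o, z, w):
    """Euclidean distance between the unit (x_o, y_o) and the projection (z, w)."""
    input_part = sum((zi - xi) ** 2 for zi, xi in zip(z, x_o))
    output_part = sum((wr - yr) ** 2 for wr, yr in zip(w, y_o))
    return math.sqrt(input_part + output_part)


# Hypothetical unit (6, 4; 1) and projection (3, 4; 5).
d = projection_distance([6.0, 4.0], [1.0], [3.0, 4.0], [5.0])
```

Applying the same function to the projections produced by two different models gives directly comparable distances, provided the inputs and outputs are measured in the same units in both cases.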
4. An empirical example
In this section, we apply the proposed procedure to a real data set consisting of 28 international airlines from Asia-Australia, Europe and North America. The data are taken from Ray [19], and Aparicio et al. [11] have used the same data set in their work. Like Ray [19] and Aparicio et al. [11], we use a constant returns to scale technology. These 28 international airlines use four inputs to generate two outputs. The inputs are as follows:
Number of employees (x1).
Millions of gallons of fuel (x2).
Other inputs (millions of U.S. dollar equivalent), excluding labor and fuel expenses (x3).
Capital, as the sum of maximum takeoff weights of all aircraft flown multiplied by the number of days flown (x4).
Outputs include:
Passenger-kilometers flown (y1).
Freight tonne-kilometers flown (y2).
The input/output data are given in Table 4. We first applied the dual formulation of the additive model in a constant returns to scale environment. The results are given in column 3 of Table 5. The input/output weights are also given in columns 4–9. As the results show, nine airlines are relatively efficient and all weights are strictly positive. The classification of the DMUs is as follows:
We then applied model (10) to the DMUs in NE. The results are listed in the tenth column of Table 5. Consider, for example, NIPPON AIR in NE. The optimal value of t* is 138.3009, so the projection point of NIPPON AIR on the efficient frontier is calculated as (X*, Y*) = (11868.2, 721.7, 1869.7, 5935.7, 35399.3, 752.3). Now consider CONTINENTAL AIR in F∪NF. Applying the proposed approach to this unit gives the projection point (35651.37, 1257.374, 2590.639, 9950.374, 69065.55, 1099.626).
The projection points are calculated by three different approaches: the CCR model, the approach of Aparicio et al. [11], and our proposed approach. The results are listed in Table 6. For each airline, the first row gives the original data, the second row shows the projection point obtained from the CCR model, the third row shows the results of Aparicio et al. [11], denoted by mERG, and the fourth row shows the results of our proposed method, denoted by GDM (Gradient Direction Method). As the results show, both our approach and that of Aparicio et al. [11] provide closer projections than the CCR model.
To compare the results of these three approaches, the distance from each projection point to the original unit has been calculated, and the results are given in the last column of Table 6. Consider the first airline, NIPPON. For the first and third inputs, the reduction achieved by our approach is better than that of Aparicio et al. [11], while for inputs two and four the reduction of Aparicio et al. [11] is better than ours. Overall, however, the distance from the original point to our projection point is 469.903, whereas this distance is 1552.975 for Aparicio et al. [11]. Comparing the results of the two approaches for the other airlines, we found that, except for AIR CANADA (DMU17) and CONTINENTAL AIR (DMU21), the distance between the projection point and the observed airline is smaller in our approach than in the approach proposed by Aparicio et al. [11]. We checked AIR CANADA and CONTINENTAL AIR and found that the projection points provided by Aparicio et al. [11] for these two DMUs are not efficient. It should be pointed out that we do not claim that our approach is better than the approach of Aparicio et al. [11]; rather, we provide another projection point with minimal distance. Moreover, we use the results of Aparicio et al. [11] only to confirm our results.
As column 9 of Table 6 shows, apart from the two airlines discussed above, the distance obtained by the GDM method for the inefficient airlines is lower than that of the mERG method.
5. Conclusion
Benchmarking techniques, especially data envelopment analysis, use rational ideal evaluation to analyze the relative performance of decision making units. In this sense, a specific DMU is compared with a reference point on the efficient frontier of the production possibility set. From a rational standpoint, we may expect the reference point to have the shortest distance to the DMU under consideration. Finding the closest reference point on the efficient frontier for an inefficient DMU is therefore an important subject that has recently attracted considerable attention among researchers. This issue matters because it shows how inefficient DMUs can become efficient with the least effort. In this paper, we proposed an alternative procedure to determine a projection point with minimal changes and shortest distance on the strong efficient frontier. The gradient vectors of the reference supporting surfaces of the production technology set are used to determine the closest reference points. Low computational effort and the guarantee of obtaining a projection point on the strong efficient frontier are the two important advantages of the proposed model. A real case on 28 international airlines from Asia-Australia, Europe and North America is given to show the applicability of the proposed approach.
Conflict of interest
The authors declare no conflict of interest.