1.
Introduction
Optimal design is a common tool in solving industrial problems. In order to determine the production process or evaluate the product quality, experimenters usually take into account the optimal design methods to collect more information. The optimal design is to find a best design with given model and certain optimal criteria in a feasible region. When we deal with practical optimal design problems, the feasible region is usually irregular due to the limitation of some objective conditions, which leads to the constrained experimental regions and the limitations are called constraints. Montgomery et al. [1], Borkowski and Piepel [2] discussed a large number of constrained industrial optimal design examples. It is difficult to find the exact design if the design region is irregular due to constraints.
It is difficult to find the exact design if the design region is constrained. Therefore, it turns out to be a new trend to find the approximate solutions of constrained optimal design problems by computer algorithms.
Scholars have put forward many approaches to solve the constrained optimal design problems. Snee and Marquardt [3] proposed XVERT algorithm to solve constrained mixture design problems. Mitchell [4] applied DETMAX algorithm to construct D-optimal design. Cook and Nachtsheim [5] presented Modified Fedorov Exchange (MFE) algorithm for D-optimal design. According to the iteration processes of the earlier algorithms, the optimality of a design mainly depends on the candidate sets or parameters.
Biomimetic algorithms helped us to find more accurate numerical solutions in the past two decades. For example, Heredia-Langner et al. [6,7] used Genetic Algorithm (GA) to calculate D-optimum, Q-optimum constrained experimental design problems. Yu [8] proposed Cocktail Algorithm to find the D-optimal design of some experimental problems. Liu and Liu [9] constructed constrained uniform mixture design by a special method. Li et al. [10] applied Mixture Design Random Search (MDRS) algorithm to mixture design, they solved D- and A-optimal designs of Scheffé model and Becker model in design region with and without constraints. Duan et al. [11] employed Efficient Computational algorithm for solving optimal continuous experimental designs. In recent years, Particle Swarm Optimization (PSO) algorithm is very common in design problems. Wong et al. and Yang et al. [12,13] applied PSO and Fractional-Order PSO algorithm to obtain optimal designs under D-, A-, I- criteria for linear and Quadratic models.
After an overview of the above biomimetic algorithms, we find that the cocktail algorithm, PSO algorithm and Efficient Computational algorithm have not been applied to the design problem with a constrained region, while the MDRS algorithm and Liu's method have only be used for the mixture constrained design problems. GA can be used for all above situations, but the existing results are not accurate. Thus a stable algorithm with wide application and high precision is imperative.
Differential Evolution (DE) algorithm is well known for its stability and efficiency among the existing biomimetic algorithms. It was proposed by Price [14] first, then Storn et al. and Price et al. [15,16] gave more details and examples of DE algorithm to further shown its advantages. DE algorithm is widely used in many fields, such as Zhao et al. [17] forecasted the per capita annual net income of rural households in China by DE algorithm and Grey model, Rajesha et al. [18] planned least cost generation expansion with DE algorithm, Xu et al. [19] used DE algorithm to find high-dimensional D-optimal designs for logistic models, Deng et al. [20] proposed the EMMSIQDE, a new differential mutation strategy of a difference vector to enhance the search ability and descent ability, etc. DE algorithm gives a steady performance in solving complex problems and above all it has low requirements on feasible region and initial values. DE algorithm is suitable for both optimal design and mixture design problems with irregular feasible region.
One of the difficulties in solving the constrained experimental design problems is that we don't know the exact number of support points before calculation. The general method is setting more initial points than the number of exact support points, then combine the points with short range to a support point. The computation quantity is enormous if we use DE algorithm directly to this problem. Therefore we propose a Multi-stage Differential Evolution (MDE) algorithm for the constrained optimal design problems. MDE algorithm is composed of two stages: calculations of D-optimal support points and the corresponding measures, both stages base on DE algorithm. MDE algorithm can run on any convex feasible region, and sufficient random initial values selected in the feasible region can effectively ensure the convergence. It maintains the stability and effectiveness of DE algorithm while reduces the computational cost.
The remainder of the paper is organized as follows. In section 2, we introduce some basic concepts of optimal design and D-optimal criterion, then we give the process of finding the D-optimal design by MDE algorithm. Section 3 discussed two examples, i.e. an optimal design problem with linear constraints and a mixture design problem with nonlinear constraint. The conclusions and future research directions are given in the last section.
2.
MDE algorithms for D-optimal design
2.1. Optimal design and D-optimal criterion
The optimal design method [21] is widely used and D-optimal is the most popular design criterion. When the model is given, the unknown parameters can be calculated efficiently by the data from optimal design. Before finding an optimal design, we need to select a response model first. Scholars usually choose the general linear models as the response model for convenience. Myers et al. [22] discussed the general linear models in details. Dasgupta [23] studied the compromise designs in heteroscedastic linear model. Chaloner [24] contributed to optimal Bayesian experimental design for linear models. The general linear model is
where η(x) is the expectation of Y, which is the outcome of an experiment with a given point x. β is a p×1vector of unknown parameters. We denote the coordinates of the experiment point x∈Ω as a 1×q vector x=(x1,⋯,xq), f(x)=(f1(x),f2(x),⋯,fp(x))T is a given function vector, here Ω is a set of all possible design points. Generally, ε(x) is a random error with zero mean and constant variance.
In this paper, we focus on the approximate optimal design and assume that Ω is a compact set, therefore the approximate optimal design always exists. That is, there is a probability measure in the design space. Generally, a design ξ can be expressed as a combination of k design points x1,⋯,xk and their corresponding measures r1,⋯,rk,
where r1+⋯+rk=1. Given a sample of n known trials, we can take nri observation value at each xi,i=1,⋯,k to implement the design so thatnr1+⋯+nrk=n, and nri,i=1,2,⋯,k is an integer. This means that an approximate optimal design is a probability measure on Ω. For a given experimental design problem, the set of all possible designs is denoted by Ξ.
The information matrix of approximate design ξ is
A design is called an optimal design with constraints if the experimental point xi,i=1,⋯,k is limited in the constrained design region χ, see Cook et al [25]. Mixture design focuses on the proportions of mixture components, i.e. all the experimental design points are limited to a simplex Sq−1,
Further details of mixture design can be found in [26].
D-optimality is one of the most widely used criteria in experimental design problems. We can get D-optimum by choosing the appropriate design ξ to maximize the determinant value of the Fisher information matrix M(ξ), or minimumdet(M−1(ξ)).
In the problem of finding D-optimal design, the variance function is
ξD is a D-optimal design of the linear model (1) if and only if it satisfies the following general equivalence theorems, see [27].
(2) maxx∈Ωd(x,ξD)=p.
D-efficiency is usually used to compare two k runs D-optimal designs, the D-efficiency in constrained optimal design problem can be expressed as follows
where Mi=∫χf(xi)f′(xi)d(x),i=1,2come from different D-optimal designs of same constrained optimal design problem.
2.2. MDE algorithms for constrained optimal design problems
When we try to find an optimal design with MDE algorithm, we are solving an optimization problem
where objective function is determined by the optimal criterion.
In this paper, we consider that a D-optimal design consists of two parts: D-optimal experimental support points and the correlation measures, then we calculate them respectively by the following two stages.
2.2.1. Stage 1- finding D-optimal design points
At this stage, we assume that all experimental design points have the same measure, then we only focus on finding the experimental design points.
2.2.1.1. Encoding
The calculation process of MDE algorithm begins with the coding of design points. For model (1), in order to convert the design points in design ξ into the form in which we can solving by MDE algorithm, we code it as a row vector X,
where X is called design points chromosome vector, in case of optimal design problems with constraints, s⩾k, xi=(xi1,xi2,⋯,xiq)∈χ,i=1,⋯,s.
It is obvious that the design points in a design ξ correspond to a design point chromosome vector X one by one. Each element in the design point chromosome vector is called a gene, a design points chromosome X contains a total of s×q genes.
2.2.1.2. Selection of initial value
To ensure the implementation of MDE algorithm, we need to establish a design points population of N1 design points chromosome vectors A(t)N1, where
N1 is called design points population size, t is the generation of population, t∈Z,t⩾0, A(t)N1 is the parent population of A(t+1)N1, and A(t+1)N1 is child population A(t)N1, X(t)i is the tth generation of ith design points chromosome vector, the design points population size N1⩾3sq by experience. We call A(0)N1 the initial design points population, which is given randomly on the feasible region rather than chosen especially.
2.2.1.3. Mutation
For tth generation population, we choose three different X(t)k,X(t)l,X(t)m,k,l,m∈[1,N1],k≠l≠m≠i for each X(t)i,i∈[1,N1]. Mutation operation is performed to generate mutated chromosomes V1i for each X(t)i,i=1,⋯,N1 with formula
where scale factor F∈[0,2]. The mutation chromosome V1i is retained for the next cross transformation.
2.2.1.4. Crossover
In this step, we obtain a crossover chromosome U1i by compare each gene of X(t)i(1⩽i⩽N1) with its mutated chromosome V1i. For a given cross factor CR∈[0,1], the crossover approach is to take a random number b∈[0,1] for the jth gene in X(t)i,j=1,⋯,sq, let
After processing all the genes in X(t)i(1⩽i⩽N1), crossover chromosomes U1i,i=1,⋯,N1 will be generated. If there is no special value tendency, CR will be taken as 0.5.
2.2.1.5. Selection and decoding
We decode the crossover chromosomes U1i,i=1,⋯,N1 and X(t)i,i=1,⋯,N1 into uniform measure designs with ξU1i and ξX(t)i respectively, the child chromosome vector X(t+1)i,i=1,⋯,N1 is decided by formula
The selection process ensures that the chromosome with larger objective function value will be chosen into the child population, i.e. the iteration with MDE algorithm holds a correct evolution direction and converges fast.
2.2.1.6. Conditions of termination
The convergence of MDE algorithm ensures there must exist T1, so that after the iteration times t(t⩾T1), X(t)i→X,i=1,⋯,N1. Therefore, X is the design points of D-optimal design under the uniform measure assumption.
The child population is obtained after mutation, crossover and selection of the parent population. For each child population, it needs to be evaluated whether the termination condition has been satisfied. If the termination condition is satisfied, the calculation ends and the optimal solution is obtained. Otherwise, go back to the mutation process and the iteration continues.
After merging nearby points in X, we obtain XD=(xD1,xD2,⋯,xDk),1⩽k⩽s, here xD1,xD2,⋯,xDk are D-optimal support points, k is the number of D-optimal support points.
2.2.2. Stage 2-solving the corresponding measure
In this section, we calculate the corresponding measure r1,r2,⋯,rk to get the D-optimal design.
2.2.2.1. Encoding
Let
where R is called measure chromosome vector, ri is the corresponding measure ofxDi,i=1,⋯,k, k∑i=1ri=1. Each element in the measure chromosome vector is called a gene, a measure chromosome R contains a total of k genes.
2.2.2.2. Selection of initial value
Establish a measure population B(t)N2, where
N2 is called measure population size, t is the generation of population, t∈Z,t⩾0, B(t)N2 is the parent population, and B(t+1)N2 is child population, R(t)i is the tth generation of ith measure chromosome vector, here N2⩾3k by experience. B(0)N2 is the initial measure population, and R(0)i,i=1,⋯,N2 come from Sk−1 randomly.
2.2.2.3. Mutation
For generation t, we choose three different R(t)k,R(t)l,R(t)m,k,l,m∈[1,N2],k≠l≠m≠i for each R(t)i,i∈[1,N2]. We generate mutated chromosomes V2i,i=1,⋯,N2 for each R(t)i,i=1,⋯,N2 with formula
2.2.2.4. Crossover
We obtain a crossover chromosome U2i by following formula
where i=1,⋯,N2,j=1,⋯,k.
2.2.2.5. Selection and decoding
For design ξR, the design points are D-optimal support points XD and the corresponding measures are measure chromosome vector R.
2.2.2.6. Conditions of termination
There must be T2, when t⩾T2,
where RD=(rD1,rD2,⋯,rDk) is the limit of R(t)i, T2 is the generation of measure population. If last formula is satisfied, we obtain the D-optimal design
of constrained optimal design problem, otherwise go back to 2.2.2.3.
3.
Results
We programmed the following examples in Matlab with F=0.9,CR=0.5.
3.1. Experiment of adhesive bonding strength
An industrial experiment of adhesive bonding strength is discussed in detail by Montgomery et al. [1] and Heredia-Langner et al [7]. The experimenters used an adhesive to bond two parts together and then cured the adhesive at an elevated temperature. It is known that the adhesive strength is related to the amount of adhesive and bonding temperature. If the amount of adhesive is too little and the curing temperature is too low, the bonding strength will be poor. If the amount of adhesive is too much and the curing temperature is too high, the parts may be damaged. The amount of adhesive and the curing temperature should be constrained by objective conditions. We should give an approximate D-optimal design to arrange the experiment more effectively. Here, let x1 is the amount of adhesive, x2 is the bonding temperature. After standardization, the experimental area is the classical optimal design area[−1,1]×[−1,1], and x1,x2 should be within a constrained region x1+x2⩽1, x1+x2⩾−0.5. It is assumed that the adhesive strength y is a quadratic function ofx1,x2,
Actually it is an optimal design problem with linear constraints. In order to be consistent with the previous works, we set 12 design points, each with 2 components, i.e. xi=(x1,x2),i=1,⋯,12.
According to stage 1, Table 1 is the result of the D-optimal design points with equal measures.
We calculate 12 runs D-optimal designs of adhesive bonding strength by several typical algorithms and the results are shown in Table 2. It is easy to see that the first stage of MDE algorithm gives the minimum det(M−1) and the maximum D-efficiency in following 4 D-optimal designs, which means MDE algorithm is most accurate in following algorithms.
We combine nearby design points in Table 1, run stage 2 of MDE algorithm, the result is shown in Table 3. There are 8 support points in this D-optimal design. The amount of adhesive and bonding temperature are shown in the column 2 and 3 respectively. Besides extreme vertices, there are two particular support points (0.1223, 0.1037) and (−0.3151, −0.1849) included in the optimal design. The data in column 4 is the related measures of the support points, and the results show that the measures of support points are not equal due to the existence of constraints. The values of variance function at the support points are displayed in the last column.
Figure 1 shows that we obtain an extraordinary accurate D-optimal design with constraints. Figure 1(a) is the contour map of variance function. If the measures at the support points are not considered, the variance function can show as Figure 1(b). Figure 1(c) is the variance function of D-optimal design. We find that the variance function is approximately equal to 6 at the D-optimal design support points and less than 6 at other points in the feasible region. The result is consistent with the general equivalence theorem of D-optimal design.
3.2. Mixture Becker model with nonlinear constraint
Liu et al. and Li et al. [9,10] discussed an example of mixture Becker model with nonlinear constraint. The Becker model is
and the feasible region is
We calculate this D-optimal design example with MDE algorithm, the calculations are shown as Table 4. We solve this problem by the first stage of MDE algorithm, then combine nearby points to get 9 support points, which are listed in the second column. The third column is the measures corresponding to the support points, which are obtained by the second stage of MDE algorithm. The variance function values at each support point are given in the last column, and all the values are less than 7.
Figure 2 shows that the MDE algorithm is valid for nonlinear constrained mixture design problems. Figure 2(a) is the contour map of the variance function. Figure 2(b) is the variance function without considering measures. Figure 2(c) shows the variance function of Mixture D-optimal design of Becker model with nonlinear constraints. The value of variance function is close to 7 at each support points and less than 7 at the rest points in the feasible region, which means the general equivalence theorem holds.
4.
Conclusion
Reliable results can usually be obtained when MDE algorithm is used to solve constrained D-optimal design problems. For a given constrained design problem, the significant advantage of MDE algorithm is that the global optimal design can be obtained in many situations. Another advantage of MDE algorithm is that any set with sufficient initial values can lead to a global optimal design. The accuracy of the solution can be determined by the parameter in the algorithm, and more iterations are needed for high-precision solution.
MDE algorithm has low requirements to models, initial values and feasible regions, so it is suitable in solving constrained optimal design problems. We will try to apply MDE algorithm to the following optimal design problems: A-optimal design problems, E-optimal design problems, optimal design problems with heteroscedasticity, multi-response optimal design problems or the combination of the above problems.
Acknowledgements
The authors would like to appreciate all the reviewers and the editors for their careful reviews and constructive suggestions to help us improve the quality of this paper. This research was funded by National Nature Sciences Foundation of China under Grant (No.12071096), Scientific research project of Guangdong Bureau of Traditional Chinese Medicine (No.20201198) and Quality engineering construction project of Department of Education of Guangdong Province (No.220030301-36).
Conflicts of interest
The authors declare that they have no conflicts of interest.