1. Introduction
In this paper, we consider the following unconstrained optimization problem:
$$\min_{x\in\mathbb{R}^n} f(x), \tag{1.1}$$
where $f:\mathbb{R}^n\to\mathbb{R}$ is a continuously differentiable function whose gradient is denoted by $g(x)=\nabla f(x)$. Generally, methods for (1.1) iterate in the following form:
$$x_{k+1}=x_k+s_k,\quad k=0,1,2,\dots, \tag{1.2}$$
where $s_k=\alpha_k d_k$, $d_k$ is the search direction, and $\alpha_k>0$ is a step length, often obtained by a line search along $d_k$.
There are many methods for solving the unconstrained optimization problem (1.1). Among them, the Newton method has a second-order convergence rate when the Hessian matrix $\nabla^2 f(x_k)$ is positive definite, but it cannot ensure that the direction chosen for the objective function at $x_k$ is a descent direction. In addition, every iteration of the Newton method requires the second-order derivatives of the objective function, namely the Hessian matrix, which is computationally expensive for large-scale problems. To address large-scale problems more effectively, the quasi-Newton method was proposed. The quasi-Newton approach updates the search direction using an approximate Hessian matrix instead of the true Hessian used in the Newton method. This reduces the computation of second-order derivatives, lowers the computational complexity, and can handle non-differentiable functions in some special cases [1].
The quasi-Newton method is recognized as one of the most effective iterative methods for solving large-scale unconstrained optimization problems (for relevant research, see [22–25]). The fundamental concept of the quasi-Newton technique is to substitute an approximate matrix $B_k$ for the Hessian matrix $\nabla^2 f(x_k)$ in the Newton method [1]. In order to satisfy the secant equation, i.e.,
$$B_{k+1}s_k=y_k, \tag{1.3}$$
where $s_k=x_{k+1}-x_k$ and $y_k=g_{k+1}-g_k$, the approximate matrix $B_k$ of the Hessian matrix is constructed using a quasi-Newton update formula. The direction of the quasi-Newton method can be calculated directly by the following formula:
$$d_k=-H_kg_k, \tag{1.4}$$
where $H_k$ is an approximation of $\nabla^2 f(x_k)^{-1}$ and satisfies the secant equation $H_{k+1}y_k=s_k$ for all $k\geq0$.
The numerical performance of the quasi-Newton method is directly affected by the update of $H_k$. There are many updating formulas for $H_k$, such as the DFP formula, the BFGS formula, and the SR1 formula. The methods built from these updating formulas are particularly effective for unconstrained optimization problems and show promising computational performance on practical problems.
The BFGS method, independently proposed by Broyden, Fletcher, Goldfarb, and Shanno [9–13], is one of the most widely used and successful quasi-Newton methods. $B_k$ remains positive definite throughout the computation; hence, the BFGS method preserves convergence and stability for optimization problems. Moreover, the BFGS method has become the preferred option for many applications, including neural networks, image processing, and machine learning (see [26–28]). The BFGS formulas are as follows:
$$B_{k+1}^{BFGS}=B_k-\frac{B_ks_ks_k^TB_k}{s_k^TB_ks_k}+\frac{y_ky_k^T}{s_k^Ty_k}, \tag{1.5}$$
$$H_{k+1}^{BFGS}=H_k-\frac{s_ky_k^TH_k+H_ky_ks_k^T}{s_k^Ty_k}+\left(1+\frac{y_k^TH_ky_k}{s_k^Ty_k}\right)\frac{s_ks_k^T}{s_k^Ty_k}. \tag{1.6}$$
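As an illustration, the following Python sketch implements the $B$-form update (1.5) and checks the secant equation (1.3) numerically. The function and variable names are ours for illustration, not those of the paper's Matlab code.

```python
# A minimal numerical check of the BFGS update (1.5); illustrative names only.
import numpy as np

def bfgs_update(B, s, y):
    """Return B_{k+1} from (1.5); requires the curvature condition s^T y > 0."""
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (s @ y)

rng = np.random.default_rng(0)
B = np.eye(4)
s = rng.standard_normal(4)
y = s + 0.1 * rng.standard_normal(4)      # perturbation keeps s^T y > 0 likely
if s @ y > 0:
    B_next = bfgs_update(B, s, y)
    print(np.allclose(B_next @ s, y))     # secant equation (1.3) holds
```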
The BFGS formula satisfies the secant equation (1.3). Many researchers have modified the secant equation (1.3) to enhance numerical stability and the accuracy of the approximate Hessian matrix. Zhang et al. [2] and Zhang and Xu [3] put forward the modified secant condition
$$B_{k+1}s_k=\bar{y}_k, \tag{1.7}$$
$$\bar{y}_k=y_k+\tau\frac{\eta_k}{s_k^Tw_k}w_k,\qquad \eta_k=\max\{0,\check{\eta}_k\}, \tag{1.8}$$
$$\check{\eta}_k=2\left(f(x_k)-f(x_{k+1})\right)+(g_k+g_{k+1})^Ts_k, \tag{1.9}$$
where $\tau>0$ and $w_k$ is a vector parameter satisfying $s_k^Tw_k\neq0$.
On the other hand, in order to reduce memory storage and improve computational efficiency, memoryless quasi-Newton methods have been proposed for large-scale unconstrained optimization, in which the search direction is determined using only inner products of a few vectors [4]. The memoryless technique has been employed in numerous works; for example, Babaie-Kafaki and Aminifard [5], Aminifard, Babaie-Kafaki and Ghafoori [6], Babaie-Kafaki, Aminifard and Ghafoori [7], and Jourak, Nezhadhosein and Rahpeymaii [8] applied it to design and develop new quasi-Newton methods for large-scale unconstrained optimization, and Narushima, Nakayama, Takemura et al. [29] proposed a memoryless quasi-Newton method on Riemannian manifolds.
Our goal in this research is to present an effective augmented memoryless BFGS method for solving unconstrained optimization problems. The major contributions of this paper are as follows:
(1) To establish an effective optimization algorithm for large-scale unconstrained optimization problems, a new augmented memoryless BFGS updating formula based on a specific modified secant equation is provided.
(2) To enhance numerical performance, the scaling parameter is obtained by minimizing an upper bound on the condition number of the updating matrix.
(3) We prove the global convergence of the algorithm and report numerical experiments that demonstrate its efficiency.
The organization of the article is as follows. In Section 2, we present an augmented memoryless BFGS technique along with the algorithm framework. The descent property and the global convergence of the proposed algorithm are demonstrated in Section 3. In Section 4, the numerical experiment demonstrates the effectiveness of our method in solving large-scale unconstrained optimization problems and nonlinear equations. A conclusion to this work is presented in Section 5.
2. An augmented memoryless BFGS method
Inspired by Zhang and Xu [3] and Aminifard, Babaie-Kafaki, and Ghafoori [6], to improve the precision of the solution, we select $w_k=y_k$ in Eqs (1.7)–(1.9) and obtain a modified secant equation,
$$B_{k+1}s_k=\bar{y}_k, \tag{2.1}$$
where $\bar{y}_k=(1+\tau_k)y_k$, $\tau_k=\tau\dfrac{\eta_k}{s_k^Ty_k}$, and $\eta_k=\max\{0,\check{\eta}_k\}$.
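A small sketch of how these augmented quantities can be computed in practice is given below; it assumes the reconstructed definition of $\check{\eta}_k$ in (1.9), and all names are illustrative.

```python
# Sketch of the augmented quantities in (2.1), assuming (1.9) as reconstructed.
import numpy as np

def augmented_y(f_k, f_k1, g_k, g_k1, s, tau=1.0):
    y = g_k1 - g_k
    eta_check = 2.0 * (f_k - f_k1) + (g_k + g_k1) @ s   # (1.9)
    eta = max(0.0, eta_check)                           # eta_k = max{0, eta_check}
    tau_k = tau * eta / (s @ y)                         # requires s^T y > 0
    return (1.0 + tau_k) * y, tau_k                     # y_bar = (1 + tau_k) y
```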
We modify the BFGS iteration formula (1.5) by a rank-1 correction, which we refer to as the augmented BFGS (ABFGS) formula:
$$B_{k+1}^{ABFGS}=B_{k+1}^{BFGS}+\tau_k\frac{y_ky_k^T}{s_k^Ty_k}. \tag{2.2}$$
Note that $B_{k+1}^{ABFGS}s_k=\bar{y}_k$ because $B_{k+1}^{BFGS}s_k=y_k$. Therefore, (2.2) satisfies the modified secant equation (2.1).
It is well known that when the steplength $\alpha_k$ satisfies the Wolfe line search
$$f(x_k+\alpha_kd_k)\leq f(x_k)+\delta\alpha_kg_k^Td_k, \tag{2.3}$$
$$g(x_k+\alpha_kd_k)^Td_k\geq\rho g_k^Td_k, \tag{2.4}$$
with $0<\delta<\rho<1$, the condition $s_k^Ty_k>0$ is established. By Theorem 5.2.2 of [1], we know (1.5) is positive definite. Therefore, $B_{k+1}^{ABFGS}$ is a positive definite matrix since $\tau_k\geq0$. In other words, positive definiteness is inherited by the ABFGS update formula. As a result, the search directions of ABFGS are descent directions.
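For reference, a minimal check of the Wolfe conditions (2.3) and (2.4) for a trial step might look as follows; this is an illustrative sketch in which `f` and `grad` are assumed callables returning the objective value and gradient.

```python
# Check the Wolfe conditions (2.3)-(2.4) for a trial step alpha; sketch only.
def satisfies_wolfe(f, grad, x, d, alpha, delta=1e-4, rho=0.99):
    fx, gx = f(x), grad(x)
    x_new = x + alpha * d
    armijo = f(x_new) <= fx + delta * alpha * (gx @ d)   # sufficient decrease (2.3)
    curvature = grad(x_new) @ d >= rho * (gx @ d)        # curvature condition (2.4)
    return armijo and curvature
```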
The scaled memoryless BFGS (SMBFGS) method is considered an effective tool for solving large-scale unconstrained optimization problems. To obtain the SMBFGS formula, $B_k$ is replaced by $\frac{1}{\vartheta_k}I$ in (1.5), namely,
$$B_{k+1}^{SMBFGS}=\frac{1}{\vartheta_k}I-\frac{s_ks_k^T}{\vartheta_k s_k^Ts_k}+\frac{y_ky_k^T}{s_k^Ty_k}, \tag{2.5}$$
where $\vartheta_k>0$ is called the scaling parameter. Similarly, by replacing $H_k$ with $\vartheta_kI$ in (1.6), we obtain the SMBFGS iteration formula for the inverse of the Hessian matrix:
$$H_{k+1}^{SMBFGS}=\vartheta_kI-\vartheta_k\frac{s_ky_k^T+y_ks_k^T}{s_k^Ty_k}+\left(1+\vartheta_k\frac{y_k^Ty_k}{s_k^Ty_k}\right)\frac{s_ks_k^T}{s_k^Ty_k}. \tag{2.6}$$
In the formula (2.2), we replace $B_{k+1}^{BFGS}$ with $B_{k+1}^{SMBFGS}$ to make a rank-1 correction,
$$B_{k+1}^{AMBFGS}=B_{k+1}^{SMBFGS}+\tau_k\frac{y_ky_k^T}{s_k^Ty_k}, \tag{2.7}$$
and then, using the Sherman–Morrison formula [1] together with $H_{k+1}^{SMBFGS}y_k=s_k$, we get the augmented memoryless BFGS (AMBFGS) formula, i.e.,
$$H_{k+1}^{AMBFGS}=H_{k+1}^{SMBFGS}-\frac{\tau_k}{1+\tau_k}\frac{s_ks_k^T}{s_k^Ty_k}. \tag{2.8}$$
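Because (2.6) and (2.8) involve only the vectors $s_k$, $y_k$, and the current gradient, the AMBFGS direction can be evaluated with inner products alone, without storing any matrix. The sketch below illustrates this under our reconstruction of (2.6) and (2.8); it is not the authors' Matlab implementation.

```python
# Memoryless evaluation of d = -H^{AMBFGS} g using only inner products,
# based on the reconstructed formulas (2.6) and (2.8); a sketch, not the
# authors' code.
import numpy as np

def ambfgs_direction(g, s, y, theta, tau_k):
    sy = s @ y                     # s^T y > 0 is guaranteed by the Wolfe search
    sg, yg = s @ g, y @ g
    Hg = (theta * g
          - theta * (yg * s + sg * y) / sy
          + (1.0 + theta * (y @ y) / sy) * sg * s / sy    # SMBFGS part (2.6)
          - (tau_k / (1.0 + tau_k)) * sg * s / sy)        # rank-1 correction (2.8)
    return -Hg
```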
For the selection of the parameter $\vartheta_k$, Babaie-Kafaki [4] derived well-structured upper bounds on the condition numbers of scaled memoryless quasi-Newton formulas based on an eigenvalue analysis. It was shown that the scaling parameter suggested by Oren and Spedicato [14] is the unique minimizer of the upper bound given for the condition number of the scaled memoryless BFGS update formula, while the scaling parameter of Oren and Luenberger [15] is the unique minimizer of the corresponding upper bound for the scaled memoryless DFP update formula.
According to [5], the choice of the parameter $\vartheta_k$ is addressed by minimizing the given upper bound for the condition number of the formula (2.8). Since the Wolfe line search (2.4) ensures $s_k^Ty_k>0$, we have $s_k\neq0$ and $y_k\neq0$. So a set of mutually orthogonal unit vectors $\{q_k^{(i)}\}_{i=1}^{n-2}$ exists for which $s_k^Tq_k^{(i)}=y_k^Tq_k^{(i)}=0$, yielding $B_{k+1}^{AMBFGS}q_k^{(i)}=\frac{1}{\vartheta_k}q_k^{(i)}$, for $i=1,2,\dots,n-2$. Therefore, $(n-2)$ eigenvalues of the matrix $H_{k+1}^{AMBFGS}$ ($B_{k+1}^{AMBFGS}$) are equal to $\vartheta_k$ ($\frac{1}{\vartheta_k}$).
Using the relationships between the determinants and traces of the matrices $H_{k+1}^{AMBFGS}$ and $B_{k+1}^{AMBFGS}$, the remaining eigenvalues of $B_{k+1}$ and $H_{k+1}$ can be obtained. The relationships are as follows:
$$\det\left(B_{k+1}^{AMBFGS}\right)=\frac{1+\tau_k}{\vartheta_k^{n-1}}\frac{s_k^Ty_k}{\|s_k\|^2}, \tag{2.9}$$
$$\mathrm{tr}\left(B_{k+1}^{AMBFGS}\right)=\frac{n-1}{\vartheta_k}+(1+\tau_k)\frac{\|y_k\|^2}{s_k^Ty_k}. \tag{2.10}$$
Formulas for the two remaining eigenvalues of $B_{k+1}^{AMBFGS}$, namely $\rho_k^+$ and $\rho_k^-$, can then be obtained as the roots of $\lambda^2-\left(\frac{1}{\vartheta_k}+(1+\tau_k)\frac{\|y_k\|^2}{s_k^Ty_k}\right)\lambda+(1+\tau_k)\frac{s_k^Ty_k}{\vartheta_k\|s_k\|^2}=0$, that is,
$$\rho_k^{\pm}=\frac{1}{2}\left[\frac{1}{\vartheta_k}+(1+\tau_k)\frac{\|y_k\|^2}{s_k^Ty_k}\pm\sqrt{\left(\frac{1}{\vartheta_k}+(1+\tau_k)\frac{\|y_k\|^2}{s_k^Ty_k}\right)^2-\frac{4(1+\tau_k)s_k^Ty_k}{\vartheta_k\|s_k\|^2}}\,\right], \tag{2.11}$$
for which $0<\rho_k^-<\rho_k^+$. Additionally, since $H_{k+1}^{AMBFGS}$ is positive definite, the search direction of AMBFGS is a descent direction. Furthermore, after a few algebraic operations we have $0<\rho_k^-<\frac{1}{\vartheta_k}<\rho_k^+$. Therefore, $\|H_{k+1}^{AMBFGS}\|=\frac{1}{\rho_k^-}$ and $\|(H_{k+1}^{AMBFGS})^{-1}\|=\rho_k^+$. Now, from (2.9) and (2.10) we can write
$$\kappa\left(H_{k+1}^{AMBFGS}\right)=\frac{\rho_k^+}{\rho_k^-}\leq\frac{(\rho_k^++\rho_k^-)^2}{\rho_k^+\rho_k^-}=\left(\frac{1}{\vartheta_k}+(1+\tau_k)\frac{\|y_k\|^2}{s_k^Ty_k}\right)^2\frac{\vartheta_k\|s_k\|^2}{(1+\tau_k)s_k^Ty_k}, \tag{2.12}$$
where $\kappa(\cdot)$ denotes the spectral condition number.
It is generally acknowledged that decreasing the condition number in matrix computations enhances the stability of numerical calculations [16]. From (2.12), we derive the minimizer of the upper bound of $\kappa(H_{k+1}^{AMBFGS})$ as follows:
$$\vartheta_k=\frac{s_k^Ty_k}{(1+\tau_k)\|y_k\|^2}. \tag{2.13}$$
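In code, the choice (2.13) together with the safeguard used in Step 4 of Algorithm 2.1 below could read as follows; this is a sketch in which the threshold name `eps1` corresponds to $\epsilon_1$.

```python
# Scaling parameter (2.13) as reconstructed above, with the Oren-Spedicato
# fallback of Step 4 of Algorithm 2.1; illustrative names only.
def scaling_parameter(s, y, tau_k, eps1=1e-6):
    theta = (s @ y) / ((1.0 + tau_k) * (y @ y))   # minimizer of the bound (2.12)
    if theta < eps1:                              # safeguard (Step 4)
        theta = (s @ y) / (y @ y)
    return theta
```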
Now, we present a new augmented memoryless BFGS algorithm for solving unconstrained optimization problems.
Algorithm 2.1. The AMBFGS algorithm
Step 1. Choose an initial point $x_0$ and parameters $0<\delta<\rho<1$, $\epsilon>0$, $\epsilon_1>0$. Set $H_0=I$, $d_0=-H_0g_0$, and $k:=0$.
Step 2. If $\|g_k\|<\epsilon$, stop; otherwise, go to Step 3.
Step 3. Compute the step size $\alpha_k$ so that it satisfies the Wolfe line search (2.3) and (2.4). Set $x_{k+1}=x_k+\alpha_kd_k$.
Step 4. Compute $\vartheta_k$ by (2.13); if $\vartheta_k<\epsilon_1$, set $\vartheta_k=\frac{s_k^Ty_k}{\|y_k\|^2}$ instead.
Step 5. Update $H_{k+1}$ by (2.8) and compute the search direction $d_{k+1}$ using (1.4).
Step 6. Set k:=k+1 and go to Step 2.
Remark 2.1. Step 4 is designed this way to prevent $\vartheta_k$ from becoming too small during the computation of the algorithm.
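Putting the pieces together, a compact Python sketch of Algorithm 2.1 is given below. It reuses the `ambfgs_direction` helper sketched above and substitutes scipy's strong Wolfe line search for the paper's Matlab implementation, so it should be read as an illustration under our reconstructed formulas (1.9), (2.8), and (2.13), not as the authors' code.

```python
# Illustrative sketch of Algorithm 2.1 (AMBFGS); reuses ambfgs_direction above.
import numpy as np
from scipy.optimize import line_search

def ambfgs(f, grad, x0, tau=1.0, eps=1e-6, eps1=1e-6, max_iter=10000):
    x = x0
    g = grad(x)
    d = -g                                            # Step 1: H_0 = I
    for _ in range(max_iter):
        if np.linalg.norm(g) < eps:                   # Step 2: stopping test
            break
        alpha = line_search(f, grad, x, d, gfk=g,
                            c1=1e-4, c2=0.99)[0]      # Step 3: Wolfe search
        if alpha is None:                             # line search failed
            break
        x_new = x + alpha * d
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        eta = max(0.0, 2.0 * (f(x) - f(x_new)) + (g + g_new) @ s)   # (1.9)
        tau_k = tau * eta / (s @ y)
        theta = (s @ y) / ((1.0 + tau_k) * (y @ y))   # Step 4: (2.13)
        if theta < eps1:
            theta = (s @ y) / (y @ y)                 # Oren-Spedicato fallback
        d = ambfgs_direction(g_new, s, y, theta, tau_k)  # Step 5: (1.4), (2.8)
        x, g = x_new, g_new                           # Step 6
    return x
```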
3. The descent property and global convergence
In this section, we demonstrate that the search direction satisfies the sufficient descent condition and that the algorithm is globally convergent. Our analysis is based on the following assumptions.
Assumption 3.1. For arbitrary $x_0\in\mathbb{R}^n$, the level set $\mathcal{L}=\{x\mid f(x)\leq f(x_0)\}$ is bounded, and in some neighborhood $U$ of $\mathcal{L}$, $\nabla f(x)$ is Lipschitz continuous; that is, there exists a constant $L>0$ such that
$$\|\nabla f(x)-\nabla f(y)\|\leq L\|x-y\|,\quad \forall x,y\in U. \tag{3.1}$$
Based on Assumption 3.1, we know that there exists a positive constant $\Phi$ such that
$$\|\nabla f(x)\|\leq\Phi,\quad \forall x\in\mathcal{L}. \tag{3.2}$$
Since the AMBFGS directions are descent directions, from (2.3) we have $\{x_k\}\subset\mathcal{L}$.
The boundedness of the parameter $\vartheta_k$ in (2.13) is important. We will prove $\vartheta_k\in[m,M]$ in Lemma 3.1.
Lemma 3.1. Suppose $f$ is a uniformly convex function on a neighborhood $U$ of $\mathcal{L}$. Then the scaling parameter $\vartheta_k$ of the AMBFGS algorithm in (2.13) is well defined and bounded.
Proof. Let $f$ be uniformly convex on $U$. Then, by Theorem 1.3.16 of [1], for any $x,y\in\mathcal{L}$, we have
$$f(y)\geq f(x)+\nabla f(x)^T(y-x)+\frac{v}{2}\|y-x\|^2 \tag{3.3}$$
and
$$f(x)\geq f(y)+\nabla f(y)^T(x-y)+\frac{v}{2}\|x-y\|^2, \tag{3.4}$$
where $v>0$ is a constant. Let $y=x_{k+1}$ and $x=x_k$; adding (3.3) and (3.4), we can obtain
$$\left(\nabla f(x_{k+1})-\nabla f(x_k)\right)^T(x_{k+1}-x_k)\geq v\|x_{k+1}-x_k\|^2.$$
In this paper, $s_k=x_{k+1}-x_k$ and $y_k=\nabla f(x_{k+1})-\nabla f(x_k)$, and then
$$s_k^Ty_k\geq v\|s_k\|^2. \tag{3.5}$$
Through (3.1) and (3.5), we can get
$$v\|s_k\|^2\leq s_k^Ty_k\leq\|s_k\|\|y_k\|\leq L\|s_k\|^2. \tag{3.6}$$
By the mean value theorem, (1.9) can be rewritten as
$$\check{\eta}_k=\left(g_k+g_{k+1}-2\nabla f(\theta_k)\right)^Ts_k, \tag{3.7}$$
where $\theta_k=hx_k+(1-h)x_{k+1}$, $h\in(0,1)$. Hence, by (3.1) we have that
$$|\check{\eta}_k|\leq\|g_k-\nabla f(\theta_k)\|\|s_k\|+\|g_{k+1}-\nabla f(\theta_k)\|\|s_k\|\leq L\|s_k\|^2. \tag{3.8}$$
So we can get $0\leq\tau_k\leq\tau\frac{L}{v}$. In this case, using (3.6) and the definition of $\vartheta_k$ in (2.13), we have
$$\vartheta_k=\frac{s_k^Ty_k}{(1+\tau_k)\|y_k\|^2}\geq\frac{v\|s_k\|^2}{\left(1+\tau\frac{L}{v}\right)L^2\|s_k\|^2}=\frac{v^2}{(v+\tau L)L^2}=:m. \tag{3.9}$$
Moreover,
$$\vartheta_k\leq\frac{s_k^Ty_k}{\|y_k\|^2}\leq\frac{\|s_k\|}{\|y_k\|}\leq\frac{1}{v}=:M. \tag{3.10}$$
From (3.9) and (3.10), we have
$$0<m\leq\vartheta_k\leq M, \tag{3.11}$$
which shows the boundedness of $\vartheta_k$.
Remark 3.1. In Step 4 of the algorithm, when $\vartheta_k<\epsilon_1$, we set $\vartheta_k=\frac{s_k^Ty_k}{\|y_k\|^2}$. Then, by applying Eq (3.5), it can be deduced that this $\vartheta_k$ is also bounded.
The next lemma states a useful property of the direction (1.4).
Lemma 3.2. Let $f$ be uniformly convex on the neighborhood $U$ of $\mathcal{L}$. Then the search directions $\{d_k\}$ produced by Algorithm 2.1 satisfy the sufficient descent condition; that is, there exists a constant $\zeta>0$ such that
$$g_k^Td_k\leq-\zeta\|g_k\|^2,\quad \forall k\geq0. \tag{3.12}$$
Proof. By carefully following the proof of Lemma 3.6 of [17], we can show that $\mathrm{tr}(B_{k+1}^{AMBFGS})$ is bounded.
By Lemma 3.1, we get $s_k^Ty_k\geq v\|s_k\|^2$ for all $k\geq0$ and $|\check{\eta}_k|\leq L\|s_k\|^2$. So, considering (2.7), (3.1), (3.2), (3.5) and (3.11), we have
$$\mathrm{tr}\left(B_{k+1}^{AMBFGS}\right)=\frac{n-1}{\vartheta_k}+(1+\tau_k)\frac{\|y_k\|^2}{s_k^Ty_k}\leq\frac{n-1}{m}+\left(1+\tau\frac{L}{v}\right)\frac{L^2}{v}=:C. \tag{3.13}$$
Since the eigenvalues of $B_{k+1}^{AMBFGS}$ are bounded above by its trace, the eigenvalues of $H_{k+1}^{AMBFGS}$ are bounded below by $\frac{1}{C}$. So, from (1.4) and (3.13), we have
$$g_{k+1}^Td_{k+1}=-g_{k+1}^TH_{k+1}^{AMBFGS}g_{k+1}\leq-\frac{1}{C}\|g_{k+1}\|^2,\quad\forall k\geq0, \tag{3.14}$$
and, since $H_0=I$,
$$g_0^Td_0=-\|g_0\|^2. \tag{3.15}$$
Finally, according to (3.14) and (3.15), let
$$\zeta=\min\left\{1,\frac{1}{C}\right\}; \tag{3.16}$$
then (3.12) is established and the proof is complete.
We next consider the convergence of the AMBFGS algorithm. For this purpose, we need the following additional lemma.
Lemma 3.3. Suppose that Assumption 3.1 holds. Consider the iterative form $x_{k+1}=x_k+\alpha_kd_k$, where $\alpha_k$ satisfies the Wolfe conditions (2.3) and (2.4) and $d_k$ satisfies the sufficient descent condition (3.12). If
$$\sum_{k\geq0}\frac{1}{\|d_k\|^2}=\infty, \tag{3.17}$$
then
$$\liminf_{k\to\infty}\|g_k\|=0. \tag{3.18}$$
Proof. Since $d_k$ is sufficiently descent by (3.12) and $\alpha_k$ satisfies the Wolfe conditions (2.3) and (2.4), the Zoutendijk condition [20]
$$\sum_{k\geq0}\frac{(g_k^Td_k)^2}{\|d_k\|^2}<\infty \tag{3.19}$$
holds (see Theorem 3.2 of [18]). To prove this lemma by contradiction, we suppose that there exists a positive constant $\chi$ such that
$$\|g_k\|\geq\chi,\quad\forall k\geq0. \tag{3.20}$$
Inequalities (3.12) and (3.20) yield $g_k^Td_k\leq-\zeta\|g_k\|^2\leq-\zeta\chi^2$, which implies $\frac{\zeta^2\chi^4}{\|d_k\|^2}\leq\frac{(g_k^Td_k)^2}{\|d_k\|^2}$. It follows from the above inequality and (3.17) that
$$\sum_{k\geq0}\frac{(g_k^Td_k)^2}{\|d_k\|^2}\geq\zeta^2\chi^4\sum_{k\geq0}\frac{1}{\|d_k\|^2}=\infty. \tag{3.21}$$
Since this contradicts the Zoutendijk condition (3.19), the proof is complete.
Theorem 3.1. Suppose $f$ is uniformly convex on the neighborhood $U$ of $\mathcal{L}$. Then Algorithm 2.1 converges in the sense that (3.18) holds.
Proof. Lemma 3.2 shows that $d_k\neq0$ for all $k>0$; therefore, considering Lemma 3.3, it suffices to prove that $\|d_{k+1}\|$ is bounded.
From (2.8), (3.1), (3.2), (3.5)–(3.7) and (3.11), we can get
$$\left\|H_{k+1}^{AMBFGS}\right\|\leq\vartheta_k\left(1+\frac{2L}{v}\right)+\left(1+\vartheta_k\frac{L^2}{v}\right)\frac{1}{v}+\frac{1}{v}\leq M\left(1+\frac{2L}{v}\right)+\left(2+\frac{ML^2}{v}\right)\frac{1}{v}=:\Gamma. \tag{3.22}$$
Hence, from (1.4) and (3.2), we get
$$\|d_{k+1}\|=\left\|H_{k+1}^{AMBFGS}g_{k+1}\right\|\leq\left\|H_{k+1}^{AMBFGS}\right\|\|g_{k+1}\|\leq\Gamma\Phi. \tag{3.23}$$
Inequality (3.23) shows that $d_k$ is bounded. Thus, by Lemma 3.3, we can conclude that Algorithm 2.1 is convergent.
4. Numerical experiments
In this section, we compare the computational efficiency of SMABFGS (provided by Aminifard et al. [6]) and AMBFGS-OS (Algorithm 2.1 with $\vartheta_k$ taken as the parameter of [14]) with AMBFGS (provided by Algorithm 2.1). All codes are written in Matlab 2017a and run on a Dell PC with a 2.50 GHz CPU, 16 GB of RAM, and the Windows 11 operating system.
We employ the Wolfe conditions (2.3) and (2.4) with parameters $\rho=0.99$ and $\delta=10^{-4}$ in the implementations. All algorithms terminate when either $k>10000$ or $\|g_k\|<10^{-6}$. For the AMBFGS parameters we select $\tau=1$ and $\epsilon_1=10^{-6}$; for AMBFGS-OS we select $\tau=1$ and $\vartheta_k=\frac{s_k^Ty_k}{\|y_k\|^2}$. Additionally, for SMABFGS we set $p=1$, $\tau=1$, and $C=0.001$ if $\|g_k\|\geq1$, and $p=3$ otherwise.
4.1. Experiment I: test for unconstrained optimization problems
For Experiment I, 71 unconstrained problems are tested and compared, of which problems 1–32 are taken from the CUTE library [21] and the others come from the unconstrained problem collections [30,31]. The number of iterations (Itr), the total number of gradient evaluations (Ng), the CPU time (Tcpu), and the gradient value $g_k$ at the final iteration are reported in Table 1. The performance of these algorithms is described visually in terms of Tcpu, Itr, and Ng in Figures 1–3, respectively, using the performance profiles suggested by Dolan and Moré [19] (see [19] for further information). In general, the top curve indicates the winning approach in the interpretation of a performance profile.
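For readers who wish to reproduce such figures, a performance profile in the sense of Dolan and Moré [19] can be computed from a problems-by-solvers matrix of costs as sketched below; the data layout and names here are illustrative, not taken from the paper's code.

```python
# Sketch of a Dolan-More performance profile from a (problems x solvers)
# cost matrix T (e.g., Tcpu), with np.inf marking failures.
import numpy as np
import matplotlib.pyplot as plt

def performance_profile(T, labels):
    ratios = T / T.min(axis=1, keepdims=True)          # r_{p,s} = t_{p,s}/min_s
    taus = np.unique(ratios[np.isfinite(ratios)])
    for s, name in enumerate(labels):
        rho = [(ratios[:, s] <= t).mean() for t in taus]   # fraction of problems
        plt.step(taus, rho, where='post', label=name)
    plt.xlabel('performance ratio')
    plt.ylabel('fraction of problems solved')
    plt.legend()
    plt.show()
```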
As can be seen from Table 1, the algorithm presented in this paper is clearly effective for solving most of the tested problems and is competitive with the other two algorithms in Itr, Ng, and Tcpu. Figures 1–3 also indicate that the numerical results of the AMBFGS algorithm are better than those of the SMABFGS and AMBFGS-OS algorithms. Compared with these two algorithms, AMBFGS is generally in an advantageous position, has better numerical performance, and can solve large-scale unconstrained optimization problems quickly and effectively.
4.2. Experiment II: test for nonlinear equations
For Experiment II, we compare the performance of SMABFGS and AMBFGS-OS with AMBFGS in solving nonlinear equations, where the following mathematical model is considered:
$$F(x)=0,\quad x\in\mathbb{R}^n. \tag{4.1}$$
Define $F(x)=(F_1(x),F_2(x),\dots,F_n(x))^T$, $x\in\mathbb{R}^n$; the 7 test problems are shown below.
Problem 1. [32] Set $F_i(x)=e^{x_i}-1$, for $i=1,2,\dots,n$ and $x\in\mathbb{R}^n$.
Problem 2. [32] Set
and $x\in\mathbb{R}^n$.
Problem 3. [32] Set
and $x\in\mathbb{R}^n$.
Problem 4. [32] Set $F_i(x)=(e^{x_i})^2+3\sin x_i\cos x_i-1$, for $i=1,2,\dots,n$ and $x\in\mathbb{R}^n$.
Problem 5. [32] Set $F_i(x)=(x_i-1)^2-1.01$, for $i=1,2,\dots,n$ and $x\in\mathbb{R}^n$.
Problem 6. [33] Set
and $x\in\mathbb{R}^n$.
Problem 7. [34] Set
and $x\in\mathbb{R}^n$.
The number of iterations (Itr), the total number of gradient evaluations (Ng), the CPU time (Tcpu), and the value $F_k$ at the final iteration are reported in Tables 2–8. The performance of these algorithms is described visually in terms of Tcpu, Itr, and Ng in Figures 4–6, respectively, using the performance profiles suggested by Dolan and Moré [19]. For each problem, we select 4 to 5 initial points from the following 7 points: $x_1=(1,1,\dots,1)^T$, $x_2=(0.1,0.1,\dots,0.1)^T$, $x_3=(\frac{1}{2},\frac{1}{2^2},\dots,\frac{1}{2^n})^T$, $x_4=(0,\frac{1}{n},\dots,\frac{n-1}{n})^T$, $x_5=(1,\frac{1}{2},\dots,\frac{1}{n})^T$, $x_6=(\frac{1}{n},\frac{2}{n},\dots,1)^T$, $x_7=(1-\frac{1}{n},1-\frac{2}{n},\dots,0)^T$.
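The paper does not spell out how the equations (4.1) are recast for Algorithm 2.1; one common reduction, which we assume here purely for illustration, is to minimize the merit function $f(x)=\frac{1}{2}\|F(x)\|^2$. For Problem 1 this could read:

```python
# Problem 1 cast as an unconstrained problem via the (assumed) merit function
# f(x) = 0.5 * ||F(x)||^2, whose gradient is J(x)^T F(x).
import numpy as np

def F(x):                        # Problem 1: F_i(x) = e^{x_i} - 1
    return np.exp(x) - 1.0

def f(x):                        # merit function (assumed reformulation)
    return 0.5 * np.dot(F(x), F(x))

def grad(x):                     # J(x)^T F(x); here J(x) = diag(e^{x_i})
    return np.exp(x) * F(x)

x1 = np.ones(1000)               # initial point x_1 = (1, 1, ..., 1)^T
# x_star = ambfgs(f, grad, x1)   # solve with the Algorithm 2.1 sketch above
```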
As can be seen from Tables 2–8, the algorithm presented in this paper is clearly effective for solving most of the tested problems and is competitive with the other two algorithms in Itr, Ng, and Tcpu. Figures 4–6 also indicate that the AMBFGS algorithm, compared with the SMABFGS and AMBFGS-OS algorithms, generally occupies an advantageous position; it exhibits better numerical performance and can solve nonlinear equations quickly and effectively.
5. Conclusions
In this research, we presented an augmented memoryless BFGS algorithm based on a modified secant condition, which ensures a descent search direction. We determined the scaling parameter by reducing the upper bound of the condition number using an eigenvalue analysis. Global convergence of our approach has been demonstrated under appropriate assumptions. Finally, numerical results obtained by applying the AMBFGS method to solve large-scale unconstrained optimization problems and nonlinear equations demonstrate its encouraging efficiency, even when compared to the SMABFGS method and AMBFGS-OS method.
Author contributions
Yulin Cheng and Jing Gao: Methodology, Software, Visualization, Writing-original draft. All authors contributed equally to this article. All authors have read and approved the final version of the manuscript for publication.
Use of AI tools declaration
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
Acknowledgments
This work is supported by the Scientific and Technological Developing Scheme of Jilin Province (YDZJ202101ZYTS167, YDZJ202201ZYTS303, 20230508184RC); the project of education department of Jilin Province (JJKH20210030KJ, JJKH20230054KJ); the doctoral research project start-up fund of Beihua University; the graduate innovation project of Beihua University (2023037).
Conflict of interest
All authors declare no conflicts of interest in this paper.