1. Introduction
Consider a nonempty closed convex set Ω⊂Rn and a mapping F:Rn→Rn which is continuous and monotone. In this work, our interest is in finding a point x∈Ω such that

F(x)=0. (1.1)
The system of nonlinear equations (1.1) arises in many practical applications, such as controlling the motion of a planar robot manipulator [1], economic equilibrium problems [2], power flow equations [3] and chemical equilibrium systems [4]. Additionally, in mathematics, subproblems in the generalized proximal algorithms with Bregman distance [5] and monotone variational inequality problems, via the fixed point map or normal map [6,7], can all be transformed into finding a solution of (1.1). Recently, algorithms for solving systems of nonlinear monotone equations have proved to be efficient in signal and image recovery [8].
Due to the existence of nonlinear equations in various fields and their wide range of applications, many methods have been proposed to find their solutions. Some of the early and popular iterative methods are the Newton method, quasi-Newton methods, the Levenberg-Marquardt method and their variants [9,10,11,12,13]. These early methods are characterised by rapid convergence from good initial points. However, they require solving a linear system involving the Jacobian matrix, or an approximation of it, at every iteration. This requirement limits their suitability for solving large-scale systems of nonlinear equations.
On the other hand, conjugate gradient methods, spectral gradient methods and spectral conjugate gradient methods are classes of methods for solving large-scale unconstrained optimization problems. Among the advantages of these methods are their simple implementation and low storage requirements. These advantages have stimulated researchers to extend them to solve nonlinear equations. For example, the projection technique proposed by Solodov and Svaiter [14] motivated many researchers to extend conjugate gradient methods from unconstrained optimization to systems of nonlinear equations. Inspired by the work of Solodov and Svaiter in [14], Wang et al. [15] proposed a projection method for solving systems of nonlinear monotone equations. In their algorithm, a linear system of equations is solved approximately at each iteration to obtain a trial point, and a line search is then performed along the search direction determined by the current point and the trial point with the aim of obtaining a predictor-corrector point. The algorithm then computes its next iterate by projection. They proved the global convergence of the proposed method and presented numerical experiments showing its performance. In [16], Cheng combined the classical PRP method with the hyperplane projection method to solve nonlinear monotone equations. Xiao and Zhou [17] extended the well-known CG_Descent method for unconstrained minimization to solve large-scale convex-constrained nonlinear monotone equations. They achieved this by combining CG_Descent with the projection technique in [14]. They proved the global convergence of this method and reported its numerical performance. Liu and Li [18] modified the work in [17] and proposed another extension of the CG_Descent method to solve nonlinear monotone equations.
Based on the popular Dai-Yuan (DY) conjugate gradient parameter [19], Liu [20] proposed a spectral DY-type method for solving nonlinear monotone equations. The method can be viewed as a combination of the DY conjugate gradient method, the spectral gradient method and the projection technique. They showed that the method converges globally and presented some numerical experiments. Later on, Liu and Li [21] developed another DY-type algorithm, a multivariate spectral method for solving (1.1). In their work, the direction uses a combination of the multivariate spectral gradient method and the DY conjugate gradient parameter. Numerical experiments for the method are reported and the global convergence is also proved. However, the restriction L<1−r with r∈(0,1) is imposed on the Lipschitz constant before proving the global convergence. Motivated by this work, Liu and Feng [22] proposed another spectral conjugate gradient method for solving (1.1). Their work improved the computational performance of the DY conjugate gradient method, and under some assumptions both the global and linear convergence of the method are proved. More recently, many algorithms have been developed for solving (1.1); some of these can be found in [23,24,25,26,27,28,29,30,31,32].
In this paper, motivated by the work of Liu and Feng [22] on the modification of the DY conjugate gradient method, we propose an efficient spectral conjugate gradient algorithm for solving systems of nonlinear monotone equations with convex constraints. The search direction in our proposed approach uses a convex combination of the DY parameter and a modified CD parameter. Specifically, this paper makes the following contributions:
● We propose an efficient spectral conjugate gradient algorithm for solving systems of nonlinear monotone equations with convex constraints by taking a convex combination of the DY parameter and a modified CD parameter.
● This algorithm can be viewed as an extension of the work proposed by Yu et al. in [33].
● The global convergence of the proposed algorithm is proved under some suitable assumptions.
● The proposed algorithm is applied to recover a distorted signal.
The organization of the paper is as follows: In the next section, we introduce the details of the algorithm and some important definitions, and prove global convergence. In the third section, we provide numerical experiments of the proposed algorithm and compare its performance with an existing one. Finally, we apply the algorithm to signal recovery, and give concluding remarks in the last section.
2. Algorithm: Motivation and convergence result
In this section, the projection map, its properties, and some important assumptions needed for the convergence analysis are introduced.
Definition 2.1. Let Ω⊂Rn be a nonempty, closed and convex set. The projection of any x∈Rn onto Ω is

PΩ(x)=argmin{‖x−y‖:y∈Ω}.
The projection map has the following nonexpansiveness property:

‖PΩ(x)−PΩ(y)‖≤‖x−y‖, ∀x,y∈Rn.
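To make the projection map concrete, the following Python sketch (an illustration we add here, not from the paper) projects onto a box-shaped Ω and checks the nonexpansiveness property numerically; the bounds and test points are arbitrary choices.

```python
import numpy as np

def project_box(x, lower, upper):
    """Projection onto the box Omega = [lower, upper]^n, a simple closed
    convex set: clip each coordinate to the bounds."""
    return np.clip(x, lower, upper)

x = np.array([2.0, -3.0, 0.5])
y = np.array([0.1, 1.0, 4.0])
px, py = project_box(x, -1.0, 1.0), project_box(y, -1.0, 1.0)

# nonexpansiveness: ||P(x) - P(y)|| <= ||x - y||
assert np.linalg.norm(px - py) <= np.linalg.norm(x - y)
```

For a general closed convex Ω the projection is the unique nearest point in Ω, which may itself require an optimization subroutine; boxes, balls and half-spaces admit closed forms like the one above.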
Spectral and conjugate gradient algorithms generate a sequence of iterates using the following formula:

xk+1=xk+αkdk,

where αk is called the step length, and the direction dk is defined in the spectral gradient and conjugate gradient methods, respectively, as

dk=−νkFk

and

dk=−Fk+βkdk−1, d0=−F0. (2.4)

Different spectral and conjugate gradient directions are developed using different choices of the parameters νk and βk, respectively. To ensure the global convergence of these methods, the direction dk needs to satisfy the sufficient descent property, that is,

FTkdk≤−τ‖Fk‖2, (2.5)

where τ>0.
One of the well-known parameters proposed in this direction is the DY conjugate gradient parameter in [19], defined as

βDYk=‖Fk‖2/(dTk−1Yk−1), (2.6)

such that the direction in (2.4) becomes

dk=−Fk+(‖Fk‖2/(dTk−1Yk−1))dk−1, (2.7)
where Yk−1=Fk−Fk−1. Unfortunately, (2.7) does not satisfy the descent property (2.5). As a result, Liu and Feng [22] modified the work in [19] and proposed a spectral conjugate gradient algorithm for solving (1.1). The direction in [22] satisfies (2.5) and thus the global convergence is proved successfully under some appropriate assumptions.
Motivated by the work in [22], and due to the limited number of DY-type conjugate gradient methods in the literature, we propose a new spectral DY-type conjugate gradient algorithm for solving (1.1). Interestingly, the proposed direction satisfies the sufficient descent property (2.5) and uses a convex combination of the DY parameter and a modified CD parameter as follows:
where
βDYk is defined as in (2.6), ˜βk=‖Fk‖2/max{−FTkdk−1,γ‖dk−1‖}, and θk∈(0,1). Substituting the values of βDYk and ˜βk into (2.8), we get
Throughout this work, the following assumptions are made.
● (Assumption1) The mapping F is monotone, that is, (F(x)−F(y))T(x−y)≥0 for all x,y∈Rn.
● (Assumption2) The mapping F is Lipschitz continuous, that is, there exists L>0 such that ‖F(x)−F(y)‖≤L‖x−y‖ for all x,y∈Rn.
● (Assumption3) The solution set of (1.1), denoted by Ω, is nonempty.
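As a quick numerical sanity check of (Assumption1) and (Assumption2) — an illustration we add here, with F(x)=x+sin(x) as an arbitrary example mapping — monotonicity and Lipschitz continuity with L=2 can be verified on random points:

```python
import numpy as np

def F(x):
    # example mapping: the Jacobian I + diag(cos x) is positive semi-definite,
    # so F is monotone; ||I + diag(cos x)|| <= 2 gives Lipschitz constant L = 2
    return x + np.sin(x)

rng = np.random.default_rng(0)
for _ in range(100):
    u, v = rng.standard_normal(4), rng.standard_normal(4)
    # (Assumption1): (F(u) - F(v))^T (u - v) >= 0
    assert (F(u) - F(v)) @ (u - v) >= -1e-12
    # (Assumption2): ||F(u) - F(v)|| <= L ||u - v|| with L = 2
    assert np.linalg.norm(F(u) - F(v)) <= 2 * np.linalg.norm(u - v) + 1e-12
```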
We state the steps of our proposed algorithm as follows:
Algorithm 2.2. Step 0. Choose an initial point x0∈Ω, θk∈(0,1), κ∈(0,1], β∈(0,1), μ>1, σ,γ>0, δ∈(0,2), and Tol>0. Set k:=0.
Step 1. If ‖Fk‖≤Tol, stop, otherwise proceed with Step 2.
Step 2. Compute dk=−Fk when k=0, and for k≥1
Step 3. Compute Λk=max{κβi:i=0,1,2,…} such that
Step 4. Set zk=xk+Λkdk. If ‖F(zk)‖=0, stop. Else compute

xk+1=PΩ(xk−δλkF(zk)), where λk=F(zk)T(xk−zk)/‖F(zk)‖2.
Step 5. Let k=k+1 and go to Step 1.
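The steps above follow the standard derivative-free projection framework of Solodov and Svaiter [14]. The Python sketch below (illustrative, not the authors' code) implements that generic framework with Ω=Rn: the plain residual direction d = −F(x) is a placeholder for the paper's spectral DY direction (2.11), and the classical line-search condition stands in for (2.12), whose full form includes an additional min{1, ‖F‖^(1/c)} factor.

```python
import numpy as np

def project(x):
    # projection onto Omega; identity here, i.e. Omega = R^n (an assumption)
    return x

def solve_monotone(F, x0, sigma=0.02, kappa=1.0, beta=0.7,
                   delta=1.1, tol=1e-6, max_iter=1000):
    """Generic derivative-free projection framework in the spirit of
    Algorithm 2.2; the spectral DY direction (2.11) would replace the
    plain residual direction used below for illustration."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) <= tol:          # Step 1: stopping test
            return x
        d = -Fx                                # Step 2: placeholder direction
        # Step 3: backtracking line search Lambda_k = kappa * beta^i with the
        # classical condition -F(x + Lam*d)^T d >= sigma * Lam * ||d||^2
        lam = kappa
        while -F(x + lam * d) @ d < sigma * lam * (d @ d):
            lam *= beta
        z = x + lam * d                        # Step 4: trial point
        Fz = F(z)
        if np.linalg.norm(Fz) == 0:
            return z
        # hyperplane projection step with relaxation delta in (0, 2)
        x = project(x - delta * (Fz @ (x - z)) / (Fz @ Fz) * Fz)
    return x

# example: monotone F(x) = x + sin(x), whose unique zero is x* = 0
F = lambda v: v + np.sin(v)
sol = solve_monotone(F, np.array([1.0, -0.5, 2.0]))
assert np.linalg.norm(F(sol)) <= 1e-6
```

The projection step moves the iterate onto the halfspace separating xk from the solution set, which is what drives the ‖xk−˜x‖ monotone-decrease argument used in the convergence analysis.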
Remark 2.3. It is worth noting that when YTk−1dk−1≤μ‖Fk‖‖dk−1‖, our proposed search direction reduces to that proposed by Yu et al. in [33]. Thus, as a contribution, this work can be viewed as an extension of the work of Yu et al. [33].
We now state and prove the following lemmas and theorem for the convergence.
Lemma 2.4. The parameter νk given by (2.9) is well defined, and ∀k≥0,dk satisfies
Proof. Since F is monotone, we have
which yields
Again, by Lipschitz continuity, we have
From (2.14) and (2.15) we get
Now, to show (2.13): for k=0, FTkdk=−‖Fk‖2, thus τ=1 and the result holds. When k≥1, if YTk−1dk−1≤μ‖Fk‖‖dk−1‖, then from (2.11),
using (2.16), we have
and (2.13) holds by taking τ=1/(L+r).
On the other hand, if YTk−1dk−1>μ‖Fk‖‖dk−1‖, multiplying (2.11) by FTk we obtain
The last inequality follows from (2.16). By letting τ=(1/(L+r)−1/μ)>0, the required result holds.
Lemma 2.5. Let {dk} be given by (2.11). Then there exist constants p1>0, m1>0 and m2>0 for which
Proof. If YTk−1dk−1≤μ‖Fk‖‖dk−1‖,
Using (2.16), we have
where p1=1/r. However, if YTk−1dk−1>μ‖Fk‖‖dk−1‖, then
where m1=p1+1/μ, p1=1/r and m2=1/γ.
Lemma 2.6. Suppose (Assumption1)–(Assumption3) hold. Then the sequences {xk} and {zk} generated by Algorithm 2.2 are bounded. Also,
and
Proof. Let ˜x be a solution of problem (1.1). Using the monotonicity from (Assumption1), we have
Using the above Eq (2.22) and the definition of xk+1, we obtain
This shows that ‖xk−˜x‖≤‖x0−˜x‖ for all k; hence {xk} is bounded and limk→∞‖xk−˜x‖ exists. Since {xk} is bounded and F is Lipschitz continuous,
Using this and (2.18), we have
where n1=p1p and n2=m1p+m2p2. Taking M=max{n1,n2}, we have that the direction dk is bounded. That is,
To prove that {zk} is bounded, we know that
and since dk has been shown to be bounded, it follows that {zk} is also bounded. Again, by Lipschitz continuity,
Now, from our line search (2.12), let min{1,‖F(xk+κβidk)‖1/c}=‖F(xk+κβidk)‖1/c. Squaring both sides of (2.12), we get
Also, since 0<δ<2, then from (2.23) we have
This together with (2.28) gives
Since limk→∞‖xk−˜x‖ exists and (2.27) holds, taking the limit as k→∞ on both sides of (2.30), we have
but ‖F(zk)‖≠0, therefore,
Note that if min{1,‖F(xk+Λkdk)‖1/c}=1, then (2.30) becomes
and thus (2.32) holds.
Using this and the definition of zk, we obtain
From the definition of projection operation, we get
Lemma 2.7. Suppose (Assumption2) holds, and the sequences {xk} and {zk} are generated by Algorithm 2.2. Then
Proof. From (2.12), if Λk≠κ, then ˆΛk=Λk/β does not satisfy (2.12), that is,
Now let min{1,‖F(xk+ˆΛkdk)‖1/c}=‖F(xk+ˆΛkdk)‖1/c. Using (2.13) and (Assumption2), we have
Therefore,
Substituting ˆΛk=Λk/β and solving for Λk, we get
On the other hand, if min{1,‖F(xk+ˆΛkdk)‖1/c}=1, then (2.38) reduces to
Combining (2.38) and (2.39), we get
Theorem 2.8. Suppose that (Assumption1)–(Assumption3) hold and let the sequence {xk} be generated by Algorithm 2.2. Then
Proof. We prove by contradiction. Suppose (2.41) is not satisfied; then there exists α>0 such that ∀k≥0,
From Eqs (2.13) and (2.42), we obtain ∀k≥0,
Multiplying both sides of (2.36) by ‖dk‖, and using (2.26) and (2.42), we get
Taking limit as k→∞ on both sides, we obtain
which contradicts Eq (2.32). Therefore,
3. Numerical experiments
In this section, we present numerical experiments in order to depict the advantages and performance of our proposed algorithm (MDY) in comparison with the projected Dai-Yuan derivative-free algorithm (PDY) of Liu and Feng [22]. All codes were written in Matlab R2019b and run on a PC with a Core i3-4005U processor, 4 GB of RAM and a 1.70 GHz CPU.
In MDY, the parameters are chosen as follows: r=0.001, θk=1/(k+1), μ=1.9, γ=0.9, σ=0.02, c=2, κ=1, β=0.70 and δ=1.1. The parameters in the PDY algorithm are maintained exactly as reported in [22]. Based on this setting, we consider nine test problems with eight different initial points, tested on five different dimensions: n=1000, n=5000, n=10000, n=50000 and n=100000. We used ‖Fk‖<10−6 as the stopping criterion and denoted failure by "-" whenever the number of iterations exceeded 1000 without the stopping criterion being satisfied. The test problems are listed below, where the function F is taken as F(x)=(f1(x),f2(x),…,fn(x))T.
Problem 1 [26].
Problem 2 [34] Modified Logarithmic Function.
Problem 3 [35] Nonsmooth Function.
Problem 4 [36]
Problem 5 [37] Strictly Convex Function.
Problem 6
Problem 7 [38] Tridiagonal Exponential Function
Problem 8 [22]
Problem 9 [26]
The results of our experiments are shown in Tables 1–9 based on the number of iterations (ITER), number of function evaluations (FVAL), CPU time (TIME), and the norm of the function (NORM) when the solution was obtained. Looking at the reported results, it can be observed that the proposed MDY algorithm outperformed the PDY algorithm on most of the problems by having the least ITER, FVAL and TIME.
In addition, to further visualize the comparison of the MDY and PDY algorithms graphically, we adopt the well-known Dolan and Moré performance profile [39], as reported in Figures 1–3. From Figures 1 and 2, it can be seen that the MDY algorithm outperformed the PDY algorithm significantly. The MDY algorithm solves about 93% of the problems considered with the least ITER and 99% of the problems with the least FVAL, as opposed to the PDY method with about 28% and 3% of the problems for ITER and FVAL, respectively. Moreover, in terms of TIME, Figure 3 indicates that the MDY algorithm still performs much better than the PDY algorithm, solving around 88% of the problems in less time. From these figures, we conclude that the MDY algorithm has a clear numerical advantage over the existing PDY algorithm.
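For readers who wish to reproduce this style of comparison, a performance profile in the sense of Dolan and Moré can be computed as follows (a generic sketch we add here; the cost matrix is made-up data, with np.inf marking failures):

```python
import numpy as np

def performance_profile(T, taus):
    """T: (num_problems, num_solvers) cost matrix (ITER, FVAL or TIME;
    np.inf for failures). For each problem p the ratio r[p, s] is
    t[p, s] / min_s t[p, s]; the profile value rho_s(tau) is the fraction
    of problems solver s solves within a factor tau of the best solver."""
    ratios = T / T.min(axis=1, keepdims=True)
    return np.array([[np.mean(ratios[:, s] <= tau)
                      for s in range(T.shape[1])] for tau in taus])

# three problems, two solvers; the second solver fails on problem 3
T = np.array([[10.0, 12.0],
              [ 5.0,  4.0],
              [ 8.0, np.inf]])
rho = performance_profile(T, taus=[1.0, 2.0])
```

Plotting rho against tau for each solver yields curves like those in Figures 1–3: the higher the curve, the better the solver over the test set.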
4. Application in compressive sensing
The problem of sparse signal reconstruction has attracted the attention of many researchers in the fields of signal processing, machine learning and computer vision. This problem involves minimizing an objective function composed of a quadratic ℓ2 error term and a sparse ℓ1 regularization term as follows:

minx∈Rn (1/2)‖Ax−h‖22+ρ‖x‖1, (4.1)
where x∈Rn, h∈Rm, A∈Rm×n (m≪n) is a linear operator, ρ≥0, ‖x‖2 is the Euclidean norm of x and ‖x‖1=∑ni=1|xi| is the ℓ1-norm of x.
Many methods have been developed for solving (4.1), some of which can be found in [40,41,42,43,44,45]. Figueiredo et al. [43] considered reformulating (4.1) into a quadratic problem by splitting x∈Rn into two parts as

x=t−y, t≥0, y≥0,
where ti=(xi)+, yi=(−xi)+ for all i=1,2,...,n, and (·)+=max{0,·}. Also, we have ‖x‖1=eTnt+eTny, where en=(1,1,...,1)T∈Rn. From this reformulation, we can write (4.1) as
From [43], Eq (4.2) can be written as
where z=(t;y), c=ωe2n+(−a;a), a=ATh, and E=[ATA, −ATA; −ATA, ATA].
It is not difficult to see that E is positive semi-definite, showing that problem (4.3) is a convex quadratic programming problem.
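Assembling E and c from the definitions above and checking positive semi-definiteness numerically can be sketched as follows (the random problem sizes are arbitrary; symbols follow the text, with ω denoting the ℓ1 weight):

```python
import numpy as np

def build_qp(A, h, omega):
    """Build the data (E, c) of the quadratic program (4.3):
    a = A^T h, E = [A^T A, -A^T A; -A^T A, A^T A], c = omega*e_{2n} + (-a; a)."""
    AtA = A.T @ A
    a = A.T @ h
    E = np.block([[AtA, -AtA], [-AtA, AtA]])
    c = omega * np.ones(2 * A.shape[1]) + np.concatenate([-a, a])
    return E, c

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 8))           # small random m x n instance
E, c = build_qp(A, rng.standard_normal(5), 0.1)

# E is positive semi-definite: z^T E z = ||A(t - y)||^2 >= 0, so the
# smallest eigenvalue is nonnegative up to roundoff
assert np.min(np.linalg.eigvalsh(E)) >= -1e-8
```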
Xiao et al. [17] further translated (4.3) into a linear variational inequality problem or, equivalently, a linear complementarity problem, and the variable z solves the linear complementarity problem provided that it solves the nonlinear equation:
where A is a vector-valued function. In [8,46], it is proved that the function A(z) is continuous and monotone. Thus, problem (4.1) is equivalent to problem (1.1). Therefore, the algorithm we proposed in this work to solve (1.1) can efficiently solve (4.1) as well.
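The text does not display the definition of the map; a common choice for this linear complementarity reformulation (an assumption based on the setting of [8,17]) is the componentwise minimum residual, which vanishes exactly at solutions of the LCP:

```python
import numpy as np

def lcp_residual(E, c):
    """Residual map F(z) = min(z, Ez + c) taken componentwise (assumed form);
    z solves the linear complementarity problem iff F(z) = 0."""
    return lambda z: np.minimum(z, E @ z + c)

# small check of monotonicity of F on a positive semi-definite E
E = np.array([[1.0, 0.0], [0.0, 1.0]])
c = np.array([-1.0, 1.0])
F = lcp_residual(E, c)
u, v = np.array([2.0, 0.0]), np.array([0.0, 0.0])
assert (F(u) - F(v)) @ (u - v) >= 0.0
```

With E positive semi-definite this residual map is continuous and monotone, which is exactly the property invoked in [8,46] to place (4.1) within the framework of problem (1.1).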
As an application, we consider applying our proposed algorithm to reconstructing a sparse signal of length n from k observations, using the mean squared error (MSE) as a metric for assessing reconstruction quality. The MSE is defined as

MSE=(1/n)‖s−˜s‖2,
where s represents the original signal and ˜s the restored signal. We choose n=2^12 and k=2^10 to be the size of the signal, and the original signal contains 2^7 randomly placed nonzero elements. The measurement y contains noise, y=As+ω, where ω is Gaussian noise distributed as N(0,10^−4) and A is the Gaussian matrix generated by the command randn(m,n) in Matlab.
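The experimental setup described above can be sketched in Python as follows (the random draws are illustrative; the text does not say whether the randn matrix is rescaled, so we use unscaled entries as stated):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, spikes = 2**12, 2**10, 2**7        # n = 4096, m = 1024, 128 spikes

s = np.zeros(n)                           # original sparse signal
s[rng.choice(n, spikes, replace=False)] = rng.standard_normal(spikes)

A = rng.standard_normal((m, n))           # Gaussian matrix, as with randn(m, n)
omega = np.sqrt(1e-4) * rng.standard_normal(m)   # noise ~ N(0, 1e-4)
y = A @ s + omega                         # noisy measurements

def mse(original, restored):
    # MSE = (1/n) ||s - s_tilde||^2
    return np.linalg.norm(original - restored) ** 2 / len(original)

x0 = A.T @ y                              # initial point used in the experiments
```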
We compared the performance of our proposed algorithm (MDY) with the SGCS algorithm proposed in [8]. The parameters in SGCS are maintained as they are in [8], while in MDY we choose r=0.001, θk=1/(k+1)^2, μ=1.1, γ=0.1, σ=0.01, Λ=1, and β=0.65. Each code is run with the same initial point and a continuation technique on the parameter μ. We focused only on the convergence behaviour of each method in obtaining a solution of similar accuracy. We initialized the experiments with x0=ATy and terminated the iterations when the relative change in the objective function satisfies
The performance of both MDY and SGCS is shown in Figures 4 and 5. Figure 4 shows that both the MDY and the SGCS methods recover the signal. However, looking at the reported metrics of each method, it can be observed that the MDY method is more efficient since it attains a smaller MSE and requires fewer iterations and less CPU time. To show the performance of both methods graphically, we plotted four graphs (see Figure 5) demonstrating the convergence behaviour of the MDY and SGCS methods in terms of MSE, objective function value, number of iterations and CPU time. From Figure 5, it can be observed that the proposed MDY method has a faster convergence rate than the SGCS method. This shows that the MDY method can be a good alternative solver for signal recovery problems.
5. Conclusions
In this work, a spectral conjugate gradient algorithm is proposed. The search direction uses a convex combination of the well-known DY conjugate gradient parameter and a modified conjugate descent parameter. The search direction satisfies the sufficient descent property, and the global convergence of the proposed algorithm is proved under some assumptions. Numerical experiments are reported to show the efficiency of the algorithm in comparison with the PDY algorithm proposed in [22]. In addition, an application of the proposed algorithm to signal recovery is presented and the result is compared with the SGCS algorithm proposed in [8]. Based on the results obtained, the proposed algorithm performs better than the PDY and SGCS algorithms in the numerical and signal recovery experiments, respectively. Future work includes applying the proposed algorithm to 2D robotic motion control as presented in [47].
Acknowledgments
The authors acknowledge the financial support provided by the Center of Excellence in Theoretical and Computational Science (TaCS-CoE), KMUTT, and the Computational and Applied Science for Smart Innovation Research Cluster (CLASSIC), Faculty of Science, KMUTT. This research was funded by the Thailand Science Research and Innovation Fund, and King Mongkut's University of Technology North Bangkok with Contract no. KMUTNB-BasicR-64-22. The first author was supported by the Petchra Pra Jom Klao Ph.D. Research Scholarship from King Mongkut's University of Technology Thonburi (KMUTT), Thailand (Grant No. 19/2562). Also, Aliyu Muhammed Awwal would like to thank the Postdoctoral Fellowship from King Mongkut's University of Technology Thonburi (KMUTT), Thailand. Moreover, this research project is supported by Thailand Science Research and Innovation (TSRI) Basic Research Fund: Fiscal year 2021 under project number 64A306000005.
Conflict of interest
The authors declare that they have no conflict of interest.