1. Introduction
This paper aims to find solutions of the following constrained nonlinear monotone equations:

F(x) = 0,  x ∈ Ω,   (1.1)
in which F: R^n → R^n is monotone and continuously differentiable, and Ω ⊆ R^n is a nonempty closed convex set. The monotonicity of F means that

(F(x) − F(y))^T (x − y) ≥ 0,  ∀ x, y ∈ R^n.
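As a simple illustration (our own example, not taken from the references), any affine mapping with a positive semidefinite matrix is monotone:

```latex
% Illustrative example: if F(x) = Mx + q with M positive semidefinite, then
% for all x, y in R^n,
\[
  (F(x)-F(y))^{\top}(x-y) \;=\; (x-y)^{\top} M (x-y) \;\ge\; 0,
\]
% so F is monotone. The identity map (M = I, q = 0) is the simplest case.
```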
Nonlinear monotone equations arise from a variety of problems and are widely applied across disciplines, for example automatic control systems [1], compressive sensing [2,3,4], and optimal power flow [5].
There are numerous iterative algorithms for solving (1.1), for instance the Newton method [6], the quasi-Newton method [7], the Gauss-Newton method [8], and the BFGS method [9]. These algorithms exhibit good convergence properties from suitable initial guesses. Nevertheless, they are unsuitable for handling large-scale nonlinear equations, since the Jacobian matrix or an approximation of it must be computed at every iteration.
Recently, optimization methods that build on the projection and proximal point method of Solodov and Svaiter [10] have been extended to solve (1.1). These include conjugate gradient methods, which were originally applied to large-scale unconstrained optimization problems because of their low memory requirements and simple structure. For example, Li and Li [11] described a class of derivative-free approaches for systems of monotone equations. Cheng [12] formulated a PRP-type method using a line search strategy, and Awwal et al. [13] proposed a DFP-type derivative-free method. Ahookhosh et al. [14] developed two new derivative-free approaches for systems of nonlinear monotone equations. Based on the RMIL conjugate gradient method of Rivaie et al. [15], Fang [16,17] put forward a set of modified derivative-free gradient-type approaches for solving nonlinear equations. Relying on a hybrid idea and a new adaptive line search strategy, Liu et al. [18] proposed a three-term algorithm for nonlinear monotone equations under convex constraints. Based on the spectral secant equation, Zhang et al. [19] developed a three-term projection method for nonlinear monotone equations and applied it to a problem in signal processing. Abdullahi et al. [20,21] described a modified self-adaptive algorithm and a three-term projection algorithm for systems of monotone nonlinear equations and employed them in signal and image recovery problems. Aji et al. [22] proposed a novel spectral algorithm with an inertial technique for solving a system of monotone equations and its applications.
Motivated by the aforementioned works, we propose a new conjugate gradient projection algorithm for the nonlinear monotone equations (1.1). The structure of this paper is as follows. Section 2 is dedicated to proposing the new method. In Section 3, we conduct the convergence analysis of the newly proposed method. Section 4 showcases the numerical results obtained from solving monotone nonlinear equations. Subsequently, in Section 5, we apply the proposed method to a signal recovery problem. Finally, Section 6 presents the conclusions. The symbol ‖⋅‖ stands for the Euclidean norm, and ‖⋅‖_1 stands for the l1 norm.
2. Algorithm
In this section, we first review the conjugate gradient method, which is employed to solve the following unconstrained optimization problem:

min f(x),  x ∈ R^n,   (2.1)
in which f: R^n → R is a smooth function, and the gradient of f at x_k is denoted by g_k. The conjugate gradient method has the following iterative formula:

x_{k+1} = x_k + α_k d_k,  k = 0, 1, 2, ⋯,   (2.2)
in which the step length α_k is given through some line search technique, and the search direction d_k is derived from

d_k = −g_k,  if k = 0;  d_k = −g_k + β_k d_{k−1},  if k ≥ 1.   (2.3)
Recently, Rivaie et al. [15] proposed the RMIL conjugate gradient method (RMIL, named after its developers Rivaie, Mustafa, Ismail, and Leong), where β_k is computed by

β_k^{RMIL} = g_k^T (g_k − g_{k−1}) / ‖d_{k−1}‖².   (2.4)
Results from numerical experiments reveal that the method is highly efficient.
Motivated by the results of the RMIL method, we develop a derivative-free approach for solving nonlinear systems of monotone equations. For simplicity, we substitute g_k in equations (2.3) and (2.4) with F_k, where F_k = F(x_k). Moreover, we change the coefficient of F_k from −1 to −θ_k when k ≥ 1. This ensures that the search direction generated by the algorithm satisfies the sufficient descent property and effectively improves the computational efficiency of the algorithm. Consequently, the search direction d_k is given by

d_k = −F_k,  if k = 0;  d_k = −θ_k F_k + β_k d_{k−1},  if k ≥ 1,   (2.5)
where θ_k = β_k F_k^T d_{k−1} / ‖F_k‖² + 1, β_k = F_k^T y_{k−1} / ‖d_{k−1}‖², F_k = F(x_k), and y_{k−1} = F_k − F_{k−1}. From the definition of d_k, it is not difficult to see that for k ≥ 1 we have

F_k^T d_k = −‖F_k‖².   (2.6)
When k = 0, we also get F_0^T d_0 = −‖F_0‖², so (2.6) holds for all k ≥ 0.
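To make the direction update (2.5) concrete, the following Python sketch implements it for a generic mapping F and numerically checks the identity (2.6); the mapping used in the check is a hypothetical monotone example of our own, not one of the paper's test problems.

```python
import numpy as np

def direction(Fk, Fk_prev, d_prev):
    """Search direction (2.5): d_0 = -F_0 and, for k >= 1,
    d_k = -theta_k * F_k + beta_k * d_{k-1}."""
    if d_prev is None:                                  # k = 0
        return -Fk
    y_prev = Fk - Fk_prev                               # y_{k-1} = F_k - F_{k-1}
    beta = (Fk @ y_prev) / (d_prev @ d_prev)            # beta_k in (2.5)
    theta = beta * (Fk @ d_prev) / (Fk @ Fk) + 1.0      # theta_k in (2.5)
    return -theta * Fk + beta * d_prev

# Numerical check of the sufficient descent identity (2.6): F_k^T d_k = -||F_k||^2,
# on a hypothetical monotone mapping F(x) = x + arctan(x).
F = lambda x: x + np.arctan(x)
rng = np.random.default_rng(0)
x_prev, x = rng.standard_normal(5), rng.standard_normal(5)
d = direction(F(x), F(x_prev), -F(x_prev))
print(np.isclose(F(x) @ d, -np.linalg.norm(F(x)) ** 2))   # True
```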
Next, we recall the projection operator P_Ω[⋅], which is defined as follows:

P_Ω[x] = arg min{ ‖y − x‖ : y ∈ Ω }.

This operation projects the vector x onto the closed convex set Ω; it guarantees that the next iterate produced by our algorithm remains within Ω. The projection operator is known to be non-expansive, namely

‖P_Ω[x] − P_Ω[y]‖ ≤ ‖x − y‖,  ∀ x, y ∈ R^n.
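For concreteness, here is a minimal Python sketch of P_Ω for two simple feasible sets that admit closed-form projections (the nonnegative orthant and a box); these particular choices of Ω are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def project_nonneg(x):
    """P_Omega for Omega = {x : x >= 0}: componentwise clipping."""
    return np.maximum(x, 0.0)

def project_box(x, lo, hi):
    """P_Omega for a box Omega = {x : lo <= x <= hi}."""
    return np.clip(x, lo, hi)

# Non-expansiveness check: ||P(x) - P(y)|| <= ||x - y||.
rng = np.random.default_rng(1)
x, y = rng.standard_normal(6), rng.standard_normal(6)
print(np.linalg.norm(project_nonneg(x) - project_nonneg(y))
      <= np.linalg.norm(x - y))                            # True
```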
From here on, we focus on solving the monotone equations (1.1). We adopt the projection procedure presented in [10] and obtain a point

z_k = x_k + α_k d_k

with the result that

F(z_k)^T (x_k − z_k) > 0,

where α_k is obtained using a suitable line search technique.
Meanwhile, for any x* satisfying F(x*) = 0, owing to the monotonicity of F, we have

F(z_k)^T (x* − z_k) ≤ F(x*)^T (x* − z_k) = 0.
Incorporating the expression (2.10), we conclude that the hyperplane

H_k = { x ∈ R^n : F(z_k)^T (x − z_k) = 0 }
strictly separates x_k from the solutions of the equations (1.1). Consequently, Solodov and Svaiter [10] compute the next iterate x_{k+1} as the projection of x_k onto the hyperplane H_k, that is,

x_{k+1} = x_k − (F(z_k)^T (x_k − z_k) / ‖F(z_k)‖²) F(z_k).
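As a quick sanity check (our own worked derivation, not a display reproduced from the paper), the iterate above is indeed the orthogonal projection of x_k onto H_k, and H_k does separate x_k from any solution x*:

```latex
% For a hyperplane H = {x : a^T x = b}, the orthogonal projection of a point p
% is P_H(p) = p - (a^T p - b) a / ||a||^2.  With a = F(z_k), b = F(z_k)^T z_k:
\[
  P_{H_k}(x_k)
  = x_k - \frac{F(z_k)^{\top}(x_k - z_k)}{\|F(z_k)\|^{2}}\, F(z_k).
\]
% Separation: F(z_k)^T (x_k - z_k) > 0 while F(z_k)^T (x^* - z_k) <= 0, so x_k
% and any solution x^* lie on opposite sides of H_k.
```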
Now, we give the steps of our algorithm.
Algorithm 2.1.
Step 0: Choose σ > 0, s > 0, ϵ > 0, 0 < ρ < 1, 0 < γ < 2, and the starting point x_0 ∈ R^n. Set k = 0.
Step 1: If ‖F(xk)‖<ϵ, stop. Otherwise, move to step 2.
Step 2: Define the search direction dk through (2.5).
Step 3: Determine the step length α_k = max{ s ρ^i | i = 0, 1, 2, ⋯ } that satisfies the line search condition (2.14); then set z_k = x_k + α_k d_k and move to step 4.
Step 4: If z_k ∈ Ω and ‖F(z_k)‖ < ϵ, set x_{k+1} = z_k and stop. Otherwise, compute x_{k+1} by the projection step (2.15), set k := k + 1, and return to step 1.
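The following Python sketch outlines one possible implementation of Algorithm 2.1. Since the displayed line search condition (2.14) and projection update (2.15) are not reproduced above, the sketch substitutes choices that are common in this class of methods, namely the derivative-free condition −F(z_k)^T d_k ≥ σ α_k ‖F(z_k)‖ ‖d_k‖² and the relaxed projection step x_{k+1} = P_Ω[x_k − γ ξ_k F(z_k)] with ξ_k = F(z_k)^T (x_k − z_k) / ‖F(z_k)‖²; these are assumptions and may differ from the authors' exact formulas.

```python
import numpy as np

def algorithm_2_1(F, project, x0, sigma=1e-4, s=1.0, rho=0.55,
                  gamma=1.2, eps=1e-5, max_iter=1000):
    """Hedged sketch of Algorithm 2.1; the line search test and the projection
    update below are common textbook choices assumed here, not necessarily
    the authors' exact formulas (2.14)-(2.15)."""
    x, d_prev, F_prev = np.asarray(x0, dtype=float), None, None
    for _ in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) < eps:                         # Step 1
            return x
        if d_prev is None:                                   # Step 2: direction (2.5)
            d = -Fx
        else:
            y = Fx - F_prev
            beta = (Fx @ y) / (d_prev @ d_prev)
            theta = beta * (Fx @ d_prev) / (Fx @ Fx) + 1.0
            d = -theta * Fx + beta * d_prev
        alpha = s                                            # Step 3: backtracking
        while True:
            z = x + alpha * d
            Fz = F(z)
            # assumed line search condition used in place of (2.14)
            if -(Fz @ d) >= sigma * alpha * np.linalg.norm(Fz) * (d @ d):
                break
            alpha *= rho
        if np.linalg.norm(Fz) < eps and np.allclose(project(z), z):   # Step 4
            return z
        xi = (Fz @ (x - z)) / (Fz @ Fz)                      # assumed update in place of (2.15)
        x, F_prev, d_prev = project(x - gamma * xi * Fz), Fx, d
    return x

# Usage on a hypothetical monotone system F(x) = x + arctan(x) with Omega = {x >= 0}.
sol = algorithm_2_1(lambda x: x + np.arctan(x),
                    lambda x: np.maximum(x, 0.0),
                    np.full(1000, 10.0))
print(np.linalg.norm(sol + np.arctan(sol)))                  # close to zero
```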
3. Convergence analysis
In this section, in order to establish the global convergence of Algorithm 2.1, the following assumptions are needed.
Assumption 3.1. (1) The set of solutions for problem (1.1) is nonempty.
(2) The mapping F(⋅) is Lipschitz continuous on R^n, that is,

‖F(x) − F(y)‖ ≤ L ‖x − y‖,  ∀ x, y ∈ R^n,   (3.1)
in which L represents a positive constant.
From Assumption 3.1, it can be inferred that
in which κ is defined as a positive constant.
Lemma 3.1. Suppose that Assumption 3.1 holds and the sequences {x_k}, {α_k}, {d_k} are generated by Algorithm 2.1. Then, for any x* ∈ Ω satisfying F(x*) = 0 and for γ ∈ (0, 2), we obtain
Furthermore, it holds that
Proof. Suppose F(x∗)=0 and x∗∈Ω; due to the monotonicity of F, we obtain
Combining with (3.5), we get
Thus, from (2.8), (2.10), (2.14), (2.15), and (3.6), we have
Then, we can deduce that
which together with γ∈(0,2), means that
□
Lemma 3.2. Suppose that Assumption 3.1 holds and the sequences {x_k}, {d_k} are generated by Algorithm 2.1. Then we have
in which κ > 0, s > 0, L > 0, and 0 < γ < 2.
Proof. From (2.8) and step 4 of Algorithm 2.1, we obtain
For k∈N, using (2.5), (3.1), (3.2), (3.11), and step 3 of Algorithm 2.1, we have
From (2.5) and (3.2), we get
Combining this with (3.7), we obtain (3.10). □
Lemma 3.3. Suppose that Assumption 3.1 holds and the sequences {x_k}, {z_k}, and {α_k} are generated by Algorithm 2.1. Then we get
Proof. In the case where α_k ≠ s, by the line search rule of Algorithm 2.1, the trial step ρ^{−1} α_k fails to satisfy (2.14), namely
Combining with (3.1), (3.2) and (3.10), we obtain
It follows from (2.6), (3.1), (3.10), (3.15), and (3.16) that we have
The above inequality shows that
This implies (3.14). □
Theorem 3.4. Suppose that Assumption 3.1 holds and the sequences {x_k} and {F_k} are generated by Algorithm 2.1. Then we obtain
Proof. Assume that (3.19) is false. Then, there exists a positive constant such that, for all k,
Based on (2.6) and (3.20), we obtain
Combining with Lemma 3.3, we obtain
which contradicts the conclusion (3.4) of Lemma 3.1. Hence (3.19) holds. □
4. Numerical experiments
In this section, we conduct a series of numerical experiments to assess the performance of Algorithm 2.1, comparing it with the three-term projection method of Zhang et al. [19], the three-term conjugate gradient method of Gao et al. [23], and the PRP-like method of Abubakar et al. [24]. All tests are run in Matlab R2018a on a personal computer equipped with 32 GB of RAM and an 11th Gen Intel(R) Core(TM) i7-11700K processor.
We test the algorithms on ten problems that have been widely used in the literature, for dimensions 50000 and 200000. All test problems are initialized with the following eight starting points, taken from [17,19,23]: x1 = (10, 10, ⋯, 10)^T, x2 = (−10, −10, ⋯, −10)^T, x3 = (−1, −1, ⋯, −1)^T, x4 = (0.1, 0.1, ⋯, 0.1)^T, x5 = (1/2, 1/2^2, ⋯, 1/2^n)^T, x6 = (1, 1/2, ⋯, 1/n)^T, x7 = (1/n, 2/n, ⋯, 1)^T, x8 = ((n−1)/n, (n−2)/n, ⋯, 0)^T. For every algorithm and each test problem, the program terminates when any of the following three conditions is met: (1) ‖F(x_k)‖ ≤ 10^{−6}, (2) ‖F(z_k)‖ ≤ 10^{−6}, (3) the number of iterations exceeds 1000. The parameters of the methods of Zhang, Gao, and Abubakar are set as described in their respective articles. For Algorithm 2.1, we set σ = 10^{−4}, s = 1, ϵ = 10^{−5}, ρ = 0.55, γ = 1.2. We now list the test problems, where the mapping F is defined by F(x) = (f1(x), f2(x), ⋯, fn(x))^T.
Test Problem 1. This problem, taken from [23], is
Test Problem 2. This problem, taken from [23], is
Test Problem 3. This problem, taken from [23], is
Test Problem 4. This problem, taken from [19], is
Test Problem 5. This problem, taken from [19], is
Test Problem 6. This problem, taken from [19], is
Test Problem 7. This problem, taken from [12], is
Test Problem 8. This problem, taken from [25], is
Test Problem 9. This problem, taken from [25], is
Test Problem 10. This problem, taken from [25], is
In Tables 1, 2, and 3, we present the detailed results in the format "Niter/Nfun/Time", where "Niter" is the number of iterations, "Nfun" is the number of function evaluations, and "Time" is the CPU time in seconds. "Dim" indicates the dimension of the test problems, and "Sta" indicates the starting point. A comprehensive reading of Tables 1, 2, and 3 shows that, for the given problems, Algorithm 2.1 records the smallest Niter and Nfun in a large number of instances.
To compare all algorithms comprehensively, we use the performance profiles proposed by Dolan and Moré [26]. The performance profiles with respect to the number of iterations are shown in Figure 1, those for the number of function evaluations in Figure 2, and those for the CPU time in Figure 3. It is readily observable that Algorithm 2.1 outperforms the algorithms of Zhang, Gao, and Abubakar in terms of efficiency.
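For readers who wish to reproduce such comparisons, the following is a minimal Python sketch of a Dolan–Moré performance profile; the solver labels and the synthetic cost data are illustrative placeholders, not the results reported in Tables 1–3.

```python
import numpy as np
import matplotlib.pyplot as plt

def performance_profile(T, labels):
    """T[p, s] = cost (iterations, function evaluations, or CPU time) of solver s
    on problem p, with np.inf marking failures.  Plots the Dolan-More profile
    rho_s(tau) = fraction of problems on which solver s is within a factor tau
    of the best solver."""
    ratios = T / T.min(axis=1, keepdims=True)          # performance ratios r_{p,s}
    taus = np.sort(np.unique(ratios[np.isfinite(ratios)]))
    for s, label in enumerate(labels):
        rho = [(ratios[:, s] <= t).mean() for t in taus]
        plt.step(taus, rho, where="post", label=label)
    plt.xlabel("tau"); plt.ylabel("rho(tau)"); plt.legend()
    plt.show()

# Illustrative placeholder data: 4 solvers on 20 problems (iteration counts).
rng = np.random.default_rng(2)
costs = rng.integers(10, 200, size=(20, 4)).astype(float)
performance_profile(costs, ["Algorithm 2.1", "Zhang", "Gao", "Abubakar"])
```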
5. Applications in sparse signal restoration
In this section, we apply Algorithm 2.1 to sparse signal restoration problems and compare its performance with that of Zhang [19], Gao [23], and Abubakar [24]. We begin by recalling the compressed sensing model, which is used to retrieve sparse solutions or signals from underdetermined linear systems Ax = b, where b ∈ R^m is the observed data and A ∈ R^{m×n} (m ≪ n) is a linear mapping. Specifically, sparse signal restoration can be achieved by solving the following l1-norm regularized optimization problem:

min_{x ∈ R^n} (1/2)‖Ax − b‖_2^2 + τ‖x‖_1,   (5.1)
where τ > 0 is a constant that balances data fitting and sparsity. The vector x ∈ R^n can be decomposed as x = u − v with u ≥ 0, v ≥ 0, where u, v ∈ R^n and u_i = max{x_i, 0}, v_i = max{−x_i, 0} for i = 1, 2, ⋯, n. Therefore, problem (5.1) can be rewritten as the following programming formulation:

min_{u,v} (1/2)‖A(u − v) − b‖_2^2 + τ e_n^T u + τ e_n^T v,  s.t. u ≥ 0, v ≥ 0,   (5.2)
in which en=(1,1,⋯,1)T∈Rn.
Further, we obtain its standard form

min_z (1/2) z^T H z + c^T z,  s.t. z ≥ 0,   (5.3)

in which

z = (u; v) ∈ R^{2n},  c = τ e_{2n} + (−A^T b; A^T b),  H = [ A^T A, −A^T A; −A^T A, A^T A ].   (5.4)

Evidently, the matrix H is positive semidefinite; thus, (5.3) is a convex quadratic program. By the first-order optimality conditions for (5.3), z is a minimizer of (5.3) if and only if it satisfies the following equations:

F(z) := min{ z, Hz + c } = 0,   (5.5)
in which the minimum is taken componentwise, and F(z) is continuous and monotone. Hence, our proposed method can be employed to solve (5.5).
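A small Python sketch of the resulting monotone mapping, implemented matrix-free so that only products with A and A^T are needed; the problem sizes below are arbitrary illustrative values.

```python
import numpy as np

def make_F(A, b, tau):
    """Monotone mapping F(z) = min(z, Hz + c) from (5.3)-(5.5), implemented
    matrix-free: only products with A and A^T are formed, never H itself."""
    Atb = A.T @ b
    n = Atb.size
    def F(z):
        u, v = z[:n], z[n:]
        AtAd = A.T @ (A @ (u - v))                       # A^T A (u - v)
        Hz_plus_c = np.concatenate([AtAd - Atb, -AtAd + Atb]) + tau
        return np.minimum(z, Hz_plus_c)                  # componentwise minimum
    return F

# Tiny illustrative instance (sizes are arbitrary, not those used in the paper).
rng = np.random.default_rng(3)
A, b = rng.standard_normal((8, 20)), rng.standard_normal(8)
F = make_F(A, b, tau=0.1)
print(F(np.abs(rng.standard_normal(40))).shape)          # (40,)
```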
In these experiments, the primary objective is to reconstruct a one-dimensional sparse signal of length n from m (m ≪ n) observations. The quality of signal restoration is evaluated using the mean squared error (MSE), defined as

MSE = (1/n) ‖x* − x‖²,
in which x* represents the recovered signal and x the original signal. Let f(x) = (1/2)‖Ax − b‖_2^2 + τ‖x‖_1 serve as the merit (auxiliary) function. The iterative sequence starts from the initial point x_0 = A^T b and terminates when
In our experiments, r denotes the number of nonzero entries of x. For given values of m, n, and r, we generate the test data in MATLAB as follows:
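Since the original MATLAB commands are not reproduced above, the following Python sketch shows a typical way such test data are generated in the compressed sensing literature (randomly placed spikes, a Gaussian sensing matrix, and small Gaussian noise); the sizes n, m, r and the noise level below are illustrative assumptions, not necessarily those used in the experiments.

```python
import numpy as np

def generate_cs_data(n=4096, m=1024, r=128, noise=1e-3, seed=0):
    """Typical sparse-signal test data: an r-sparse signal x of length n and
    m noisy Gaussian measurements b = A x + noise."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    support = rng.choice(n, size=r, replace=False)
    x[support] = rng.standard_normal(r)                 # r randomly placed spikes
    A = rng.standard_normal((m, n)) / np.sqrt(m)        # scaled Gaussian sensing matrix
    b = A @ x + noise * rng.standard_normal(m)          # noisy observations
    x0 = A.T @ b                                        # starting point x_0 = A^T b
    return A, b, x, x0

A, b, x_true, x0 = generate_cs_data()
print(A.shape, np.count_nonzero(x_true))                # (1024, 4096) 128
```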
The numerical results presented in Figure 4 contain the original sparse signal, the measurement data, and the reconstructed signals produced by the four algorithms. Clearly, all four algorithms reconstruct the original signal from the measurements almost perfectly. However, Algorithm 2.1 outperforms the other algorithms in terms of MSE, number of iterations, and CPU time. Figure 5 presents four plots that visually assess the performance of the four algorithms, showing how the merit function value and the MSE evolve with the number of iterations and with the CPU time. Table 4 reports the numerical results for ten different noise samples. Evidently, in the majority of cases, Algorithm 2.1 outperforms the other algorithms: it achieves a lower MSE, requires fewer iterations, and consumes less CPU time.
6. Conclusions
This paper presents a derivative-free conjugate gradient method for solving large-scale nonlinear systems of monotone equations subject to convex constraints. The proposed approach is a novel extension of the RMIL conjugate gradient method. It incorporates projection techniques, which ensure that the iterates remain within the feasible region defined by the convex constraints; this maintains the validity of the solution process and improves the efficiency of the algorithm by avoiding unnecessary computations outside the constraint set. In addition, the method uses a line search strategy that does not require derivatives. Under reasonable assumptions, we prove the global convergence of the method. To demonstrate its practical performance, we carry out extensive numerical experiments. The results show that the proposed method is promising: it outperforms several existing methods in computational efficiency and accuracy when solving large-scale nonlinear systems of monotone equations. Furthermore, our numerical experiments on sparse signal restoration problems further validate the effectiveness and practicality of the method.
Acknowledgments
This research was supported by the Zhejiang Provincial Natural Science Foundation of China under Grant No. LY23A010007 and the Huzhou Natural Science Foundation under Grant No. 2023YZ29.
Conflict of interest
The author declares no competing interests.
Use of Generative-AI tools declaration
The author declares they have not used Artificial Intelligence (AI) tools in the creation of this article.