1. Introduction
Consider the following system of equations:
where H:Rn→Rn is a continuous and monotone mapping and ψ⊆Rn is a non-empty closed and convex set. The monotonicity of H implies that it satisfies the following inequality:
In this paper, Rn and ‖⋅‖ denote the real n-dimensional space and the Euclidean norm, respectively, and we write Hk for H(xk).
The system described above arises in various fields of science and engineering, as many scientific formulations naturally lead to systems of nonlinear equations. These systems appear in mathematical research areas such as differential equations, optimization theory, and integral equations. They also serve as subproblems in generalized proximal algorithms [1].
In chemistry, such systems frequently arise in the determination of chemical equilibrium points, while in economics, equilibrium analysis is often conducted in a similar manner [2]. Some variational inequalities are reformulated as monotone systems before being solved [3]. Additionally, these systems have applications in compressed sensing, particularly in signal reconstruction and image de-blurring problems [2,4,5,6].
These systems also play a crucial role in solving optimal power flow equations [7]. In radiative transfer processes, they are applied through the discretization of the Chandrasekhar integral equation [2]. Furthermore, they are utilized in financial institutions for forecasting and automation [8].
Numerous methods and techniques have been developed, analyzed, and refined to solve system (1.1). Among these, the most widely used is Newton's method and its improved variants [9]. Newton's and quasi-Newton techniques are commonly employed for solving constrained monotone systems of nonlinear equations [9]. However, despite their popularity, the computational cost and storage requirements associated with the Jacobian matrix (the first derivative of the system) make Newton's method impractical for large-scale systems.
Quasi-Newton methods, in contrast, approximate the Jacobian matrix rather than computing it explicitly [10,11]. While these methods eliminate the need for exact derivative computations, they require substantial memory for matrix storage and are slower than Newton's method due to limited Jacobian information [10,11]. The steepest descent method is among the popular descent methods for solving system (1.1). Although explored previously, it has largely fallen out of favor due to its slow convergence rate [12].
Given the challenges posed by large-scale systems, researchers have increasingly focused on conjugate gradient (CG) methods. As derivative-free approaches, CG methods are particularly well-suited for solving large-scale systems [13].
The CG methods are iterative schemes that generate a sequence of iterates via
where αk is the step length, and dk is the search direction [14]. For CG methods, the search direction is defined as:
where sk−1=xk−xk−1, and βk is the CG parameter.
The CG parameter βk takes different forms depending on its formulation. For example,
which is known as the conjugate descent (CD) CG parameter [15].
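As a concrete illustration, the sketch below applies the CG iteration above with the CD parameter to the toy monotone mapping H(x) = x. It is a minimal sketch only: the step length is fixed rather than computed by a line search, and all names are illustrative.

```python
import numpy as np

def cd_direction(H_k, s_prev, H_prev, d_prev):
    # Search direction d_k = -H_k + beta_k * s_{k-1}, with the conjugate
    # descent parameter beta_k = ||H_k||^2 / (-d_{k-1}^T H_{k-1}).
    beta = np.dot(H_k, H_k) / (-np.dot(d_prev, H_prev))
    return -H_k + beta * s_prev

H = lambda x: x                     # toy monotone mapping
x_prev = np.full(4, 2.0)
d = -H(x_prev)                      # d_0 = -H_0
alpha = 0.5                         # fixed step length (illustration only)
x = x_prev + alpha * d
for _ in range(20):
    s = x - x_prev                  # s_{k-1} = x_k - x_{k-1}
    d, x_prev = cd_direction(H(x), s, H(x_prev), d), x
    x = x_prev + alpha * d
```

On this toy problem the residual norm ‖H(x)‖ shrinks by a roughly constant factor per iteration, illustrating the descent behavior; convergence guarantees for general monotone systems require the line search and projection machinery discussed later.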
The parameter βk is considered the most critical component of any CG direction, as different parameters lead to distinct CG directions with unique convergence properties and numerical performance. While some CG parameters are computationally efficient, others may fail to satisfy the inequality below, which is an essential condition for global convergence [2]:
The CD parameter βCDk belongs to the class of classical CG methods, exhibiting good convergence properties but weak numerical performance. To address this limitation, Yang et al. [16] hybridized the Liu-Storey (LS) and CD parameters using Wolfe's line search strategy, resulting in an efficient algorithm. Similarly, Kaelo et al. [17] hybridized the LS, Hestenes-Stiefel (HS), and Dai-Yuan (DY) CG parameters, producing an algorithm that satisfies (1.6). Using the strong Wolfe's line search strategy, the proposed algorithm achieved global convergence.
Djordjević [18] proposed a new hybrid of the CD and LS methods, ensuring that the hybrid parameter satisfies (1.6). Liu et al. [19] presented a scheme that also satisfies (1.6), introducing an LS-type CG parameter for solving the system (1.1) by applying Powell's strategy to the LS-type scheme [20]. Motivated by the method in [21], Wang et al. [22] developed a three-term CG algorithm for solving (1.1). Similarly, Li Wang [23], building on the FR-type CG iterative scheme, proposed an algorithm for symmetric systems by improving the method of Zhang et al. [24]. This method satisfies (1.6) and converges globally.
Liu and Li [25], inspired by the work of Yu et al. [26], presented a multivariate spectral algorithm for solving systems of the form (1.1). Acknowledging the weak numerical performance of the classical CD method, Zhang et al. [27] proposed a three-term CG direction satisfying (1.6) and the trust-region property. The proposed method achieved convergence through a modified Weak Wolfe-Powell (MWWP) line search strategy coupled with a projection algorithm, with numerical results outperforming many existing algorithms.
Recent studies have proposed various CG-based approaches to tackle constrained and unconstrained nonlinear equations. Ibrahim et al. [28] introduced two classes of spectral three-term derivative-free methods that effectively solve nonlinear equations. However, despite their efficiency, these methods struggle with ill-conditioned problems, leading to slow convergence. Similarly, Sabi'u and Sirisubtawee [29] proposed an inertial Dai-Liao conjugate gradient method tailored for convex constrained monotone equations, ensuring better stability and robustness. Nonetheless, inertial methods often require careful parameter tuning to prevent instability.
Additionally, Zheng et al. [30] developed a conjugate gradient projection method that efficiently handles equations with convex constraints. However, projection-based methods can suffer from increased computational cost, especially in high-dimensional problems [31]. Another notable contribution is by Chankong et al. [32], who formulated a class of derivative-free three-term descent Hestenes-Stiefel conjugate gradient algorithms for constrained nonlinear problems. While these algorithms demonstrate promising performance, they often encounter difficulties in non-smooth or highly nonlinear problems where descent conditions are hard to maintain. These recent advancements accentuate the increasing focus on improving CG-based methods by incorporating new techniques, such as inertial acceleration, spectral properties, and projection methods, to enhance convergence and stability.
Originally, classical CG methods were designed for solving systems of the form:
where f:Rn→R is a nonlinear and continuously differentiable function whose second derivative exists. Given their desirable properties and the fact that the first-order optimality condition for (1.7), i.e., ∇f(x)=0, is equivalent to (1.1) whenever ∇f=H, all methods applied to (1.7) can be extended to tackle problems of the form (1.1).
2. Motivation and derivation
To modify an existing conjugate gradient (CG) parameter or produce a new successful one, two factors must be considered:
● The new modified CG parameter must satisfy (1.6), which will ensure global convergence.
● It must outperform existing methods computationally, as demonstrated through numerical comparisons with relevant methods in the literature.
In [33], Yu and Guan presented a modified version of the DY CG parameter as follows:
In their effort to prove the convergence of the method, they showed that (1.6) is satisfied for τ = 1 − 1/(4b), where b ≥ 1/4.
Following the same argument, a modified CD parameter was proposed in [33] and is presented as:
and the sufficient descent inequality was proved for b > 1/4.
Inspired by Yu and Guan's schemes, Yu et al. presented the spectral version of Eqs (2.1) and (2.2) in [34], with CG parameters given by:
and
where b>0, but δk was left as a topic for further research.
The limitation of these four schemes (2.1)–(2.4) is that they do not satisfy the sufficient descent inequality (1.6) on the interval (0, 1/4], and consequently, they cannot attain global convergence for b∈(0, 1/4]. Furthermore, the value of b>0 was fixed throughout.
Motivated by the above and the fact that, to the best of our knowledge, modified CD schemes are limited in the literature, we aim to fill this gap by proposing a scheme that satisfies (1.6) and achieves global convergence for the interval bk∈(0,∞), where the value of bk>0 is not fixed.
Consider the following:
where ξ>0, and bk>0 is a parameter to be determined.
By (1.4) and (2.5), we have
By Perry's idea in [35], (2.6) can be written as
Here, Qk is called the search direction matrix, which is expressed as
Since Qk is non-symmetric, it is symmetrized as follows:
Then (2.7) becomes
where Ak is the symmetric version of Qk in (2.7).
Theorem 2.1. The matrix Ak defined in (2.9) has eigenvalues λ+k, λ−k, and 1 with multiplicity (n−2), where
and
with
Moreover, the eigenvalues are real and positive.
Proof. The mapping value Hk is nonzero unless the solution has been reached, and so is the vector sk−1. Suppose the solution has not been reached; then both Hk and sk−1 are nonzero vectors, and hence HTksk−1>0. Let V be the vector space spanned by {Hk, sk−1}. Then dim(V)≤2 and, for its orthogonal complement V⊥, dim(V⊥)≥n−2.
Let a set of mutually orthogonal vectors {ι_{k−1}^i}_{i=1}^{n−2} ⊂ V⊥ satisfy
Using (2.9) and (2.14), we obtain
The above equation is an eigenvector equation, and ιik−1, for i=1,2,…,n−2, is an eigenvector with a corresponding eigenvalue of 1, repeated n−2 times. Let the remaining eigenvalues be λ+k and λ−k. Furthermore, (2.9) can also be written as
It is clear that (2.16) is a rank-two update matrix; therefore, from Theorem 1.2.16 of [36], the relation
with a1 = ξ s_{k−1}/(H_{k−1}^T s_{k−1}), a2 = H_k, a3 = (ξ b_k ‖H_k‖² s_{k−1} + (H_{k−1}^T s_{k−1}) H_k)/(H_{k−1}^T s_{k−1})², and a4 = s_{k−1} can be used to obtain the determinant of Ak as
Furthermore, it is known that the trace of (2.9) equals the sum of its eigenvalues; hence we have
and consequently,
Then by (2.13), (2.17), and (2.19), we obtain
Then, using the quadratic formula, we obtain
This gives (2.11) and (2.12). Next, we show that λ+k and λ−k are real and positive. The discriminant is positive since qk>p2k. Therefore, we need to have
λ+k is real and positive when
For λ−k to be real and positive, we require
After algebraic simplification, we obtain
Now, (2.25) can be written as
with ξ>0 and φ≤0.
□

Here, we present the proposed direction.
To improve its convergence attributes, bk in (2.26) can be modified to
Before presenting the proposed algorithm, we define the projection mapping as follows.
Definition 2.1. Let ψ⊆Rn be a nonempty, closed, and convex set. For x∈Rn, the projection of x onto ψ is given by
Equation (2.29) is called a projection mapping. The projection Pψ is nonexpansive, namely,
and consequently,
3. Convergence analysis
This section presents the convergence analysis of the proposed UMCD algorithm under the following assumptions.
Assumption 3.1. The map H is Lipschitz continuous, i.e., there is a constant L>0 such that
Assumption 3.2. H is strongly monotone, i.e., there exists c>0 such that, for all x,y∈Rn,
Assumption 3.3. The solution set S̄ of system (1.1) is nonempty.
Lemma 3.1. Let ξ>0, bk>0, and φ≤0. Then, for all k≥0, the sequence of search directions {dk} generated by (2.27) with (2.28) satisfies (1.6) with τ=1.
Proof. From (2.27), it is clear that for k=0, H_0^T d_0 = −‖H_0‖², which satisfies (1.6) with τ=1.
Case Ⅰ. For k≥1, if H_k^T s_{k−1} > 0 and H_{k−1}^T s_{k−1} ≥ r‖H_k‖‖s_{k−1}‖, then

d_k^T H_k = −‖H_k‖² − ξ‖H_k‖² (H_k^T s_{k−1})/(H_{k−1}^T s_{k−1}) − ξ b_k ‖H_k‖² (H_k^T s_{k−1})²/(H_{k−1}^T s_{k−1})² ≤ −‖H_k‖².

Case Ⅱ. If H_k^T s_{k−1} = 0, then

d_k^T H_k = −‖H_k‖² + ‖H_k‖² (H_k^T s_{k−1})/max{−H_{k−1}^T s_{k−1}, γ‖H_k‖‖s_{k−1}‖} = −‖H_k‖².

Case Ⅲ. If H_k^T s_{k−1} < 0, then

d_k^T H_k = −‖H_k‖² + ‖H_k‖² (H_k^T s_{k−1})/max{−H_{k−1}^T s_{k−1}, γ‖H_k‖‖s_{k−1}‖} ≤ −‖H_k‖²,

since the second term is negative (a negative numerator over a positive denominator). This gives the required result. □
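For concreteness, the projection Pψ of Definition 2.1 is cheap to evaluate for the feasible sets used in the experiments below. A minimal sketch (the sum-constrained set is handled by a rough clip-and-rescale approximation, not the exact Euclidean projection), together with a numerical check of the nonexpansiveness property:

```python
import numpy as np

def project_nonneg(x):
    # Projection onto psi = R^n_+ : componentwise max with zero.
    return np.maximum(x, 0.0)

def project_box_simplex(x, n):
    # Rough projection onto {x : sum(x) <= n, x >= 0}: clip to the orthant,
    # then rescale if the sum constraint is violated. This is an
    # approximation for illustration, not the exact Euclidean projection.
    y = np.maximum(x, 0.0)
    s = y.sum()
    return y if s <= n else y * (n / s)

# Nonexpansiveness of the orthant projection: ||P(x) - P(y)|| <= ||x - y||
rng = np.random.default_rng(0)
a, b = rng.normal(size=50), rng.normal(size=50)
lhs = np.linalg.norm(project_nonneg(a) - project_nonneg(b))
rhs = np.linalg.norm(a - b)
```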
Lemma 3.2. Let the assumptions hold and let {xk}, {zk} be sequences generated by the UMCD algorithm. Then {xk} and {zk} are bounded and
Proof. Let x̄∈S̄ be a solution of system (1.1); then, by the monotonicity of H, we obtain
Then, using Eq (2.32) (the line search condition) and the definition of zk, it follows
Also, from (2.34), we have
Hence,
Repeating this argument, ‖xk−x̄‖≤‖x0−x̄‖ for all k. This means that {‖xk−x̄‖} is decreasing and {xk} is bounded. Furthermore, by Assumptions 3.1–3.3 and (3.9), we get
Hence
Then from (2.28)
This implies
From (2.27) and using (2.28), (3.11), and (3.13), we have
If HTksk−1>0 and HTk−1sk−1≥r‖Hk‖‖sk−1‖,
Let M1=N1+ξN1r+λN1r2.
Otherwise,
Let M2 = N1 + N1/γ and take M = max{M1, M2}; then Lemma 3.2 is proved.
Since the sequences {dk} and {xk} are bounded, Eq (2.33) shows that {zk} is bounded. Using the same argument as in (3.10), it follows that
Equation (3.8) shows that,
From (2.32), we obtain
By (3.8) and (3.18), we have
Since {‖xk−x̄‖} converges and {H(zk)} is bounded, we can take the limit in (3.17) to obtain
and hence
Equations (3.19) and (2.33) indicate that (3.4) holds. Moreover, from the definition of λk, we deduce that
This means Eq (3.5) is proved. □
Theorem 3.1. Let Assumptions 3.1–3.3 hold and let the sequence {xk} be generated by the UMCD algorithm; then
Proof. We argue by contradiction. Suppose (3.23) does not hold; then
Note that ‖dk‖≠0 unless ‖Hk‖=0; hence, there exists δ1>0 such that
If αk≠ξ, then ρ−1αk does not satisfy (2.32), i.e.,
Using (3.26) and (1.6) yields
And from (3.27) we obtain
Equation (3.28) contradicts Eq (3.21). Hence, (3.23) is true. This completes the proof. □
4. Numerical experiments
This section is divided into two parts. In the first part, the numerical results of the proposed algorithm are presented. To measure the strength of the proposed algorithm (UMCD), the following algorithms were used for comparison:
● MDDYM (2021) algorithm proposed in [37].
● DTCG1 (2020) algorithm proposed in [6].
● ZANG (2023) algorithm proposed in [38].
● MCHCG (2024) algorithm proposed in [39].
In the second part of this study, we extend the UMCD algorithm to explore its effectiveness in compressive sensing experiments. To rigorously assess the performance of our proposed algorithm, we conduct a signal recovery experiment in conjunction with the CHCG algorithm [2]. The implementation is executed using MATLAB version 8.3.0 (R2014a) on a laptop equipped with a 1.80 GHz processor (CPU) and 8 GB of RAM.
For the UMCD algorithm, we selected the following key parameters: ξ=1, σ=φ=10−4, and ρ=ζ=0.9. In comparison, the parameters for the other algorithms are provided in their respective documents. This structured approach allows us to draw meaningful comparisons and insights regarding the performance of the proposed algorithm.
The iterations of all five algorithms were configured to stop when either ‖Hk‖≤10−6 or ‖H(zk)‖≤10−6 is reached, or when the number of iterations exceeds 2000 without meeting the stopping criterion. Moreover, the symbol '–' indicates a failure during the iterative process.
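The stopping rules above fit into the standard derivative-free projection framework (a line search along dk followed by a hyperplane projection step onto ψ). The sketch below is a generic version of that framework with a steepest-descent placeholder direction, not the actual UMCD direction; parameter names loosely mirror σ and ρ above.

```python
import numpy as np

def df_projection_cg(H, x0, proj, tol=1e-6, max_iter=2000,
                     sigma=1e-4, rho=0.9):
    # Generic derivative-free projection framework. The direction below is
    # a steepest-descent placeholder, NOT the UMCD direction of the paper.
    x = x0.copy()
    Hx = H(x)
    for k in range(max_iter):
        if np.linalg.norm(Hx) <= tol:
            return x, k
        d = -Hx                               # placeholder direction
        alpha = 1.0
        while True:                           # backtracking line search:
            z = x + alpha * d                 # find alpha with
            Hz = H(z)                         # -H(z)^T d >= sigma*alpha*||d||^2
            if -np.dot(Hz, d) >= sigma * alpha * np.dot(d, d) or alpha < 1e-12:
                break
            alpha *= rho
        if np.linalg.norm(Hz) <= tol:
            return z, k
        lam = np.dot(Hz, x - z) / np.dot(Hz, Hz)  # hyperplane projection step
        x = proj(x - lam * Hz)
        Hx = H(x)
    return x, max_iter

# Example: H(x) = e^x - 1 componentwise on psi = R^n_+ (solution x* = 0)
x_sol, iters = df_projection_cg(lambda v: np.exp(v) - 1.0,
                                np.full(5, 1.0),
                                lambda v: np.maximum(v, 0.0))
```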
The numerical experiments for the five algorithms were carried out using eight (8) different initial points and twelve (12) test problems (T1–T12).
x1=(0.01,0.01,...,0.01)T, x2=(0.25,0.25,...,0.25)T, x3=(0.4,0.4,...,0.4)T, x4=(0.5,0.5,...,0.5)T, x5=(1.25,1.25,...,1.25)T, x6=(0.3,0.3,...,0.3)T, x7=(1,1,...,1)T, and x8=(0.1,0.1,...,0.1)T.
T1 [6]
H1(x)=ex1−1,
Hi(x)=exi+xi−1−1,i=2,3,4,…,n−1,
where ψ=Rn+.
T2 [5]
Hi(x)=log(xi+1)−xi/n, i=2,3,…,n−1,
where ψ={x∈Rn:n∑i=1xi≤n,xi>−1,i=1,2,3,…,n}.
T3 [37]
Hi(x)=2xi−sin|xi|, i=1,2,3,…,n,
where ψ=Rn+.
T4 [5]
Hi(x)=cosxi+xi−1, i=1,2,3,…,n,
where ψ=Rn+.
T5 [6]
Hi(x)=exi−1,i=1,2,3,…,n,
where ψ=Rn+.
T6 [37]
H1(x)=−2x1−x2+ex1−1,
Hi(x)=−xi−1+2xi−xi+1+exi−1,
Hn(x)=−xn−1+2xn+exn−1, i=2,3,4,…,n−1,
where ψ={x∈Rn:n∑i=1xi≤n,xi≥0,i=1,2,3,…,n}.
T7 [6]
H1(x)=x1−ecos(b(x1+x2)),
Hi(x)=xi−ecos(b(xi−1+xi+xi+1)), i=2,3,4,…,n−1,
Hn(x)=xn−ecos(b(xn−1+xn)),
b=1/(n+1), where ψ=Rn+.
T8 [37]
Hi(x)=xi−sin|xi−1|, i=1,2,…,n,
where ψ={x∈Rn:n∑i=1xi≤n,xi>−1,i=1,2,3,…,n}.
T9 [40]
Hi(x)=e^(xi^2)+(3/2)sin(2xi)−1, i=1,2,3,…,n,
where ψ=Rn+.
T10
H1(x)=cosx1−9+3x1+8ex2,
Hi(x)=cosxi−9+3xi+8exi−1, i=2,3,4,…,n,
where ψ=Rn+.
T11 Modification of T1 in [41]
H1(x)=esinx1−1,
Hi(x)=esinxi+xi−1−1,i=2,3,4,…,n−1,
where ψ=Rn+.
T12 Modification of T3 in [41]
Hi(x)=3xi−sinxi, i=1,2,3,…,n,
where ψ=Rn+.
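A few of the test mappings above are straightforward to implement. The sketch below codes T1, T3, and T5 (for T1, the unstated last component is assumed to follow the same pattern as the middle ones); each mapping vanishes at x* = 0.

```python
import numpy as np

def T1(x):
    # T1: H1 = e^{x1} - 1; Hi = e^{xi} + x_{i-1} - 1 for i >= 2
    # (last component assumed to follow the same pattern).
    H = np.exp(x) - 1.0
    H[1:] += x[:-1]
    return H

def T3(x):
    # T3: Hi = 2*xi - sin|xi|
    return 2.0 * x - np.sin(np.abs(x))

def T5(x):
    # T5: Hi = e^{xi} - 1
    return np.exp(x) - 1.0

x_star = np.zeros(8)   # x* = 0 solves all three mappings
```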
Tables 1–5 display the performance of the proposed algorithm (UMCD) in comparison with recent CG algorithms: MDDYM [37] and DTCG1 [6]. In the tables, the terms *IP*, *Iter*, and *Fval* refer to the initial point, number of iterations, and function evaluations, respectively. Additionally, *Time(s)* and ‖Hk‖ represent CPU time and the norm of residual functions, respectively.
To ensure fairness, the five algorithms were run with the same initial starting points and dimensions. As observed from the tables, the UMCD algorithm solves all the test functions with the fewest number of iterations, CPU time, and function evaluations compared to the DTCG1 and MDDYM algorithms. Based on these results, the UMCD algorithm outperformed both MDDYM and DTCG1.
Table 6 presents the numerical comparison of the proposed algorithm with the existing MCHCG and ZANG algorithms on Problems 11 and 12. The table shows a significant performance advantage of the proposed algorithm over the MCHCG and ZANG algorithms. In view of the above, the proposed method is considered the optimal choice due to its excellence in the number of iterations and overall CPU time efficiency.
Furthermore, using the Dolan and Moré performance profile [42], Figures 1–3 were plotted using the data from Tables 1–5. Each figure illustrates the percentage P(τ) of problems for which a specific method performs within a factor τ of the best result achieved. Figure 1 compares the performance of the proposed algorithm (UMCD) with that of the MDDYM and DTCG1 algorithms based on the number of iterations. Figures 2 and 3 compare the performances of the three algorithms based on CPU time and function evaluations, respectively. The graphs clearly show that the proposed algorithm (UMCD) outperforms the other two algorithms across all three performance metrics. Moreover, Figures 4–6 were plotted using the data from Table 6, and all of them demonstrate the efficiency of the proposed algorithm over the compared algorithms. While the performance of the compared algorithms is acceptable, the UMCD algorithm is the best choice when considering overall performance.
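The Dolan and Moré profile used for these figures can be computed directly from the raw cost tables. A minimal sketch (failures, marked '–' in the tables, are encoded as np.inf):

```python
import numpy as np

def performance_profile(T, taus):
    # T: (n_problems, n_solvers) array of costs (iterations, time, ...);
    # np.inf marks a failure. Returns P(tau) for each solver: the fraction
    # of problems solved within a factor tau of the best solver.
    best = T.min(axis=1, keepdims=True)   # best cost per problem
    ratios = T / best                     # performance ratios r_{p,s}
    return np.array([[np.mean(ratios[:, s] <= tau)
                      for s in range(T.shape[1])]
                     for tau in taus])

# Toy table: 3 problems, 2 solvers; solver 0 fails on problem 3.
P = performance_profile(np.array([[1.0, 2.0],
                                  [2.0, 2.0],
                                  [np.inf, 3.0]]), [1.0, 2.0])
```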
4.1. Application of UMCD algorithm to signal recovery
Among the applications of the proposed algorithm is signal and image recovery in compressive sensing (CS). Compressed sensing, or compressive sensing, is concerned with the reconstruction of sparse signals from incoherent measurements. It relies on the observation that many real-world signals have a sparse representation in a suitable transform domain: a few coefficients are significant, while the remaining ones are negligible or zero. It replaces the traditional sampling technique, reducing the number of samples required for reliable signal sampling and reconstruction. The technique depends mainly on mathematical algorithms, especially derivative-free algorithms that require little storage; CG algorithms are therefore a good alternative in this respect. CS is applied in biomedical engineering, medical sciences, and various other fields of science and engineering [43,44]. It uses the following linear equation to recover sparse signals:
where x∈Rn, ϕ∈Rm, and N∈R^(m×n) (m≪n) is a linear operator. A popular approach for finding the sparse solution is to minimize the relation below
where τ is a positive regularization parameter. Equation (4.2) is non-smooth due to the presence of the ℓ1-norm; therefore, gradient-based algorithms cannot be applied to it directly. The gradient projection for sparse reconstruction (GPSR) method, proposed by Figueiredo et al. [45], is among the best approaches for solving (4.2). In [45], problem (4.2) is written as
where ui=(xi)+ and vi=(−xi)+ for i=1,2,...,n, and (⋅)+ denotes the positive-part operator, defined as (x)+=max{0,x}. Using the definition of the ℓ1-norm, ‖x‖1=eTnu+eTnv with en=(1,1,...,1)T∈Rn, and problem (4.2) becomes
Figueiredo et al. [45] demonstrate how to express problem (4.4) as a bound-constrained quadratic programming problem:
where z=(u; v), c=τe2n+(−b; b), b=NTϕ, and H=(NTN, −NTN; −NTN, NTN). The matrix H is positive semi-definite. Equation (4.5) can be transformed into a linear variational inequality (LVI) as follows:
The mapping in (4.5) is a vector-valued function; it was shown to be Lipschitz continuous and monotone in Lemma 3 of [46] and Lemma 2.2 of [47]. Hence, our proposed UMCD algorithm is a good candidate for solving it. A numerical comparison with the CHCG algorithm is carried out on signal reconstruction and recovery. The signal size is taken as n=2^12 with k=2^10. The quality of the restoration process is assessed by the mean squared error (MSE), measured by
where x̂ refers to the original signal, and x̃ is the recovered signal. The experiment starts by initializing x0=NTϕ, conducting ten (10) trials, and recording the data of interest for each trial. It stops when |fk−fk−1|/|fk−1|<10−5, where f(x)=(1/2)‖Nx−ϕ‖2^2+τ‖x‖1 is the merit function. The following information is recorded:
As reported in Figures 7 and 8 and Table 7, both algorithms recovered the signals with a reasonable degree of accuracy. However, considering the number of iterations (ITER) and CPU time (TIME) reported in Table 7, the UMCD algorithm outperformed the CHCG algorithm.
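A minimal sketch of the quantities used in this experiment: the QP data of (4.5) (taking the quadratic block to be NTN, reading the B of the text as N), the merit function f, the MSE measure, and the relative-decrease stopping rule. All names are illustrative.

```python
import numpy as np

def gpsr_data(N, phi, tau):
    # Data of the bound-constrained QP (4.5): minimize
    # 0.5 z^T Q z + c^T z subject to z >= 0, with z = (u; v), x = u - v.
    n = N.shape[1]
    b = N.T @ phi
    G = N.T @ N
    Q = np.block([[G, -G], [-G, G]])              # positive semi-definite
    c = tau * np.ones(2 * n) + np.concatenate([-b, b])
    return Q, c

def merit(N, phi, tau, x):
    # Merit function f(x) = 0.5*||N x - phi||_2^2 + tau*||x||_1
    r = N @ x - phi
    return 0.5 * r @ r + tau * np.abs(x).sum()

def mse(x_orig, x_rec):
    # Mean squared error between original and recovered signals
    return np.mean((x_orig - x_rec) ** 2)

def should_stop(f_k, f_prev, tol=1e-5):
    # Relative-decrease stopping rule |f_k - f_prev| / |f_prev| < tol
    return abs(f_k - f_prev) / abs(f_prev) < tol
```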
5. Conclusions
This paper presented an improved convex constrained conjugate gradient descent method for nonlinear monotone equations with signal recovery applications. The proposed scheme satisfies the descent condition, as shown by an eigenvalue analysis of the search direction matrix. Unlike the methods presented by Yu and Guan [33] and Yu et al. [34], which proved global convergence only for the interval b∈(1/4,∞), the proposed method extends the interval to bk∈(0,∞), and global convergence is proved. Furthermore, the numerical performance of the proposed algorithm, in comparison with the MDDYM (2021) [37], DTCG1 [6], ZANG [38], and MCHCG [39] algorithms, confirms its efficiency. Being derivative-free, the proposed algorithm is applied to compressive sensing for signal recovery problems. To measure its capacity for signal recovery, the experiment is run alongside the CHCG algorithm, a well-known algorithm for signal recovery. The recorded results indicate the efficiency of the proposed algorithm.
Author contributions
H. A. conceptualized the study, developed the methodology, conducted the theoretical analysis, and drafted the initial manuscript. A. K. A. and M. Y. W. supervised the research, validated the results, refined the proofs, and critically reviewed the mathematical framework. A. S. H. and K. A. refined the proofs, implemented the algorithm, conducted the computational experiments, and analyzed the data. I. A. R. M. conducted the literature review, contributed to the manuscript draft, and acquired funding. S. M. I. revised the manuscript, interpreted the results, and edited the manuscript. Y. B. M. provided insights into signal recovery applications and refined the results. E. M. N. verified computational accuracy and assisted with proofreading, formatting, and compliance with journal requirements. All authors read and approved the final manuscript.
Use of Generative-AI tools declaration
The authors declare that they have not used Artificial Intelligence (AI) tools in the creation of this article.
Acknowledgments
The authors are grateful to the Research Management Centre (RMC), Universiti Utara Malaysia, through the KHAS grant with S/O code 21777, and to the Information Systems and Technology Department of the Kuwait Technical College, Kuwait, for their financial support.
This research is sponsored by the Institutional Based Research (IBR) of the Tertiary Education Trust Fund (TETFund), Sule Lamido University Kafin Hausa, Nigeria.
Conflict of interest
The authors declare there is no conflict of interest.