1. Introduction
1.1. Background overview
In numerous practical applications, such as neural network optimization [1,2], image segmentation [3], signal processing [4,5,6,7], matrix equations [8,9], and chemical equilibrium analysis [10], nonlinear systems of equations with convex constraints play an essential role. In these applications, the goal is typically to find solutions that satisfy the convex constraints. Therefore, research into efficient numerical methods for solving nonlinear systems of equations with convex constraints is not only of significant theoretical interest but also of considerable practical importance. In this paper, we focus on nonlinear systems of equations with convex constraints, which can be formulated as follows:

h(x) = 0, \quad x \in H, \qquad (1.1)

where H \subseteq \mathbb{R}^n is a non-empty closed convex set, and h: \mathbb{R}^n \to \mathbb{R}^n is a continuous monotone mapping that is not necessarily smooth. Throughout this paper, \| \cdot \| denotes the Euclidean vector norm.
At present, numerical iterative methods for solving nonlinear systems of equations can be broadly categorized into two major types. The first type includes methods that rely on the Jacobian matrix or its approximations, such as Newton's methods [11,12], quasi-Newton methods [13], trust-region methods [14], and Levenberg–Marquardt methods [15,16]. These methods are known for rapid local convergence, especially when an appropriate initial point is selected. However, their effectiveness depends on computing and storing the Jacobian matrix or an approximation of it. This dependency can lead to algorithm failure or inefficiency, particularly when the Jacobian matrix is difficult to obtain or when the nonlinear system exhibits non-smooth characteristics. The second type includes methods that do not rely on the Jacobian matrix or its approximations, such as conjugate gradient (CG) methods [17,18,19], spectral gradient methods [20], and other first-order gradient-based methods [21]. These methods are structurally simple, require only first-order information, and involve no matrix storage, which makes them particularly well-suited for solving large-scale nonlinear systems of equations. Due to their simplicity and scalability, these methods have gained popularity among researchers, especially in applications where computational resources are limited or where the systems are too large for Jacobian-based methods to be practical.
In recent years, the CG method has garnered significant attention from researchers in the field of optimization, becoming a focal point of study. As an efficient iterative method for solving linear systems of equations and unconstrained optimization problems, the CG method has demonstrated notable advantages, particularly in handling large-scale problems. Its versatility has been further enhanced through combination with projection techniques, leading to the development of CG projection methods. For example, Liu et al. [22] proposed a new derivative-free spectral Dai–Yuan (DY) type projection method for solving the problem (1.1). Under reasonable assumptions, their study not only established the global convergence of the proposed method but also demonstrated its linear convergence rate. In addition, Hu et al. [23] developed a modified Hestenes–Stiefel (HS) type CG projection method, which integrates the steepest descent approach with the traditional CG method and was applied to image restoration problems. Ma et al. [24] designed a modified inertial three-term CG projection method for solving the problem (1.1), which incorporates an inertial extrapolation step into the search direction and was applied to sparse signal and image restoration problems.
The CG projection method updates its search direction using the most recent gradient information and the previous search direction. While this approach generally performs well, it can suffer from slow or unstable convergence in certain situations. To address these issues, the three-term CG projection method introduces a new way of updating the search direction, which incorporates not only the current gradient information but also information from previous iterations. The key idea is to leverage a richer set of historical information to enhance the search direction, with the aim of accelerating convergence and improving solution accuracy. Without loss of generality, the iterative formula for the three-term CG projection method is given by x_{k+1} = x_k + \alpha_k d_k , where \alpha_k is the step size determined by a line search mechanism. The search direction d_k is defined as:
where \beta_k is a conjugate parameter, \theta_k is a scalar parameter, h_k \triangleq h(x_k) , and y_{k-1} = h_k - h_{k-1} . The parameters \beta_k and \theta_k play a critical role in the performance of the CG projection method, influencing the algorithm's convergence behavior. Traditional CG methods such as Polak–Ribière–Polyak (PRP), Hestenes–Stiefel (HS), and Liu–Storey (LS) are well known in the field, with the conjugate parameter expressions given below.
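In the derivative-free setting of this paper, where the gradient is replaced by h_k , these classical parameters take the following standard form (restated here for reference; their denominators are precisely the quantities combined in the safeguard \delta_k introduced in Subsection 1.2):

\beta_k^{PRP} = \frac{h_k^\text{T} y_{k-1}}{\|h_{k-1}\|^2}, \qquad \beta_k^{HS} = \frac{h_k^\text{T} y_{k-1}}{d_{k-1}^\text{T} y_{k-1}}, \qquad \beta_k^{LS} = \frac{h_k^\text{T} y_{k-1}}{-h_{k-1}^\text{T} d_{k-1}}.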
Given the outstanding numerical performance of the PRP and HS CG methods, Yin et al. [25] introduced a hybrid three-term CG projection method for solving the problem (1.1), with a particular focus on compressed sensing problems. Additionally, Gao et al. [26] designed an adaptive three-term CG search direction and validated the effectiveness of the proposed method through extensive numerical experiments. Their results demonstrated that the adaptive approach is highly competitive, particularly in solving sparse signal problems.
1.2. Formulation of search direction
To provide a smoother and more coherent introduction to the ideas and algorithms presented in this paper, we begin by reviewing the relevant contributions of the works [27,28]. Existing three-term CG methods, originally developed for unconstrained optimization problems, have demonstrated remarkable efficiency and wide applicability in that setting. Building on the well-established HS CG method, Li et al. [27] presented a three-term HS-type CG method for solving unconstrained optimization problems. Their conjugate and scalar parameters are defined as follows:
where t_k = \min\{\hat{t}, \max\{0, y_{k-1}^\text{T}(y_{k-1} - s_{k-1})/\|y_{k-1}\|^2\}\} , s_{k-1} = x_k - x_{k-1} , and 0 < \hat{t} < 1 . In addition, Li et al. [28] proposed a three-term PRP-type CG method, also aimed at unconstrained optimization problems. In their approach, the corresponding conjugate and scalar parameters are modified by substituting \|h_{k-1}\|^2 with d_{k-1}^\text{T} y_{k-1} , yielding the following parameters:
By analyzing the denominators in (1.2), (1.3), and (1.4), we introduce a new notation \delta_k = \mu \|d_{k-1}\| \|y_{k-1}\| + \max\{\|h_{k-1}\|^2, d_{k-1}^\text{T} y_{k-1}, -h_{k-1}^\text{T} d_{k-1}\} with \mu > 0 . Together with the construction forms of (1.3) and (1.4), we design a hybrid modified PRP-HS-LS-type search direction as follows:
where \beta_k^{MPHL} = h_k^\text{T} y_{k-1}/\delta_k - \|y_{k-1}\|^2 h_k^\text{T} d_{k-1}/\delta_k^2 and \theta_k^{MPHL} = t_k h_k^\text{T} d_{k-1}/\delta_k . This hybrid construction ensures that the conjugate parameter retains the desirable properties of the PRP, HS, and LS parameters. Additionally, the resulting search direction satisfies the sufficient descent and trust region properties without relying on any line search mechanism.
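To make the construction concrete, the following minimal Python sketch assembles \delta_k , t_k , \beta_k^{MPHL} , and \theta_k^{MPHL} from the definitions above. It assumes the standard three-term form d_k = -h_k + \beta_k^{MPHL} d_{k-1} + \theta_k^{MPHL} y_{k-1} (with d_0 = -h_0 ), which is consistent with the descent bound of Lemma 1 but is stated here as an assumption rather than a verbatim reproduction of (1.5); the guard added to the t_k denominator and the default parameter values are ours.

```python
import numpy as np

def mphl_direction(h_k, h_prev, d_prev, x_k, x_prev, mu=2.0, t_hat=0.99):
    """Sketch of the hybrid MPHL search direction of Subsection 1.2.

    Assumes d_k = -h_k + beta * d_{k-1} + theta * y_{k-1}; the paper's
    display (1.5) may differ in details.  Requires mu > 0 and 0 < t_hat < 1.
    """
    y = h_k - h_prev                       # y_{k-1} = h_k - h_{k-1}
    s = x_k - x_prev                       # s_{k-1} = x_k - x_{k-1}
    # Safeguarded denominator delta_k = mu*||d||*||y|| + max{||h||^2, d^T y, -h^T d}.
    delta = mu * np.linalg.norm(d_prev) * np.linalg.norm(y) + max(
        np.dot(h_prev, h_prev), np.dot(d_prev, y), -np.dot(h_prev, d_prev))
    # t_k = min{t_hat, max{0, y^T (y - s) / ||y||^2}} (small guard against ||y|| = 0).
    t_k = min(t_hat, max(0.0, np.dot(y, y - s) / max(np.dot(y, y), 1e-30)))
    beta = np.dot(h_k, y) / delta - np.dot(y, y) * np.dot(h_k, d_prev) / delta ** 2
    theta = t_k * np.dot(h_k, d_prev) / delta
    return -h_k + beta * d_prev + theta * y
```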
2. Description of the proposed algorithm
In this section, we will elaborate on the line search mechanism, the projection operator, and the detailed procedure of the proposed algorithm.
First, the line search mechanism [7,29] is employed to determine a suitable step size that ensures progress towards a solution. Specifically, the trial point z_k = x_k + \alpha_k d_k is evaluated, where the step size \alpha_k = \beta \rho^{i_k} . Here, \beta > 0 is a predetermined scaling factor that initializes the search, and \rho \in (0, 1) is a contraction factor that shrinks the step size iteratively. The integer i_k is the smallest nonnegative integer i such that the following condition holds:
where the parameter \sigma \in (0, 1) determines the strength of the sufficient decrease condition.
Second, the projection operator [7,24] is a fundamental tool in the design and theoretical analysis of various algorithms, particularly those addressing convex constrained optimization problems. The projection operator P_H[\cdot] , which maps a point of \mathbb{R}^n onto the non-empty closed convex set H , is defined as P_H[x] = \arg\min\{\|y - x\| : y \in H\} .
This operator plays a critical role in maintaining feasibility within the constraint set H . Moreover, P_H[\cdot] possesses the non-expansive property, i.e., \|P_H[x] - P_H[y]\| \leq \|x - y\| for all x, y \in \mathbb{R}^n .
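As a concrete illustration (not part of the paper's algorithmic description), the projection onto the non-negative orthant H = \mathbb{R}^n_+ , which is the constraint set of most test problems in Section 4, reduces to a componentwise truncation:

```python
import numpy as np

def project_nonneg(x):
    """Projection onto H = R^n_+ : componentwise maximum with zero."""
    return np.maximum(x, 0.0)

# Illustrative check of the non-expansive property on random points.
rng = np.random.default_rng(0)
x, y = rng.normal(size=5), rng.normal(size=5)
assert np.linalg.norm(project_nonneg(x) - project_nonneg(y)) <= np.linalg.norm(x - y)
```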
Finally, based on the concepts discussed above, we propose a hybrid modified PRP-HS-LS-type CG projection algorithm (Abbr. Algorithm MPHL). The step-by-step procedure of the proposed algorithm for solving the problem (1.1) is outlined below.
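The following Python skeleton sketches the overall iteration under stated assumptions: the hybrid search direction of Subsection 1.2 (passed in as direction, e.g., the mphl_direction sketch above), a backtracking line search \alpha_k = \beta \rho^{i} with a generic sufficient-decrease test standing in for condition (2.1), and the hyperplane-projection update x_{k+1} = P_H[x_k - \gamma \chi_k h(z_k)] with \chi_k = h(z_k)^\text{T}(x_k - z_k)/\|h(z_k)\|^2 , which is standard in this literature (compare the use of \chi_k in the proof of Lemma 2). The exact steps of Algorithm MPHL may differ.

```python
import numpy as np

def mphl_solver(h, project, direction, x0, beta=1.0, rho=0.74, sigma=1e-4,
                gamma=1.3, eps=1e-6, max_iter=2000):
    """Skeleton of a derivative-free CG projection method for h(x) = 0, x in H.

    `h` is the monotone mapping, `project` the projection onto H, and
    `direction(h_k, h_prev, d_prev, x_k, x_prev)` returns the search direction.
    The acceptance test below is a generic stand-in for condition (2.1).
    """
    x = np.asarray(x0, dtype=float).copy()
    hx = h(x)
    d = -hx
    for _ in range(max_iter):
        if np.linalg.norm(hx) <= eps:
            break
        # Backtracking line search: alpha = beta * rho**i, i = 0, 1, 2, ...
        alpha = beta
        while True:
            z = x + alpha * d
            hz = h(z)
            if -np.dot(hz, d) >= sigma * alpha * np.linalg.norm(d) ** 2:
                break
            alpha *= rho
        if np.linalg.norm(hz) <= eps:      # the trial point already solves the system
            return z
        # Hyperplane projection step (standard in this literature).
        chi = np.dot(hz, x - z) / np.dot(hz, hz)
        x_prev, h_prev, d_prev = x, hx, d
        x = project(x - gamma * chi * hz)
        hx = h(x)
        d = direction(hx, h_prev, d_prev, x, x_prev)
    return x
```

For the test problems of Section 4 with H = \mathbb{R}^n_+ , one would pass, for instance, project = lambda x: np.maximum(x, 0.0) and the mphl_direction sketch as direction.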
3. Analysis of algorithmic properties
3.1. Properties of search direction
The following lemma demonstrates that the search direction d_k satisfies two key properties: the sufficient descent condition and the trust region property.
Lemma 1. If Algorithm MPHL generates a sequence \{d_k\} , then the search direction d_k satisfies the following conditions:

h_k^\text{T} d_k \leq -a_1 \|h_k\|^2, \qquad (3.1)

a_1 \|h_k\| \leq \|d_k\| \leq a_2 \|h_k\|, \qquad (3.2)
where a_1 = 1 - (1 + \hat{t})^2/4 and a_2 = 1 + 1/\mu + 1/\mu^2 + \hat{t}/\mu .
Proof. For k = 0 , (1.5) yields that (3.1) and (3.2) are obviously satisfied. For k \geq 1 , multiplying both sides of (1.5) by h_k^\text{T} , we obtain:
where the inequality follows from the relation 2a^\text{T} b \leq \|a\|^2 + \|b\|^2 . This shows that (3.1) holds.
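For completeness, a short worked version of this estimate, assuming the three-term form d_k = -h_k + \beta_k^{MPHL} d_{k-1} + \theta_k^{MPHL} y_{k-1} discussed in Subsection 1.2, reads:

h_k^\text{T} d_k = -\|h_k\|^2 + (1 + t_k)\frac{(h_k^\text{T} y_{k-1})(h_k^\text{T} d_{k-1})}{\delta_k} - \frac{\|y_{k-1}\|^2 (h_k^\text{T} d_{k-1})^2}{\delta_k^2} \leq -\|h_k\|^2 + \frac{(1 + t_k)^2}{4}\|h_k\|^2 \leq -a_1 \|h_k\|^2,

where the first inequality uses the Cauchy–Schwarz inequality together with 2ab \leq a^2 + b^2 applied to a = \frac{1 + t_k}{2}\|h_k\| and b = \|y_{k-1}\| |h_k^\text{T} d_{k-1}|/\delta_k , and the last inequality uses t_k \leq \hat{t} .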
Next, using the Cauchy–Schwarz inequality, (3.1) can be simplified to obtain \|d_k\| \geq a_1 \|h_k\| . From the definition of \beta_k^{MPHL} , we have:
In addition, from the definition of \theta_k^{MPHL} , we have:
Together with the above inequalities, (1.5) yields:
This shows that (3.2) holds. □
3.2. Properties of convergence
To systematically analyze the global convergence of Algorithm MPHL, we first introduce some general assumptions. Throughout the analysis, we assume that the sequences \{x_k\} and \{z_k\} generated by Algorithm MPHL are infinite; otherwise, the last iterate is a solution of the problem (1.1). The assumptions are as follows:
(S1) The solution set of the problem (1.1) is non-empty, which ensures that the problem is well-posed and that the existence of solutions is guaranteed.
(S2) The function h(x) is monotone, meaning that (h(x) - h(y))^\text{T}(x - y) \geq 0 holds for any x, y \in \mathbb{R}^n .
The following lemma plays a pivotal role in establishing the global convergence of Algorithm MPHL, providing the foundation for the subsequent analysis and results.
Lemma 2. Let the sequences \{x_k\} , \{z_k\} , and \{d_k\} be generated by Algorithm MPHL. Then the following statements are true:
(i) There exists a step size αk such that the line search mechanism (2.1) is satisfied for all k;
(ii) The sequence \{x_k\} is bounded;
(iii) As the iteration number k approaches infinity, we have \lim\limits_{k \to \infty} \alpha_k \|d_k\| = 0 , meaning that the product of the step size \alpha_k and the norm of the search direction d_k tends to zero.
Proof. First, we aim to prove that statement (i) holds, and we proceed by contradiction. Assume that the line search mechanism (2.1) does not hold. In other words, there exists an iteration index \tilde{k} such that the following inequality is satisfied for any integer i > 0 :
Since 0 < \rho < 1 , the step length \beta \rho^i tends to zero as i \to \infty , so the trial points x_{\tilde{k}} + \beta \rho^i d_{\tilde{k}} converge to x_{\tilde{k}} . By the continuity of h , it follows that h(x_{\tilde{k}} + \beta \rho^i d_{\tilde{k}}) \to h(x_{\tilde{k}}) . Taking the limit as i \to \infty , we obtain the following expression:
On the other hand, from a previously established result given by (3.1), we know that:
This contradicts the limit obtained above. Therefore, we conclude that statement (i) is indeed valid.
Next, we aim to prove that statement (ii) holds. To facilitate the analysis, we begin with two key inequalities. From Assumption (S2), we have the following:
where x^* denotes a solution of the problem (1.1), i.e., h(x^*) = 0 . From the definition of z_k and the condition given by (2.1), we have:
Now, by utilizing the projection operator P_H[\cdot] , the definition of \chi_k , and combining (3.3) and (3.4), we can derive the following inequality:
where the equality follows from expanding the squared norm. The inequality (3.5) shows that the sequence \{\|x_k - x^*\|\} is non-increasing and hence convergent, which implies that the sequence \{x_k\} is bounded. Therefore, we conclude that statement (ii) is indeed valid.
Finally, we proceed to prove that statement (iii) holds. Starting with the inequality in (3.5), we reorganize it to derive the following expression:
This inequality implies that the series \sum\limits_{k = 0}^{\infty} \|x_k - z_k\|^4 is convergent, and hence its terms tend to zero, i.e., \lim\limits_{k \to \infty} \|x_k - z_k\|^4 = 0 , which in turn gives \lim\limits_{k \to \infty} \|x_k - z_k\| = 0 . From the definition of z_k , we know that z_k = x_k + \alpha_k d_k , so \|x_k - z_k\| = \alpha_k \|d_k\| . Hence, \lim\limits_{k \to \infty} \alpha_k \|d_k\| = 0 . □
The global convergence properties of Algorithm MPHL are outlined in the following theorem.
Theorem 1. Let the sequence \{h_k\} be generated by Algorithm MPHL. Then, the sequence satisfies the following convergence property: \lim\limits_{k \to \infty} \|h_k\| = 0 .
Proof. We proceed by contradiction. Assume that there exists a positive constant \vartheta such that \|h_k\| > \vartheta holds for all k \geq 0 . This assumption, combined with (3.2), implies that \|d_k\| \geq a_1 \vartheta . Moreover, due to the continuity of the function h(x) and the boundedness of the sequence \{x_k\} , it follows that the sequence \{h_k\} is also bounded, i.e., there exists another positive constant \iota such that \|h_k\| \leq \iota . Utilizing this, together with (3.2), we deduce that \|d_k\| \leq a_2 \iota . Consequently, the sequence \{d_k\} is bounded. Since \|d_k\| \geq a_1 \vartheta > 0 , Lemma 2 (iii) implies that \lim\limits_{k \to \infty} \alpha_k = 0 .
Since both sequences \{x_k\} and \{d_k\} are bounded, there exists an infinite index set \Gamma such that:
Next, we consider the line search mechanism (2.1), which gives the inequality:
Taking the limit as j \to \infty , we arrive at -h(\acute{x})^\text{T} \acute{d} \leq 0 . Furthermore, from the inequality derived in (3.1), we have:
Taking the limit as j \to \infty , we conclude that -h(\acute{x})^\text{T} \acute{d} \geq a_1 \|h(\acute{x})\| > 0 . This directly contradicts the earlier result that -h(\acute{x})^\text{T} \acute{d} \leq 0 . Therefore, the assumption that \|h_k\| > \vartheta for all k \geq 0 must be false, and we conclude that \lim\limits_{k \to \infty} \|h_k\| = 0 . □
4. Numerical experiment for constrained nonlinear systems of equations
4.1. General setups
This section focuses on applying the Algorithm MPHL to a set of widely recognized test problems and comparing its performance with three existing algorithms: Algorithm PSGM [22], Algorithm PDY [29], and Algorithm PCG [30]. The experiments were conducted on a system running Ubuntu 20.04.2 LTS 64-bit, powered by an Intel(R) Xeon(R) Gold 5115 2.40GHz CPU. This standardized computational environment ensures the reliability and consistency of the results, enabling a fair comparison between the algorithms.
The parameter settings for Algorithm PSGM, Algorithm PDY, and Algorithm PCG are configured according to their respective references. For Algorithm MPHL, the parameters are set as follows: \beta = 1 , \rho = 0.74 , \sigma = 10^{-4} , \gamma = 1.3 , \hat{t} = 10^3 , \mu = 2 , \epsilon = 10^{-6} . The test problems are formulated in the form h(x) = (h_1(x), h_2(x), \ldots, h_n(x))^\text{T} with the variable x = (x_1, x_2, \ldots, x_n) . For each test problem, the dimension n is set to 10,000, 50,000, 100,000, 150,000, and 200,000. The initial points are selected as follows: x_1 = (1, 1, \ldots, 1)^\text{T} , x_2 = (0.1, 0.1, \ldots, 0.1)^\text{T} , x_3 = (\frac{1}{2}, \frac{1}{2^2}, \ldots, \frac{1}{2^{n}})^\text{T} , x_4 = (2, 2, \ldots, 2)^\text{T} , x_5 = (1, \frac{1}{2}, \ldots, \frac{1}{n})^\text{T} , x_6 = (\frac{1}{n}, \frac{2}{n}, \ldots, \frac{n}{n})^\text{T} , and x_7 = (\frac{n-1}{n}, \frac{n-2}{n}, \ldots, \frac{n-n}{n})^\text{T} . Each algorithm is terminated when \|h_k\| \leq \epsilon or the number of iterations exceeds 2000. The following test problems are considered:
Problem 1. Set
and the constraint set H = \mathbb{R}^n_+ .
Problem 2. Set
and the constraint set H = \{x\in\mathbb{R}^n : \sum\limits_{i = 1}^n x_i \leq n, \; x_i > -1, \; i = 1, 2, \ldots, n \} .
Problem 3. Set
and the constraint set H = \mathbb{R}^n_+ .
Problem 4. Set
and the constraint set H = \mathbb{R}^n_+ .
Problem 5. Set
and the constraint set H = \mathbb{R}^n_+ .
Problem 6. Set
and the constraint set H = \mathbb{R}^n_+ .
Problem 7. Set
and the constraint set H = \mathbb{R}^n_+ .
4.2. Numerical reports
The numerical results of the test problems, solved by the four algorithms, are presented in Tables 1–7. In these tables, "Init( n )" denotes the initial point and the dimension n in thousands, "CPUT" denotes the CPU time in seconds, "NF" denotes the number of function evaluations, and "NIter" denotes the number of iterations. The results show that all four algorithms successfully solved the test problems under different initial points and dimensions. To visually assess the performance of the four algorithms, we employed the performance profiles technique [31] to plot performance curves based on CPUT, NF, and NIter. In these plots, the higher the curve, the better the corresponding algorithm performs. As shown in Figures 1–3, Algorithm MPHL outperformed the others in a significant portion of the test cases, winning approximately 52%, 51%, and 48% of the test problems in terms of CPUT, NF, and NIter, respectively. The performance curves of Algorithm MPHL generally lie above those of Algorithm PSGM, Algorithm PDY, and Algorithm PCG, indicating its superior efficiency and effectiveness in solving these test problems.
5. Numerical experiment for sparse signal restoration
5.1. General setups
A key challenge in compressed sensing is recovering sparse signals from under-determined systems of linear equations [32]. The objective is to identify sparse solutions that satisfy a specific set of linear constraints. We consider the following under-determined system of linear equations:
where \theta is an observed signal, \Theta\in R^{m\times n} is a linear mapping, and \tilde{x} is an original sparse signal. It is inherently difficult to restore the sparse signal \tilde{x} from the observed signal \theta due to the ill-posed nature of the linear system, which often lacks a unique solution. To tackle this challenge, regularization techniques are often employed, which impose additional constraints to promote sparsity in the solution. Specifically, the problem can be formulated as the minimization of a combined \ell_1 - \ell_2 norm:
where \varpi > 0 is the regularization parameter, and \|x\|_1 and \|x\|_2 represent the \ell_1 and \ell_2 norms, respectively. Clearly, the problem (5.1) is a convex, unconstrained optimization problem. This convexity is crucial as it ensures the existence of a global minimum, which can be efficiently identified using various optimization algorithms. The regularization parameter \varpi balances the trade-off between the sparsity of the solution and the fidelity to the observed signal.
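For completeness, the standard form of this model, which we assume coincides with the paper's problem (5.1) following the usual convention of [32,33], is

\min\limits_{x \in \mathbb{R}^n} \; \varpi \|x\|_1 + \frac{1}{2}\|\Theta x - \theta\|_2^2 .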
The process of solving the problem (5.1) begins with reformulating the problem into a convex quadratic form [33]. This involves expressing any x \in R^n as the difference of two non-negative vectors, such that x = u - v with u \geq 0 and v \geq 0 ,
where u_i = (x_i)_{+} and v_i = (-x_i)_{+} for all i = 1, 2, \dots, n , with (\cdot)_{+} denoting \max\{0, \cdot\} . Together with the problem (5.1), this results in the following optimization problem:
where e_n = (1, 1, \dots, 1)^\text{T} \in R^n . For ease of notation, we define the following:
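The quantities typically introduced at this point in the GPSR-type reformulation [33] are restated below; they are given here as an assumed convention rather than a verbatim reproduction of the paper's display:

z = \begin{pmatrix} u \\ v \end{pmatrix} \in \mathbb{R}^{2n}, \qquad B = \begin{pmatrix} \Theta^\text{T}\Theta & -\Theta^\text{T}\Theta \\ -\Theta^\text{T}\Theta & \Theta^\text{T}\Theta \end{pmatrix}, \qquad c = \varpi e_{2n} + \begin{pmatrix} -\Theta^\text{T}\theta \\ \Theta^\text{T}\theta \end{pmatrix},

with which (5.2) can be written, up to an additive constant, as the bound-constrained quadratic program \min\limits_{z \geq 0} \frac{1}{2} z^\text{T} B z + c^\text{T} z , presumably the content of (5.3).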
According to these notations, the problem (5.2) can be reformulated as follows:
Additionally, according to the work [34], the problem (5.3) was further reformulated, and it was demonstrated to be equivalent to the following:
As a result, our proposed algorithm can be applied to sparse signal restoration.
To evaluate the performance of Algorithm MPHL, its numerical results are compared with those of Algorithm PSGM, Algorithm PDY, and Algorithm PCG. The context of this evaluation is sparse signal restoration, which involves recovering an original signal of length n from an observed signal of length m , where m < n . The performance of the algorithms is characterized using the following metrics: (i) Mean Squared Error (MSE); (ii) NIter; (iii) CPUT; (iv) objective function value (objFun).
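For reference, MSE is understood here in its usual form (our restatement, assumed to match the paper's definition):

\text{MSE} = \frac{1}{n}\|\hat{x} - \tilde{x}\|^2 ,

where \hat{x} denotes the recovered signal and \tilde{x} the original signal.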
5.2. Numerical reports
In this experiment, we consider two signal sizes: (n, m, k) = (2^{12}, 2^{10}, 2^{7}) and (n, m, k) = (2^{13}, 2^{11}, 2^{8}) . Figures 4 and 6 sequentially display the original signal, the observed signal, and the recovered signals obtained by each algorithm. The results indicate that all algorithms successfully recover the original signal. However, Algorithm MPHL achieves signal recovery with fewer iterations and shorter CPU time compared to the other algorithms. Further analysis reveals that as the signal size increases, Algorithm MPHL shows only a modest increase in the number of iterations and CPU time. Additionally, Figures 5 and 7 provide a comparative analysis of the convergence results for the four algorithms in terms of MSE, NIter, CPUT, and objFun. These figures illustrate that the MSE values and objective function values for all four algorithms ultimately approach zero, indicating that each algorithm can achieve high-quality restoration results. Notably, Algorithm MPHL demonstrates faster convergence, outperforming the other algorithms in terms of both efficiency and computational cost.
6. Conclusions
In this paper, we proposed a novel hybrid PRP-HS-LS-type conjugate gradient algorithm tailored for solving nonlinear systems of equations with convex constraints. The proposed algorithm incorporates a hybrid technique to design the conjugate parameter, combining the strengths of PRP, HS, and LS conjugate gradient methods to enhance performance and stability. The search direction, formulated by using the hybrid conjugate parameter, inherently satisfies the sufficient descent condition and trust region properties. Remarkably, this is achieved without the need for a line search mechanism. The global convergence of the proposed algorithm is rigorously established under general conditions. Notably, this proof does not rely on the often restrictive Lipschitz continuity assumption, broadening the applicability of the algorithm to a wider class of problems. Extensive numerical experiments demonstrate the algorithm's superior efficiency, particularly in solving large-scale nonlinear systems of equations with convex constraints. Additionally, the proposed algorithm proves highly effective in addressing sparse signal restoration problems. In the future, we will discuss the potential extension of our algorithm to complex variables.
Author contributions
Xuejie Ma: Conceptualization, Investigation, Writing–original draft, Writing–review and editing, Funding acquisition; Songhua Wang: Conceptualization, Funding acquisition, Writing–review and editing. All authors have read and approved the final version of the manuscript for publication.
Use of AI tools declaration
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
Acknowledgments
This work is supported by the Natural Science Foundation in Guangxi Province, PR China (grant number 2024GXNSFAA010478) and the Guangzhou Huashang College Daoshi Project (grant number 2024HSDS15).
Conflict of interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.