This research introduces min-max portfolio optimization models that incorporate transaction costs and focus on robust entropic value-at-risk. This study offers a unified approach to handle the distribution of random parameters that affect the reward and risk aspects. Utilizing the duality theorem, the study transforms the optimization models into manageable forms, thereby accommodating discrete, box, and ellipsoidal distributions of the underlying random variables. The impact of transaction costs on optimal portfolio selection is examined through numerical examples under a robust return-risk framework. The results underscore the importance of the proposed model in safeguarding capital and reducing exposure to extreme risks, thus outperforming other strategies documented in the literature. This demonstrates the model's effectiveness in balancing return maximization against loss minimization, making it a valuable tool for investors who seek to navigate uncertain financial markets.
1. Introduction
Systems of nonlinear equations are fundamental to a diverse range of applications, including power flow analysis [1], economic equilibrium modeling [2], the development of generalized Bregman distance proximal point methods [3], and traffic assignment [4]. These systems also arise in monotone variational inequalities [5,6] and compressed sensing problems [7,8]. Given the ubiquity and significance of such problems across these varied domains, the study and development of numerical methods to efficiently solve systems of nonlinear equations are of considerable practical importance. In this paper, we focus on a specific class of systems of nonlinear equations subject to convex constraints, which can be formulated as follows:
\[ \theta(a) = 0, \quad a \in \Theta, \tag{1.1} \]
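where \(\Theta \subseteq \mathbb{R}^n\) is a non-empty closed convex set. The function \(\theta: \mathbb{R}^n \to \mathbb{R}^n\) is assumed to be monotone and continuously differentiable, satisfying the following condition:
\[ \langle \theta(a) - \theta(b), a - b \rangle \geq 0, \quad \forall a, b \in \mathbb{R}^n. \]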
Generally, the gradient-type method generates a sequence {ak}, defined as follows:
\[ a_{k+1} = a_k + t_k d_k, \quad k = 0, 1, \ldots, \]
where tk is the step length, and dk denotes the search direction. The choice of the search direction dk gives rise to various gradient-type methods, such as steepest descent methods, Newton's methods, and quasi-Newton methods [9,10]. Newton's and quasi-Newton methods, along with their numerous variants, have been extensively studied due to their fast local convergence properties. For instance, Mahdavi et al. [11] proposed and analyzed a nonmonotone quasi-Newton algorithm for strongly convex multiobjective optimization, demonstrating its global convergence and local superlinear convergence rate under certain conditions. Sihwail et al. [12] proposed a novel hybrid method, Newton-Harris hawks optimization, which combines Newton's methods and Harris hawks optimization to effectively solve systems of nonlinear equations. Moreover, Krutikov et al. [13] demonstrated that quasi-Newton methods, when applied to strongly convex functions with a Lipschitz gradient, achieve geometric convergence without relying on local quadratic approximations. However, despite these advantages, Newton's and quasi-Newton methods require computing the Hessian matrix or an approximation of it at each iteration, which significantly increases the computational cost. This requirement can be a limiting factor, particularly for large-scale problems where the Hessian matrix is difficult to compute and store efficiently.
The conjugate gradient method [14,15] is one of the most effective gradient-type methods. It is widely recognized for its efficiency, simplicity, low storage requirements, and reliable convergence properties. These characteristics make it particularly well-suited for solving large-scale systems of nonlinear equations [16,17]. The method's search direction is typically defined as follows:
\[ d_k = \begin{cases} -\theta_k, & k = 0, \\ -\theta_k + \beta_k d_{k-1}, & k \geq 1, \end{cases} \]
where θk≜θ(ak) and βk is known as the conjugate parameter. The choice of βk differentiates various conjugate gradient methods. Several advancements have been made in the development of conjugate gradient methods with different conjugate parameters [18]. For instance, Ma et al. [19] proposed a modified inertial three-term conjugate gradient method for solving nonlinear monotone equations with convex constraints. This method is notable for its global and Q-linear convergence properties, and has demonstrated superior numerical performance in applications such as sparse signal recovery and image restoration in compressed sensing. Furthermore, Liu et al. [20] introduced a spectral conjugate gradient method with an inertial factor for solving nonlinear pseudo-monotone equations over a convex set. Additionally, Sabiu et al. [21] developed an optimal scaled Perry conjugate gradient method for solving large-scale systems of monotone nonlinear equations. This method ensures global convergence under the conditions of monotonicity and Lipschitz continuity.
Inspired by the classical Liu-Storey (LS) [22] and Rivai-Mohamad-Ismail-Leong (RMIL) [23] conjugate parameters, and incorporating the hybrid technique (e.g., [24,25]) and the projection approach, we develop a modified LS-RMIL-type conjugate gradient projection algorithm. The proposed algorithm is specifically designed for solving systems of nonlinear equations with convex constraints. In this paper, ‖⋅‖ and ⟨⋅,⋅⟩ represent the Euclidean norm and the inner product of vectors, respectively.
2. Algorithm design
2.1. Search direction and its properties
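In this section, we refine and enhance the search direction employed in the optimization processes of the LS and RMIL methods. Specifically, Liu et al. [22] and Rivai et al. [23] introduced conjugate parameters, defined respectively as:
\[ \beta_k^{LS} = -\frac{\langle \theta_k, \theta_k - \theta_{k-1} \rangle}{\langle \theta_{k-1}, d_{k-1} \rangle}, \qquad \beta_k^{RMIL} = \frac{\langle \theta_k, \theta_k - \theta_{k-1} \rangle}{\|d_{k-1}\|^2}. \]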
Based on the insights derived from these parameters, we adopt the hybrid technique (e.g., [24,25]) that combines their key features. This leads to the formulation of a new conjugate parameter, which is subsequently incorporated into the framework of a three-term search direction. Our primary objective is to construct a novel search direction that ensures both the sufficient descent and trust-region properties, which are critical for the robustness and efficiency of the optimization process. To accomplish this, we carefully design a novel search direction tailored to meet these requirements. The designed search direction dk is defined as follows:
where the conjugate parameter βk and the scalar parameter ϖk are given by
with yk−1=θk−θk−1. One scalar parameter ck is crucial for maintaining stability in the iterative process, and is defined by
where μ is a positive constant. Another scalar parameter νk is introduced to fine-tune the adjustment of the search direction. It is defined as \(\nu_k=\min\{\tilde{\nu},\max\{\bar{\nu}_k,0\}\}\) with \(0<\tilde{\nu}<1\) and \(\bar{\nu}_k=\frac{\langle\theta_k,\,y_{k-1}-s_{k-1}\rangle}{\|\theta_k\|^2}\), where \(s_{k-1}=a_k-a_{k-1}\) represents the difference between the iterative points \(a_k\) and \(a_{k-1}\) of the optimization variable.
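The clipping that defines νk can be sketched in a few lines. This is only an illustration: it reads ν̄k as the ratio ⟨θk, yk−1−sk−1⟩/‖θk‖², and the function names and the default ν̃ = 0.5 are assumptions made for the example, not values taken from the paper.

```python
from typing import Sequence


def inner(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean inner product <u, v>."""
    return sum(ui * vi for ui, vi in zip(u, v))


def nu_k(theta_k, theta_prev, a_k, a_prev, nu_tilde: float = 0.5) -> float:
    """Return nu_k = min{nu_tilde, max{nu_bar_k, 0}}.

    nu_bar_k = <theta_k, y_{k-1} - s_{k-1}> / ||theta_k||^2, with
    y_{k-1} = theta_k - theta_{k-1} and s_{k-1} = a_k - a_{k-1}.
    nu_tilde = 0.5 is an illustrative default (assumption).
    """
    sq_norm = inner(theta_k, theta_k)
    if sq_norm == 0.0:
        raise ValueError("theta_k is zero: the current point already solves the system")
    y = [tk - tp for tk, tp in zip(theta_k, theta_prev)]  # y_{k-1}
    s = [ak - ap for ak, ap in zip(a_k, a_prev)]          # s_{k-1}
    nu_bar = inner(theta_k, [yi - si for yi, si in zip(y, s)]) / sq_norm
    # Clip nu_bar into the interval [0, nu_tilde].
    return min(nu_tilde, max(nu_bar, 0.0))
```

By construction the returned value always lies in [0, ν̃], which is the only property of νk that the bounds in Lemmas 1 and 2 rely on.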
Before delving into the sufficient descent and trust-region properties of the designed search direction (2.1), we can deduce some important bounds from the definition of βk and ϖk. We consider the bound for βk:
which can be further bounded by
Simplifying this, we obtain
Next, we consider the bound for ϖk:
Lemma 1. The search direction dk generated by (2.1) satisfies the sufficient descent property:
\[ \langle \theta_k, d_k \rangle \leq -M \|\theta_k\|^2, \]
where \(M = 1 - \frac{1}{4}(1+\tilde{\nu})^2\).
Proof. For \(k=0\), the conclusion follows directly, since \(\langle\theta_0,d_0\rangle=-\|\theta_0\|^2\leq-M\|\theta_0\|^2\) because \(M<1\). For \(k\geq 1\), together with the search direction generated by (2.1), we have:
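By applying the inequality \(\langle e_k, g_k \rangle \leq \frac{1}{2}\left(\|e_k\|^2 + \|g_k\|^2\right)\), where \(e_k = (1+\nu_k)\theta_k/\sqrt{2}\) and \(g_k = \sqrt{2}\,\langle\theta_k, d_{k-1}\rangle\, y_{k-1}/c_k\), we obtain the following result: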
Substituting (2.6) into (2.5), we obtain
Thus, the result holds. □
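As a quick numerical companion to Lemma 1, the sketch below evaluates the descent constant, read here as M = 1 − (1+ν̃)²/4, and checks the elementary inequality ⟨e, g⟩ ≤ ½(‖e‖² + ‖g‖²) used in the proof. The function names are hypothetical and the sample vectors are arbitrary.

```python
from typing import Sequence


def descent_constant(nu_tilde: float) -> float:
    """M = 1 - (1 + nu_tilde)^2 / 4, which is positive whenever 0 < nu_tilde < 1."""
    if not 0.0 < nu_tilde < 1.0:
        raise ValueError("nu_tilde must lie strictly between 0 and 1")
    return 1.0 - 0.25 * (1.0 + nu_tilde) ** 2


def young_gap(e: Sequence[float], g: Sequence[float]) -> float:
    """Return (||e||^2 + ||g||^2) / 2 - <e, g>.

    The gap equals ||e - g||^2 / 2, so it is never negative.
    """
    sq_e = sum(x * x for x in e)
    sq_g = sum(x * x for x in g)
    cross = sum(x * y for x, y in zip(e, g))
    return 0.5 * (sq_e + sq_g) - cross
```

For example, ν̃ = 0.5 gives M = 1 − 2.25/4 = 0.4375, and M shrinks toward 0 as ν̃ approaches 1, so a smaller ν̃ yields a stronger descent guarantee.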
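Lemma 2. The search direction dk generated by (2.1) satisfies the trust-region property:
\[ M \|\theta_k\| \leq \|d_k\| \leq N \|\theta_k\|, \]
where \(N = 1 + \frac{1}{\mu} + \frac{1}{\mu^2} + \frac{\tilde{\nu}}{\mu}\).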
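Proof. From Lemma 1 and the Cauchy-Schwarz inequality, we have \(-\|\theta_k\|\|d_k\| \leq \langle\theta_k, d_k\rangle \leq -M\|\theta_k\|^2\), which implies:
\[ \|d_k\| \geq M \|\theta_k\|. \]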
Additionally, together with (2.1), we obtain:
Substituting the inequalities |βk| and |ϖk| defined in (2.3) and (2.4), respectively, into the above equality, we have
Thus, the result holds. □
2.2. Algorithm description
Before delving into the specifics of our proposed algorithm, it is essential to first clarify the line search approach, the projection operator, and the iterative update rule employed in the proposed algorithm. These foundational components play a crucial role in the overall efficacy of the proposed algorithm.
First, in the proposed algorithm, the line search approach is used to determine an appropriate step length \(t_k=\eta\rho^{i_k}\), where \(i_k\) is the smallest non-negative integer \(i\) for which \(t=\eta\rho^{i}\) satisfies the following inequality:
\[ -\langle \theta(a_k + t d_k), d_k \rangle \geq \sigma t \, \|\theta(a_k + t d_k)\| \, \|d_k\|^2, \tag{2.7} \]
where η>0, ρ∈(0,1), and σ>0 are algorithmic parameters.
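The backtracking procedure can be sketched as follows. The acceptance test is assumed to be −⟨θ(ak + t dk), dk⟩ ≥ σ t ‖θ(ak + t dk)‖ ‖dk‖², which is the form consistent with the later convergence analysis; the default values of η, ρ, and σ, the backtracking cap, and the function name are illustrative assumptions.

```python
import math
from typing import Callable, List

Vector = List[float]


def backtracking_step(
    theta: Callable[[Vector], Vector],
    a: Vector,
    d: Vector,
    eta: float = 1.0,      # initial trial step (assumed value)
    rho: float = 0.5,      # contraction factor in (0, 1) (assumed value)
    sigma: float = 1e-4,   # line search constant (assumed value)
    max_backtracks: int = 60,
) -> float:
    """Return t = eta * rho**i for the smallest i >= 0 passing the assumed test."""
    d_sq = sum(di * di for di in d)
    t = eta
    for _ in range(max_backtracks):
        z = [ai + t * di for ai, di in zip(a, d)]
        theta_z = theta(z)
        lhs = -sum(p * q for p, q in zip(theta_z, d))
        rhs = sigma * t * math.sqrt(sum(p * p for p in theta_z)) * d_sq
        if lhs >= rhs:
            return t
        t *= rho  # shrink the trial step and try again
    raise RuntimeError("line search did not terminate within max_backtracks")
```

With the toy map θ(a) = 3a, the point a = 1, and the direction d = −θ(a) = −3, the trial steps 1 and 0.5 are rejected and t = 0.25 is accepted. The cap on backtracking steps is a practical safeguard only; Lemma 3 shows that a suitable step length exists for the ILR direction.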
Furthermore, the projection operator PΘ[⋅] is a critical component that ensures the iterative points remain within the feasible region Θ. Specifically, the projection of a point \(a\in\mathbb{R}^n\) onto the set Θ is defined as
\[ P_\Theta[a] = \arg\min\{\|a - b\| : b \in \Theta\}. \]
This operator identifies the point in Θ closest to a in the Euclidean norm. Moreover, the projection operator is non-expansive, meaning it satisfies the property:
\[ \|P_\Theta[a] - P_\Theta[b]\| \leq \|a - b\|, \quad \forall a, b \in \mathbb{R}^n. \tag{2.8} \]
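Finally, the iterative update rule forms the core of the proposed algorithm, indicating how the next iterative point ak+1 is computed from the current iterative point ak. Specifically, the update is performed by using the following formula:
\[ a_{k+1} = P_\Theta\big[a_k - \gamma\, w_k\, \theta(z_k)\big], \qquad w_k = \frac{\langle \theta(z_k), a_k - z_k \rangle}{\|\theta(z_k)\|^2}, \]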
where zk=ak+tkdk and γ∈(0,2). This projection-based update ensures that the new iterative point remains feasible and, as shown in Lemma 4, does not move farther away from any solution of problem (1.1).
With the foundational components described above, we now present the detailed steps of an improved LS-RMIL-type conjugate gradient projection algorithm (abbreviated as the ILR algorithm), which is described as Algorithm 1.
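To make the overall loop concrete, here is a minimal sketch of a projection method of this type. It is not the ILR algorithm itself: the search direction (2.1) is replaced by the plain choice dk = −θk, which trivially satisfies the sufficient descent and trust-region properties. The line search test, the weight wk = ⟨θ(zk), ak − zk⟩/‖θ(zk)‖², all parameter values, the feasible set (the non-negative orthant), and the toy map θi(a) = exp(ai) − 1 are assumptions chosen for illustration.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]


def norm(v: Vector) -> float:
    return math.sqrt(sum(x * x for x in v))


def inner(u: Vector, v: Vector) -> float:
    return sum(x * y for x, y in zip(u, v))


def project_nonneg(a: Vector) -> Vector:
    """Euclidean projection onto the non-negative orthant."""
    return [max(x, 0.0) for x in a]


def exp_minus_one(a: Vector) -> Vector:
    """Toy monotone map theta_i(a) = exp(a_i) - 1 (hypothetical test function)."""
    return [math.exp(x) - 1.0 for x in a]


def projection_method(
    theta: Callable[[Vector], Vector],
    a0: Vector,
    project: Callable[[Vector], Vector] = project_nonneg,
    eta: float = 1.0, rho: float = 0.5, sigma: float = 1e-4,
    gamma: float = 1.8, tol: float = 1e-6,
    max_iter: int = 3000, max_backtracks: int = 60,
) -> Tuple[Vector, int]:
    """Return (approximate solution, iterations used)."""
    a = project(a0)
    for k in range(max_iter):
        theta_a = theta(a)
        if norm(theta_a) <= tol:
            return a, k
        d = [-x for x in theta_a]  # placeholder direction, NOT the direction (2.1)
        t = eta
        for _ in range(max_backtracks):
            z = [ai + t * di for ai, di in zip(a, d)]
            theta_z = theta(z)
            if -inner(theta_z, d) >= sigma * t * norm(theta_z) * inner(d, d):
                break
            t *= rho
        else:
            raise RuntimeError("line search failed")
        sq = inner(theta_z, theta_z)
        if sq == 0.0:
            # z is a root of theta; project it and re-test at the next iteration.
            a = project(z)
            continue
        w = inner(theta_z, [ai - zi for ai, zi in zip(a, z)]) / sq
        a = project([ai - gamma * w * ti for ai, ti in zip(a, theta_z)])
    return a, max_iter
```

Calling `projection_method(exp_minus_one, [1.0, 0.5])` drives the residual ‖θ(a)‖ below the tolerance while keeping every iterate in the non-negative orthant. Swapping in a conjugate gradient direction only changes the line that defines `d`, provided the previous direction and residual are stored between iterations.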
3. Properties analysis
In this section, we provide a comprehensive analysis of the global convergence properties of the ILR algorithm. To facilitate this analysis, we introduce the following key assumptions.
Assumption B:
(B1) The solution set Θ∗ of problem (1.1) is non-empty.
(B2) The function θ(a) exhibits a monotonicity property, i.e.,
\[ \langle \theta(a) - \theta(b), a - b \rangle \geq 0, \quad \forall a, b \in \mathbb{R}^n. \]
These assumptions are fundamental in establishing the convergence behavior of the ILR algorithm as they ensure that the iterative process converges to a solution within the feasible region of problem (1.1).
The following lemma demonstrates that the line search approach defined in (2.7) of the ILR algorithm is indeed well-defined and can be successfully applied in the iterative process.
Lemma 3. Consider the sequence {tk} generated by the ILR algorithm. Then, there exists a step length tk at each iteration that satisfies the line search approach defined in (2.7).
Proof. We proceed by contradiction and assume that inequality (2.7) does not hold. Specifically, suppose there exists a positive index k0 such that, for all i∈{0}∪N, the following inequality is satisfied:
By utilizing the continuity of θ and taking the limit as i→∞, the above inequality yields:
On the other hand, invoking Lemma 1 and again taking the limit as i→∞, we obtain:
which clearly contradicts inequality (3.1). This contradiction implies that the initial assumption must be false, and therefore inequality (2.7) must hold. □
The following lemma establishes that the sequence {ak} generated by the ILR algorithm exhibits monotonic behavior with respect to the solution set Θ∗ of problem (1.1).
Lemma 4. Consider the sequences {ak} and {zk} generated by the ILR algorithm. Then, the following properties hold:
(i) The sequence {ak} is bounded, meaning that there exists a constant D>0 such that ‖ak‖≤D for all k≥0.
(ii) The sequence {zk} converges to the sequence {ak}, i.e., limk→∞‖zk−ak‖=0.
Proof. From the definition of the projection operator PΘ[⋅] and the non-expansive property defined in (2.8), we can derive the following inequality:
where a∗ denotes a solution of problem (1.1). Next, starting from Assumption B2, the definition of zk, and the line search approach (2.7), we can further establish the following inequality:
Combining with (3.2), (3.3), and the definition of wk, we can derive
Given the definition of wk and (3.3), we have
which implies that ‖θ(zk)‖wk≥σt2k‖dk‖2. Substituting this into (3.4), we obtain
This result indicates that the sequence {‖ak−a∗‖} is monotonically decreasing, meaning that it consistently reduces as k increases. Hence, the sequence {ak} is bounded.
By reorganizing the formula defined in (3.5), we obtain
This implies that limk→∞‖zk−ak‖=0. □
Theorem 1. Consider the sequence {θk} generated by the ILR algorithm. Then, the following conclusion is satisfied:
\[ \liminf_{k \to \infty} \|\theta_k\| = 0. \]
Proof. To demonstrate the desired result, we begin by assuming the contrary. Suppose that there exists a constant A1>0 such that ‖θk‖>A1 for all k≥0. This assumption, combined with Lemma 2, gives us the following relation ‖dk‖≥M‖θk‖>M A1 for all k≥0. Given the continuity of the function θ(a) and the boundedness of the sequence {ak}, it follows that the sequence {θk} is also bounded. In other words, there exists a non-negative constant A2 such that ‖θk‖≤A2 for all \(k \geq 0\). By incorporating this bound with Lemma 2, we obtain ‖dk‖≤N‖θk‖≤NA2 for all k≥0. The two inequalities derived above imply that the sequence {dk} is bounded. Together with Lemma 4(ii) and the definition of zk, we have limk→∞‖zk−ak‖=limk→∞‖ak+tkdk−ak‖=limk→∞tk‖dk‖=0, which leads to the conclusion that limk→∞tk=0 with the boundedness of the sequence {dk}.
Since the sequences {ak} and {dk} are both bounded, we can extract two convergent subsequences, {aki} and {dki}, such that limi→∞,i∈Kaki=ˉa and limi→∞,i∈Kdki=ˉd, where K denotes an infinite index set. Utilizing Lemma 1, we have −⟨θki,dki⟩≥M‖θki‖2. Taking the limit as i→∞ in the above inequality and invoking the continuity of θ(a), we obtain
Furthermore, we adopt the line search approach defined in (2.7), which implies the following inequality holds: −⟨θ(aki+(ηρ)−1tkidki),dki⟩<ση(ηρ)−1tki‖θ(aki+(ηρ)−1tkidki)‖‖dki‖2. Taking the limit as i→∞ in the above inequality, and using the continuity of θ(a), we conclude
These two results directly contradict each other. Therefore, the assumption that ‖θk‖>A1 for all \(k \geq 0\) must be false, and the desired result follows. □
4. Numerical experiments for systems of nonlinear equations
In this section, we evaluate the effectiveness of the proposed ILR algorithm through a comprehensive set of numerical experiments. These experiments are designed to solve large-scale systems of nonlinear equations with convex constraints, thereby assessing the algorithm's computational efficiency. For benchmarking purposes, we compare the ILR algorithm with two established methods (VRMILP and DFPRPMHS) across various test problems, initial points, and dimensional settings.
4.1. General setups
In this section, we utilize the ILR algorithm to address large-scale systems of nonlinear equations with convex constraints. We then compare it with two existing algorithms: the VRMILP algorithm [26] and the DFPRPMHS algorithm [27]. All experimental codes are executed on a 64-bit Ubuntu 20.04.2 LTS operating system with an Intel(R) Xeon(R) Gold 5115 2.40GHz CPU. The parameters for the ILR algorithm are set as follows:
For the VRMILP and DFPRPMHS algorithms, we adhere to the parameter settings provided in their respective original works. Seven test problems are selected for evaluation, with problem dimensions set at {5,000, 10,000, 50,000, 100,000, 150,000}. Each test problem is initialized at the following points: \(a_1=(\frac{1}{2},\frac{1}{2^2},\ldots,\frac{1}{2^n})\), \(a_2=(0,\frac{1}{n},\frac{2}{n},\ldots,\frac{n-1}{n})\), \(a_3=(1,\frac{1}{2},\ldots,\frac{1}{n})\), \(a_4=(\frac{1}{n},\frac{2}{n},\ldots,\frac{n}{n})\), \(a_5=(\frac{1}{3},\frac{1}{3^2},\ldots,\frac{1}{3^n})\), \(a_6=(2,2,\ldots,2)\), \(a_7=(1-\frac{1}{n},1-\frac{2}{n},\ldots,1-\frac{n}{n})\), and \(a_8\in[0,1]^n\). The stopping criterion for all algorithms is either \(\|\theta_k\|\leq\tau\) or a maximum of 3000 iterations. Here, \(\theta(a)=(\theta_1(a),\theta_2(a),\ldots,\theta_n(a))^T\) with \(a=(a_1,a_2,\ldots,a_n)^T\). The seven test problems are described as follows:
Problem 1 [7]:
with the constraint set Θ=Rn+. The unique solution is a∗=(0,0,…,0)T.
Problem 2 [7]:
with the constraint set Θ=Rn+.
Problem 3 [5]:
with the constraint set \(\Theta=[-1,+\infty)^n\).
Problem 4 [5]:
with the constraint set Θ=Rn+.
Problem 5 [5]:
with the constraint set Θ=Rn+.
Problem 6 [7]:
with the constraint set Θ=Rn+.
Problem 7 [5]:
with the constraint set Θ=Rn+.
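For reproducibility, the initial points listed in Section 4.1 can be generated programmatically. The sketch below reads the first seven points as (1/2, 1/2², …, 1/2ⁿ), (0, 1/n, …, (n−1)/n), (1, 1/2, …, 1/n), (1/n, 2/n, …, n/n), (1/3, 1/3², …, 1/3ⁿ), (2, …, 2), and (1−1/n, …, 1−n/n). Drawing the eighth point uniformly at random from [0,1]ⁿ with a fixed seed is an assumption, as are the function and key names.

```python
import random
from typing import Dict, List


def initial_points(n: int, seed: int = 0) -> Dict[str, List[float]]:
    """Build the eight starting points for a problem of dimension n."""
    idx = range(1, n + 1)
    rng = random.Random(seed)  # fixed seed so that a8 is reproducible (assumption)
    return {
        "a1": [0.5 ** i for i in idx],         # (1/2, 1/2^2, ..., 1/2^n)
        "a2": [(i - 1) / n for i in idx],      # (0, 1/n, ..., (n-1)/n)
        "a3": [1.0 / i for i in idx],          # (1, 1/2, ..., 1/n)
        "a4": [i / n for i in idx],            # (1/n, 2/n, ..., n/n)
        "a5": [3.0 ** (-i) for i in idx],      # (1/3, 1/3^2, ..., 1/3^n)
        "a6": [2.0] * n,                       # (2, 2, ..., 2)
        "a7": [1.0 - i / n for i in idx],      # (1 - 1/n, ..., 1 - n/n)
        "a8": [rng.random() for _ in idx],     # a point in [0, 1]^n
    }
```

For the large dimensions used in the experiments, the trailing entries of the first and fifth points underflow to 0.0 in double precision, which is harmless for the purpose of choosing a starting point.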
4.2. Numerical reports
The performance of the ILR, VRMILP, and DFPRPMHS algorithms is systematically evaluated through a series of test problems, with the numerical results presented in Tables 1–7. In these tables, "Init" refers to the initial point used in each test problem, "n" refers to the problem dimension multiplied by 1000, "CPUT" refers to the CPU time in seconds, "Nfunc" refers to the number of function evaluations, and "Niter" refers to the number of iterations. A notable observation from the numerical results is that all three algorithms successfully solve the test problems across various initial points and problem dimensions. In particular, the ILR algorithm demonstrates superior performance in most cases compared to the other two algorithms.
To provide a clearer characterization of the performance differences among the three algorithms, we adopt the performance profiles proposed by Dolan and Moré [28]. These profiles compare the algorithms on three performance indicators: the CPU time in seconds, the number of function evaluations, and the number of iterations. In these plots, a higher performance curve corresponds to better overall performance. The resulting profiles are shown in Figures 1–3. According to Figure 1, the ILR algorithm solves approximately 56% of the test problems with the lowest CPUT, whereas the VRMILP and DFPRPMHS algorithms do so for around 44% and 4% of the test problems, respectively. Similarly, Figure 2 shows that the ILR algorithm solves approximately 75% of the test problems with the fewest Nfunc, while the VRMILP and DFPRPMHS algorithms do so for about 26% and 7% of the test problems, respectively. Lastly, Figure 3 shows that the ILR algorithm solves approximately 53% of the test problems with the fewest Niter, while the VRMILP and DFPRPMHS algorithms do so for around 37% and 20% of the test problems, respectively. Because a tie is counted for every algorithm that attains the best value, these percentages need not sum to 100%.
Overall, these performance profiles highlight the ILR algorithm's effectiveness in solving large-scale nonlinear systems of equations with convex constraints, outperforming the VRMILP and DFPRPMHS algorithms across multiple performance metrics.
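The Dolan-Moré profile used above is straightforward to compute. For each problem p and solver s, the performance ratio is the solver's cost divided by the best cost on that problem, and the profile value at τ is the fraction of problems whose ratio is at most τ. The sketch below assumes strictly positive costs and at least one successful solver per problem; the function name is hypothetical.

```python
from typing import Dict, List


def performance_profile(
    costs: Dict[str, List[float]], taus: List[float]
) -> Dict[str, List[float]]:
    """Return, for each solver, the profile values rho_s(tau) at the given taus.

    costs[s][p] is the cost (CPU time, function evaluations, or iterations)
    of solver s on problem p.
    """
    solvers = list(costs)
    n_problems = len(costs[solvers[0]])
    # Best (smallest) cost achieved on each problem by any solver.
    best = [min(costs[s][p] for s in solvers) for p in range(n_problems)]
    profile: Dict[str, List[float]] = {}
    for s in solvers:
        ratios = [costs[s][p] / best[p] for p in range(n_problems)]
        profile[s] = [sum(r <= tau for r in ratios) / n_problems for tau in taus]
    return profile
```

With made-up costs `{"solver_A": [1, 2, 4], "solver_B": [2, 2, 1]}`, both solvers reach 2/3 at τ = 1 because they tie on the second problem. The value at τ = 1 is the share of problems on which a solver is (jointly) the best, which is the quantity quoted from Figures 1–3.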
5. Numerical experiments for impulse noise image restoration
In this section, we extend the evaluation of the proposed ILR algorithm to impulse noise image restoration problems. To validate the effectiveness of the ILR algorithm, we apply it to benchmark grayscale images subjected to varying levels of impulse noise.
5.1. General setups
Impulse noise image restoration is a critical topic in the field of image processing, particularly due to its importance in improving the quality of images corrupted by noise. Noise in images can be introduced through various sources, such as malfunctioning pixels in camera sensors, faulty memory locations in hardware, or transmission errors in communication channels. Common types of noise include Gaussian noise and impulse noise, with the latter often manifesting as salt-and-pepper noise. To address the challenge of removing impulse noise, Chan et al. [29] proposed a two-phase denoising scheme. This scheme combines the adaptive median filter (AMF) method with a variational method to effectively detect and restore noisy pixels.
Let \(m \times n\) denote the pixel size of an original image. The pixel locations are indexed by the set \(\mathcal{M} = \{1, 2, \ldots, m\} \times \{1, 2, \ldots, n\}\). We denote the noise candidate set by \(\mathcal{N} \subset \mathcal{M}\), and \(|\mathcal{N}|\) represents the number of elements in \(\mathcal{N}\). In the first phase, noise detection is performed using an AMF. For a pixel located at \((i, j) \in \mathcal{M}\), the observed pixel value is denoted by \(y_{ij}\), and the neighborhood of pixel \((i, j)\) is defined as \(\mathcal{V}_{ij} = \{(i, j - 1), (i, j + 1), (i - 1, j), (i + 1, j)\}\). The AMF detects noise by considering these neighborhood values. Once the noisy pixels are detected, the second phase involves the restoration of these pixels. This is achieved by minimizing the following regularization function:
where
Here, \(\beta\) is a regularization parameter, and \(\varphi_{\alpha}(\cdot)\) is an even edge-preserving potential function with parameter \(\alpha > 0\). The vector \(\mathbf{x} = [x_{i, j}]_{(i, j) \in \mathcal{N}}\) collects the unknown pixel values in lexicographic order. The regularization problem posed in the second phase is nonsmooth due to the data-fitting term \(|x_{i, j} - y_{i, j}|\). To address this, Cai et al. [30] proposed removing the nonsmooth term and instead solving the following smooth unconstrained optimization problem:
The potential function \varphi_{\alpha}(\cdot) plays a crucial role in preserving edges while smoothing the image. A commonly used potential function is the Huber function, defined as:
This function is convex and has a Lipschitz continuous first derivative, making it suitable for the minimization problems described above. Let \(\nabla f_{\alpha}(\mathbf{x})\) denote the gradient of the function \(f_{\alpha}(\mathbf{x})\). According to Proposition 6 in [30], if \(\varphi_\alpha\) is convex, then \(\nabla f_{\alpha}(\mathbf{x})\) is monotone.
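For illustration, one common parameterization of the Huber function is φα(t) = t²/(2α) for |t| ≤ α and |t| − α/2 otherwise. Whether the paper uses exactly this scaling is an assumption; the sketch below only demonstrates the properties stated above (evenness, convexity, and a derivative that is Lipschitz with constant 1/α), and the function names are hypothetical.

```python
def huber(t: float, alpha: float) -> float:
    """Huber potential: quadratic near zero, linear in the tails (assumed scaling)."""
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    abs_t = abs(t)
    if abs_t <= alpha:
        return t * t / (2.0 * alpha)
    return abs_t - alpha / 2.0


def huber_derivative(t: float, alpha: float) -> float:
    """Derivative of the Huber potential: t / alpha clipped to [-1, 1]."""
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    if abs(t) <= alpha:
        return t / alpha
    return 1.0 if t > 0.0 else -1.0
```

The derivative is non-decreasing in t, which is the one-dimensional form of the monotonicity that carries over to the gradient of the smooth objective and makes the restoration problem fit the framework of problem (1.1) with Θ equal to the whole space.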
5.2. Numerical reports
In this section, all parameters for these three algorithms are set as described in Section 4. The stopping criteria for these three algorithms are defined as follows:
For this experiment, we utilize the well-known grayscale test images Man (\(1024 \times 1024\)) and Tank2 (\(512 \times 512\)), which are sourced from the website https://www.hlevkin.com. We examine the performance of these three algorithms by applying them to images corrupted with 30% and 70% impulse noise. The noisy images, as well as the images recovered by these three algorithms, are shown in Figures 4 and 5. The corresponding numerical results are provided in Table 8. Based on these figures and the table, we can draw the following conclusions: (1) Figures 4 and 5 demonstrate that all three algorithms successfully recover the images affected by 30% and 70% impulse noise; (2) Recovering an image with 30% impulse noise requires less CPU time and fewer iterations compared to recovering an image with 70% impulse noise; (3) Among these three algorithms, the ILR algorithm generally requires less CPU time and fewer iterations than the VRMILP and DFPRPMHS algorithms for a given level of impulse noise.
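A corrupted test image of the kind used here can be produced with a few lines of code. The sketch assumes 8-bit salt-and-pepper noise in which each pixel is replaced, with probability equal to the noise ratio, by 0 or 255 with equal chance; the exact noise model behind the reported results is not stated, so this choice and the function name are assumptions.

```python
import random
from typing import List

Image = List[List[int]]


def add_salt_and_pepper(image: Image, ratio: float, seed: int = 0) -> Image:
    """Return a copy of `image` with roughly `ratio` of its pixels corrupted."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must lie in [0, 1]")
    rng = random.Random(seed)          # fixed seed for reproducibility
    noisy = [row[:] for row in image]  # copy so the input is left untouched
    for row in noisy:
        for j in range(len(row)):
            if rng.random() < ratio:
                # Salt (255) or pepper (0) with equal probability.
                row[j] = 255 if rng.random() < 0.5 else 0
    return noisy
```

Calling `add_salt_and_pepper(img, 0.3)` and `add_salt_and_pepper(img, 0.7)` gives inputs comparable to the 30% and 70% settings; the corrupted positions are the candidates that the first-phase detector is meant to recover.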
6. Conclusions
In this paper, we presented an improved LS-RMIL-type conjugate gradient projection algorithm aimed at efficiently solving systems of nonlinear equations with convex constraints. The proposed algorithm demonstrates several key advantages, including the ability to generate search directions that satisfy sufficient descent and trust-region properties independently of the line search approach. Additionally, the proposed algorithm only requires continuity and monotonicity of the underlying mapping, which makes it applicable under less restrictive conditions compared to existing methods. We established the global convergence of the proposed algorithm without relying on the Lipschitz continuity assumption, further relaxing the conditions that need to be satisfied for successful implementation. Numerical experiments on large-scale systems of nonlinear equations and impulse noise image restoration problems have shown that the proposed algorithm is generally more efficient than the VRMILP and DFPRPMHS algorithms. These results indicate that the proposed algorithm is a promising and competitive approach, with significant potential for practical applications, such as image restoration.
Author contributions
Yan Xia: Conceptualization, Investigation, Writing–original draft, Writing–review and editing, Funding acquisition; Xuejie Ma: Writing–review and editing, Funding acquisition; Dandan Li: Conceptualization, Funding acquisition, Writing–review and editing. All authors have read and approved the final version of the manuscript for publication.
Use of Generative-AI tools declaration
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
Acknowledgments
This work is supported by the Guangzhou Huashang College Daoshi Project (2024HSDS28).
Conflict of interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.