This paper extends a result of isolated calmness for nuclear norm regularized convex optimization problems to Ky Fan k-norm regularized convex optimization problems. We find that there exists a certain equivalence relationship among the critical cones of the Ky Fan k-norm function and its conjugate as well as the "sigma term", namely, the conjugate function of the parabolic second-order directional derivative of the Ky Fan k-norm. By establishing the equivalence between the primal (dual) strict Robinson constraint qualification (SRCQ) and the dual (primal) second-order sufficient condition (SOSC), we derive a series of complete characterizations of the robust isolated calmness of the Karush-Kuhn-Tucker (KKT) mapping for Ky Fan k-norm regularized convex matrix optimization problems. The obtained results enrich the stability theory of the Ky Fan k-norm regularized convex optimization problems and further enhance the usability of the related algorithms.
Citation: Ziran Yin, Chongyang Liu, Xiaoyu Chen, Jihong Zhang, Jinlong Yuan. A comprehensive characterization of the robust isolated calmness of Ky Fan k-norm regularized convex matrix optimization problems[J]. AIMS Mathematics, 2025, 10(3): 4955-4969. doi: 10.3934/math.2025227
[1] | Xinyu Qi, Jinru Wang, Jiating Shao . Minimax perturbation bounds of the low-rank matrix under Ky Fan norm. AIMS Mathematics, 2022, 7(5): 7595-7605. doi: 10.3934/math.2022426 |
[2] | Dejin Zhang, Shuwen Xiang, Xicai Deng, Yanlong Yang . Strongly essential set of vector Ky Fan's points problem and its applications. AIMS Mathematics, 2021, 6(4): 3160-3176. doi: 10.3934/math.2021191 |
[3] | Zhihua Wang . Stability of a mixed type additive-quadratic functional equation with a parameter in matrix intuitionistic fuzzy normed spaces. AIMS Mathematics, 2023, 8(11): 25422-25442. doi: 10.3934/math.20231297 |
[4] | Cunlin Li, Wenyu Zhang, Baojun Yang, Hooi Min Yee . A multi-player game equilibrium problem based on stochastic variational inequalities. AIMS Mathematics, 2024, 9(9): 26035-26048. doi: 10.3934/math.20241271 |
[5] | Min Wang, Jiao Teng, Lei Wang, Junmei Wu . Application of ADMM to robust model predictive control problems for the turbofan aero-engine with external disturbances. AIMS Mathematics, 2022, 7(6): 10759-10777. doi: 10.3934/math.2022601 |
[6] | Kaifang Liu, Lunji Song . A family of interior-penalized weak Galerkin methods for second-order elliptic equations. AIMS Mathematics, 2021, 6(1): 500-517. doi: 10.3934/math.2021030 |
[7] | Jian-Rong Wu, He Liu . The nearest point problems in fuzzy quasi-normed spaces. AIMS Mathematics, 2024, 9(3): 7610-7626. doi: 10.3934/math.2024369 |
[8] | Haixia Chang, Chunmei Li, Longsheng Liu . Generalized low-rank approximation to the symmetric positive semidefinite matrix. AIMS Mathematics, 2025, 10(4): 8022-8035. doi: 10.3934/math.2025368 |
[9] | Man Chen, Huaifeng Chen . On ideal matrices whose entries are the generalized $ k- $Horadam numbers. AIMS Mathematics, 2025, 10(2): 1981-1997. doi: 10.3934/math.2025093 |
[10] | Qin Zhong . New lower bounds of the minimum eigenvalue for the Fan product of several M-matrices. AIMS Mathematics, 2023, 8(12): 29073-29084. doi: 10.3934/math.20231489 |
This paper extends a result of isolated calmness for nuclear norm regularized convex optimization problems to Ky Fan k-norm regularized convex optimization problems. We find that there exists a certain equivalence relationship among the critical cones of the Ky Fan k-norm function and its conjugate as well as the "sigma term", namely, the conjugate function of the parabolic second-order directional derivative of the Ky Fan k-norm. By establishing the equivalence between the primal (dual) strict Robinson constraint qualification (SRCQ) and the dual (primal) second-order sufficient condition (SOSC), we derive a series of complete characterizations of the robust isolated calmness of the Karush-Kuhn-Tucker (KKT) mapping for Ky Fan k-norm regularized convex matrix optimization problems. The obtained results enrich the stability theory of the Ky Fan k-norm regularized convex optimization problems and further enhance the usability of the related algorithms.
Consider the Ky Fan k-norm regularized convex matrix optimization problem as follows:
minX∈Rm×nh(AX)+⟨C,X⟩+‖X‖(k),s.t.BX−b∈P, | (1.1) |
where h:Rd→R is a twice continuously differentiable function on domh, which is also strictly convex, A:Rm×n→Rd and B:Rm×n→Rl are two linear operators with m≤n, C∈Rm×n and b∈Rl are given data, P⊆Rl is a nonempty convex polyhedral cone, ‖⋅‖(k) denotes the matrix Ky Fan k-norm for an integer k∈[1,m], i.e., the sum of k largest singular values of a given matrix. In particular, when k=1, ‖⋅‖(1) coincides with the spectral norm function ‖⋅‖2, i.e., the largest singular values of matrices; when k=m, ‖⋅‖(m) is exactly the nuclear norm function ‖⋅‖∗, namely, the sum of singular value of matrices. It is well known that problem (1.1) has a wide range of applications in various fields, such as matrix Chebyshev polynomials [1], H∞ synthesis problems [2,3], control problems [4], deep learning and neural networks problems [5,6], matrix completion and rank minimization problems [7,8], matrix approximation problems [9], and fastest mixing Markov chain problems [10], etc.
Given the extensive application background of Problem (1.1), efficiently solving such problems has become a significant research focus. Stability analysis theory plays a crucial role in the convergence analysis of optimization algorithms [11,12,13]. Therefore, it is essential to study the stability properties of Ky Fan k-norm regularized optimization problems. Several studies in the existing literature have explored the stability analysis of Ky Fan k-norm regularized optimization problems. For example, Ding [14] studied the variational properties of the Ky Fan k-norm, laying the foundation for analyzing the optimality conditions and stability theory of Ky Fan k-norm regularized problems. Ding [15] provided a series of equivalent characterizations of the strong regularity of the KKT system for linear matrix cone programming involving the Ky Fan k-norm. Liu and Pan [16] gave sufficient conditions for the locally upper Lipschitz property of the KKT system for cone optimization problems induced by the Ky Fan k-norm epigraph. However, research on other stability properties, such as the characterization of isolated calmness, remains insufficiently comprehensive and mature. Therefore, this paper innovatively provides a comprehensive set of equivalent characterizations of the isolated calmness of the KKT mapping (ICKKTM) for Ky Fan k-norm regularized optimization problems.
The isolated calmness (see Definition 2.1), as one of the most important stability properties, can sometimes guarantee a linear convergence rate for some algorithms, rather than the generic sublinear rate, such as the alternating direction method of multipliers [17] and the proximal augmented Lagrangian method [18]. Numerous research works have been devoted to discovering the sufficient and even necessary conditions for the ICKKTM for optimization problems (cf. e.g., [16,19,20,21]). It is highly worth mentioning that in [21], Ding, Sun, and Zhang studied the robust ICKKTM for a large class of conic programming problems (when the constraint sets are C2-cone reducible), they proved that the KKT mapping is robustly isolated calm if and only if both the SRCQ and the SOSC hold. In addition, based on the special linear structure of problem (1.1) and its Lagrangian dual problem, more characterizations of the ICKKTM can be given from the dual perspective. For instance, Han, Sun and Zhang [22] found that for the convex composite quadratic semidefinite programming problem, the primal (dual) SRCQ and the dual (primal) SOSC are equivalent. Furthermore, they obtained five equivalent characterizations of the ICKKTM. Similarly, for the nuclear norm regularized convex optimization problem, Cui and Sun [23] also proved that the primal (dual) SRCQ is equivalent to the dual (primal) SOSC, thereby deriving a series of equivalent conditions of the ICKKTM. Motivated by [23], a natural question emerges: that is, whether the conclusions regarding the nuclear norm regularized problem can be extended to the Ky Fan k-norm regularized problem. We will give a positive answer in this paper.
Compared to existing works, the contributions of this paper are summarized as follows.
1) We originally discovered an ingenious relationship among the critical cone of the Ky Fan k-norm function, the critical cone of its conjugate, and the "sigma term, " which has not been mentioned in the existing literature. This relationship is essential for deriving the main results of this paper. Although the mathematical derivation is not excessively complex, it represents a novel and important finding.
2) We derived a more comprehensive set of equivalent characterizations of the ICKKTM for Ky Fan k-norm regularized convex optimization problems from the dual perspective, thereby enriching the stability analysis theory of Ky Fan k-norm regularized optimization problems.
3) The newly derived equivalent conditions are easier to verify in practical computations, which can enhance the feasibility of algorithm design for solving related problems.
4) The Ky Fan k-norm regularized optimization problem studied in this paper is a broader class than the nuclear norm regularized optimization problem. Therefore, the results we obtained have wider applications.
The content of this paper is organized as below. In Section 2, we provide some notations and preliminaries on the variational analysis of the Ky Fan k-norm function. We explore the connection among the critical cones of the Ky Fan k-norm and its conjugate function as well as the "sigma term" in Section 3, which forms the most essential part of this paper. In Section 4, we demonstrate that the primal (dual) SRCQ and the dual (primal) SOSC are equivalent. Furthermore, we obtain several equivalent characterizations for the ICKKTM. We conclude this paper in Section 5.
Below, we present some basic notations and symbols for matrices.
● For any positive integer t, we denote by [t] the index set {1,⋯,t}. For any Z∈Rm×n, the (i,j)-th entry of Z is denoted as Zij, where i∈[m], j∈[n]. Let μ⊆[m] and ν⊆[n] be two index sets. We use Zν to represent the sub-matrix of Z, which is obtained by retaining only all columns in ν, and use Zμν to represent the sub-matrix of Z, which is obtained by retaining only all rows in μ and columns in ν.
● For any d∈Rm, Diag(d) represents the m×m diagonal matrix, where the i-th diagonal element is di, i∈[m].
● Let "trace" represent the sum of diagonal elements in a given square matrix. For any two matrices M and N in Rm×n, the inner product of M and N is written as ⟨M,N⟩:=trace(MTN).
● Denote by Sw the linear space consisting of all w×w real symmetric matrices, and denote by Sw+ and Sw− the cones of all w×w positive and negative semidefinite matrices, respectively.
Let X and Y be two finite-dimensional real Euclidean spaces. We use the "X⇉Y" to denote a set-valued mapping from X to Y [24]HY__HY, Page 148]. Let D⊆X be a non-empty closed convex set. For any s∈D, the tangent cone (cf. [24, Definition 6.1]) of D at s is defined by TD(s):={d∈X|∃sk→swithsk∈Dandτk↓0s.t.(sk−s)/τk→d}. For a given x∈X, we define ΠD(x):=arg min{‖d−x‖|d∈D} as the projection mapping onto D. For any closed convex cone K⊂Y, denote by K∘:={s∈Y|⟨s,z⟩≤0,∀z∈K} the polar of K. Given ρ>0 and z∈X, define the ball Bρ(z):={z′∈X|‖z′−z‖≤ρ}. Given that f:X→(−∞,∞] is a proper closed convex function, we use f∗ to represent the conjugate of f and ∂f to represent the subdifferential of f. For the sake of convenience, we denote by θ the Ky Fan k-norm function in the subsequent text, i.e., θ(⋅):=‖⋅‖(k).
The following definition of isolated calmness, taken from [21, Definition 2], is the most important concept in this paper.
Definition 2.1. The set-valued mapping Φ:X⇉Y is said to be isolated calm at ˉu for ˉv if ˉv∈Φ(ˉu) and there exist a positive constant κ and neighborhoods U of ˉu and V of ˉv such that
Φ(u)∩V⊂{ˉv}+κ‖u−ˉu‖BY∀u∈U. | (2.1) |
Moreover, we say that Φ is robustly isolated calm at ˉu for ˉv if (2.1) holds and for each u∈U, Φ(u)∩V≠∅.
For any closed convex function l:X→(−∞,+∞], it is known from [25, Proposition 2.58] that l is directionally epidifferentiable. We use l↓(x;⋅) to represent the directional epiderivatives of l. Moreover, if l↓(x;d) is finite for x∈doml and d∈X, we define the lower second-order directional epiderivative of l for any w∈X by
l↓↓−(x;d,w):=lim infτ↓0w′→wl(x+τd+12τ2w′)−l(x)−τl↓(x;d)12τ2. |
Let W∈Rm×n be any matrix and Op be the set of all p×p orthogonal matrices. We use σ1(W)≥σ2(W)≥⋯≥σm(W) to denote the singular values of W with multiplicity being taken into account. Define σ(W):=(σ1(W),σ2(W),⋯,σm(W))T and Σ(W):=Diag(σ(W)). Suppose that W∈Rm×n has the singular value decomposition (SVD) as follows:
W=U[Σ(W)0]VT=U[Σ(W)0][V1V2]T=UΣ(W)VT1, | (2.2) |
where U∈Om and V=[V1V2]∈On with V1∈Rn×m and V2∈Rn×(n−m). Define three index sets as follows:
a:={i∈[m]∣σi(W)>0},b:={i∈[m]∣σi(W)=0},c:={m+1,…,n}. | (2.3) |
We know from [26, Theorem 31.5] that (P,Q) is a solution of the generalized equation
0∈−P+∂θ(Q)⟺0∈−Q+∂θ∗(P), |
if and only if
P−Proxθ(P+Q)=0⟺Q−Proxθ∗(P+Q)=0, |
where Proxθ:Rm×n→Rm×n and Proxθ∗:Rm×n→Rm×n are the Moreau-Yosida proximal mappings of θ (cf., e.g., [27, Definition 6.1]) and θ∗, respectively, i.e.,
Proxθ(Y):=argminY′∈Rm×n{θ(Y′)+12‖Y′−Y‖2},Y∈Rm×n. |
Let W=P+Q, and assume that W admits the SVD as in (2.2). By [27, Theorem 7.29 and Example 7.31], we know that
P=U[Σ(P)0]VT,Q=U[Σ(Q)0]VT. |
Denote ¯σ=σ(P) and ¯q=σ(Q). According to [14, Lemma 3], we know that if ¯σk>0, there exist two integers 0≤s0≤k−1 and k≤s1≤m such that
¯σ1≥…≥¯σs0>¯σs0+1=…=¯σk=…=¯σs1>¯σs1+1≥…≥¯σm≥0. |
Then,
¯qi=1,ifi∈α;0≤¯qi≤1,ifi∈βand∑i∈β¯qi=k−s0;¯qi=0,ifi∈γ, |
where
α={1,…,s0},β={s0+1,…,s1},γ={s1+1,…,m}. | (2.4) |
If ¯σk=0, there exists an integer 0≤s0≤k−1 such that
¯σ1≥…≥¯σs0>¯σs0+1=…=¯σk=…=¯σm=0. |
Then,
¯qi=1,ifi∈α;0≤¯qi≤1,ifi∈βand∑i∈β¯qi≤k−s0, |
where
α={1,…,s0},β={s0+1,…,m}. | (2.5) |
For convenience, we subdivide the index set β into three index sets:
β1:={i∈β∣¯qi=1},β2:={i∈β∣0<¯qi<1},β3:={i∈β∣¯qi=0}. | (2.6) |
Let P∈∂θ(Q) (or equivalently Q∈∂θ∗(P)). Define the critical cone of θ at P with respect to Q as
Cθ(P,Q)={D∈Rm×n∣θ′(P;D)=⟨Q,D⟩}. |
Likewise, define the critical cone of θ∗ at Q with respect to P as
Cθ∗(Q,P)={D∈Rm×n∣(θ∗)′(Q;D)=⟨D,P⟩}. |
For any two matrices Y,Z∈Rm×n, according to the well-known von Neumann's trace inequality [28], we have the following result:
⟨Y,Z⟩≤σ(Y)Tσ(Z). | (2.7) |
For any two matrices Y,Z∈Sp, by Fan's inequality [29], we have that
⟨Y,Z⟩≤λ(Y)Tλ(Z), | (2.8) |
where λ(Y) represents the eigenvalue vector of Y with its elements arranged in non-increasing order.
For the sake of notational convenience, for any integer p>0, we define a linear matrix operator S:Rp×p→Sp by
S(N):=12(N+NT). |
As the principal tools for the subsequent analysis, we need to utilize the characterizations of the subdifferential of θ, the critical cones of θ and its conjugate function, as well as the characterization of the sigma term being zero. These contents have already been provided in [14]. We summarize them as follows.
Lemma 2.1. Given P,Q∈Rm×n. Suppose that P∈∂θ(Q) and W=P+Q admit the SVD as in (2.2). For any D∈Rm×n, denote ˜D=UTDV. Then D∈Cθ(P,Q) is equivalent to the subsequent conditions:
(i) If σk(P)>0, then there exists some τ∈R such that
λ|β1|(S(˜Dβ1β1))≥τ≥λ1(S(˜Dβ3β3)) |
and
S(˜Dββ)=[S(˜Dβ1β1)000τI|β2|000S(˜Dβ3β3)], |
where I|β2| is the |β2|×|β2| identity matrix.
(ii) If σk(P)=0 and ‖Q‖∗=k, then there exists some τ≥0 such that
λ|β1|(S(˜Dβ1β1))≥τ≥σ1([˜Dbb˜Dbc]) |
and
[˜Dββ˜Dβc]=[S(˜Dβ1β1)0000τI|β2|0000˜Dbb˜Dbc]. |
(iii) If σk(P)=0 and ‖Q‖∗<k, then S(˜Dβ1β1)∈S|β1|+ and
[˜Dββ˜Dβc]=[S(˜Dβ1β1)00000000000]. |
Lemma 2.2. Given P,Q∈Rm×n. Suppose that Q∈∂θ∗(P) and W=P+Q admit the SVD as in (2.2). Assume that ‖Q‖∗(k)=1, where ‖⋅‖∗(k) denote the dual norm of ‖⋅‖(k). For any D∈Rm×n, denote ˜D=UTDV. Then D∈Cθ∗(Q,P) is equivalent to the subsequent conditions:
(i) If σk(P)>0, then trace(˜Dββ)=0,
S(˜Dα∪β1α∪β1)=[000S(˜Dβ1β1)]withS(˜Dβ1β1)∈S|β1|− |
and
[˜Dβ3β3˜Dβ3γ˜Dβ3c˜Dγβ3˜Dγγ˜Dγc]=[S(˜Dβ3β3)00000]withS(˜Dβ3β3)∈S|β3|+; |
(ii) If σk(P)=0 and ‖Q‖∗=k, then trace(˜Dβ1∪β2β1∪β2)+‖[˜Dbb˜Dbc]‖∗≤0,
S(˜Dα∪β1α∪β1)=[000S(˜Dβ1β1)]withS(˜Dβ1β1)∈S|β1|−; |
(iii) If σk(P)=0 and ‖Q‖∗<k, then
S(˜Dα∪β1α∪β1)=[000S(˜Dβ1β1)]withS(˜Dβ1β1)∈S|β1|−. |
Since the Ky Fan k-norm function θ is convex and Lipschitz continuous, and the singular value function in Rm×m is second-order directionally differentiable [30, Theorem 3.1], the lower second-order directional epiderivative of θ coincides with the parabolic second-order directional derivative of θ, i.e., for any P∈Rm×n, θ↓↓−(P;⋅)=θ″(P;⋅). For any given P∈∂θ(Q), ∀D∈Rm×n, denote
ψ∗(P,D)(Q):=(θ″(P;D,⋅))∗(Q) |
as the conjugate of θ″(P;D,⋅) at Q. This can be regarded as the "sigma term" of problem (1.1). Similarly, we denote
ϕ∗(Q,D)(P):=((θ∗)″(Q;D,⋅))∗(P) |
as the conjugate of (θ∗)″(Q;D,⋅) at P.
Ding [14, Proposition 16] provides the properties of the "sigma term" as well as the equivalent characterizations in the case where the "sigma term" equals 0, which are presented as follows:
Lemma 2.3. Given that P=∂θ(Q). For any D∈Rm×n, denote ˜D=UTDV=[˜D1˜D2] with ˜D1∈Rm×m and ˜D2∈Rm×(n−m). Then, ψ∗(P,D)(Q)≤0, ϕ∗(Q,D)(P)≤0. Moreover,
ψ∗(P,D)(Q)=0⟺ϕ∗(Q,D)(P)=0, |
which is also equivalent to the subsequent conditions:
(i) If σk(P)>0,
˜D=(S(˜D1)ααS(˜D1)αβ10000S(˜D1)β1αS(˜D1)β1β1S(˜D1)β1β2S(˜D1)β1β3000S(˜D1)β2β1S(˜D1)β2β2S(˜D1)β2β3000S(˜D1)β3β1S(˜D1)β3β2˜Dβ3β3˜Dβ3γ˜Dβ3c000˜Dγβ3˜Dγγ˜Dγc). |
(ii) If σk(P)=0,
˜D=(S(˜D1)ααS(˜D1)αβ1000S(˜D1)β1α˜Dβ1β1˜Dβ1β2˜Dβ1b˜Dβ1c0˜Dβ2β1˜Dβ2β2˜Dβ2b˜Dβ2c0˜Dbβ1˜Dbβ2˜Dbb˜Dbc). |
In this section, we present the relationships among the critical cone of the Ky Fan k-norm function, the critical cone of its conjugate, and the "sigma term", which are of vital significance in this paper.
Proposition 3.1. Assume that W∈Rm×n has the SVD as in (2.2). Denote P=Proxθ(W) and Q=Proxθ∗(W). Suppose the index sets α,β,γ,β1,β2,β3,b,c are defined by (2.3), (2.4) or (2.5) and (2.6). For any D∈Rm×n, denote ˜D=UTDV. Then the following conclusions hold:
(i) If σk(P)>0, then ϕ∗(Q,D)(P)=0 and D∈Cθ∗(Q,P) imply that D∈(Cθ(P,Q))∘.
(ii) If σk(P)=0, ‖Q‖∗=k, then ϕ∗(Q,D)(P)=0 and D∈Cθ∗(Q,P) imply that D∈(Cθ(P,Q))∘.
(iii) If σk(P)=0, ‖Q‖∗<k, then ϕ∗(Q,D)(P)=0 and D∈Cθ∗(Q,P)⟺D∈(Cθ(P,Q))∘.
Proof. Consider the following three cases.
Case 1. σk(P)>0. If ϕ∗(Q,D)(P)=0 and D∈Cθ∗(Q,P), by part (ⅰ) of Lemma 2.2 and part (ⅰ) of Lemma 2.3, we can obtain that
˜D=(0000000S(˜Dβ1β1)S(˜D1)β1β2S(˜D1)β1β3000S(˜D1)β2β1S(˜Dβ2β2)S(˜D1)β2β3000S(˜D1)β3β1S(˜D1)β3β2S(˜Dβ3β3)00000000) | (3.1) |
and
S(˜Dβ1β1)∈S|β1|−,S(˜Dβ3β3)∈S|β3|+andtrace(˜Dββ)=0. | (3.2) |
For any H∈Cθ(P,Q), from part (ⅰ) of Lemma 2.1, ∃τ∈ℜ s.t.
λ|β1|(S(˜Hβ1β1))≥τ≥λ1(S(˜Hβ3β3)) |
and
S(˜Hββ)=[S(˜Hβ1β1)000τI|β2|000S(˜Hβ3β3)]. |
Thus, ∀H∈Cθ(P,Q) and D satisfies (3.1) and (3.2), according to Fan's inequality (2.8), we have that
⟨H,D⟩=⟨˜H,˜D⟩=⟨˜Hββ,˜Dββ⟩=⟨˜Hββ,S(˜Dββ)⟩=⟨S(˜Hββ),˜Dββ⟩=⟨S(˜Hβ1β1),˜Dβ1β1⟩+τtrace(˜Dβ2β2)+⟨S(˜Hβ3β3),˜Dβ3β3⟩≤(λ(S(˜Hβ1β1)))Tλ(˜Dβ1β1)+τtrace(˜Dβ2β2)+(λ(S(˜Hβ3β3)))Tλ(˜Dβ3β3)=(−λ(S(˜Hβ1β1)))T(−λ(˜Dβ1β1))+τtrace(˜Dβ2β2)+(λ(S(˜Hβ3β3)))Tλ(˜Dβ3β3)≤τtrace(˜Dβ1β1)+τtrace(˜Dβ2β2)+τtrace(˜Dβ3β3)=τtrace(˜Dββ)=0. |
This implies that D∈(Cθ(P,Q))∘. The proof of the part (ⅰ) is complete.
Case 2. σk(P)=0, ‖Q‖∗=k. If ϕ∗(Q,D)(P)=0 and D∈Cθ∗(Q,P), according to part (ⅱ) of Lemma 2.2 and part (ⅱ) of Lemma 2.3, we have that
˜D=(000000˜Dβ1β1˜Dβ1β2˜Dβ1b˜Dβ1c0˜Dβ2β1˜Dβ2β2˜Dβ2b˜Dβ2c0˜Dbβ1˜Dbβ2˜Dbb˜Dbc) | (3.3) |
and
S(˜Dβ1β1)∈S|β1|−andtrace(˜Dβ1∪β2β1∪β2)+‖[˜Dbb˜Dbc]‖∗≤0. | (3.4) |
Therefore, ∀H∈Cθ(P,Q), based on part (ⅱ) of Lemma 2.1, we know that there exists τ≥0 such that
λ|β1|(S(˜Hβ1β1))≥τ≥σ1([˜Hbb˜Hbc]) |
and
[˜Hββ˜Hβc]=[S(˜Hβ1β1)0000τI|β2|0000˜Hbb˜Hbc]. |
Hence, ∀H∈Cθ(P,Q) and D satisfies (3.3) and (3.4), by von Neumann's trace inequality (2.7), we have that
⟨H,D⟩=⟨˜H,˜D⟩=⟨˜Hββ,˜Dββ⟩+⟨˜Hβc,˜Dβc⟩=⟨S(˜Hβ1β1),S(˜Dβ1β1)⟩+τtrace(˜Dβ2β2)+⟨[˜Hbb˜Hbc],[˜Dbb˜Dbc]⟩≤(λ(S(˜Hβ1β1)))Tλ(S(˜Dβ1β1))+τtrace(˜Dβ2β2)+(σ([˜Hbb˜Hbc]))Tσ([˜Dbb˜Dbc])=(−λ(S(˜Hβ1β1)))T(−λ(S(˜Dβ1β1)))+τtrace(˜Dβ2β2)+(σ([˜Hbb˜Hbc]))Tσ([˜Dbb˜Dbc])≤τtrace(˜Dβ1β1)+τtrace(˜Dβ2β2)+τ‖[˜Dbb˜Dbc]‖∗≤0. |
This means that D∈(Cθ(P,Q))∘. The proof of the part (ⅱ) is complete.
Case 3. σk(P)=0, ‖Q‖∗<k. According to part (ⅲ) of Lemma 2.2 and part (ⅱ) of Lemma 2.3, we know that ϕ∗(Q,D)(P)=0 and D∈Cθ∗(Q,P) hold if and only if S(˜Dβ1β1)∈S|β1|− and
˜D=(000000˜Dβ1β1˜Dβ1β2˜Dβ1b˜Dβ1c0˜Dβ2β1˜Dβ2β2˜Dβ2b˜Dβ2c0˜Dbβ1˜Dbβ2˜Dbb˜Dbc). |
In view of part (ⅲ) of Lemma 2.1, this is equivalent to D∈(Cθ(P,Q))∘. This completes the proof.
Remark 3.1. The conclusions in the converse directions of (i) and (ii) in Proposition 3.1 are not always valid. That is because when the parameter τ within Cθ(P,Q) equals 0, we are unable to obtain the conditions such as trace(˜Dββ)=0 or trace(˜Dβ1∪β2β1∪β2)+‖[˜Dbb˜Dbc]‖∗≤0. However, there is a situation where the conditions ϕ∗(Q,D)(P)=0 and D∈Cθ∗(Q,P) are equivalent to D∈(Cθ(P,Q))∘ in Proposition 3.1 when the index set β2 is empty. In particular, when k=m, that is, when θ is the nuclear norm function, Cui and Sun [23, Proposition 4.2] have already presented this equivalent property.
By exchanging the positions of Cθ(P,Q) and Cθ∗(Q,P) in the above proposition, we can obtain the following similar yet enhanced results.
Proposition 3.2. Assume that W∈Rm×n has the SVD as in (2.2). Denote P=Proxθ(W) and Q=Proxθ∗(W). Suppose the index sets α,β,γ,β1,β2,β3,b,c are defined by (2.3), (2.4) or (2.5), and (2.6). For any D∈Rm×n, denote ˜D=UTDV. Then the subsequent two conditions are equivalent:
(i) ψ∗(P,D)(Q)=0 and D∈Cθ(P,Q).
(ii) D∈(Cθ∗(Q,P))∘.
Proof. Case 1. σk(P)>0. "(i)⟹(ii)". It is easy to verify ψ∗(P,D)(Q)=0 and D∈Cθ(P,Q) are equivalent to
˜D=(S(˜D1)ααS(˜D1)αβ10000S(˜D1)β1αS(˜D1)β1β1000000τI|β2|000000˜Dβ3β3˜Dβ3γ˜Dβ3c000˜Dγβ3˜Dγγ˜Dγc), | (3.5) |
where τ is some real number that satisfies
λ|β1|(S(˜Dβ1β1))≥τ≥λ1(S(˜Dβ3β3)). | (3.6) |
Hence, for any D that satisfies (3.5) and (3.6), ∀H∈Cθ∗(Q,P), i.e., trace(˜Hββ)=0,
S(˜Hα∪β1α∪β1)=[000S(˜Hβ1β1)] with S(˜Hβ1β1)∈S|β1|− |
and
[˜Hβ3β3˜Hβ3γ˜Hβ3c˜Hγβ3˜Hγγ˜Hγc]=[S(˜Hβ3β3)00000] with S(˜Hβ3β3)∈S|β3|+, |
we have that
⟨H,D⟩=⟨˜H,˜D⟩=⟨˜Hβ1β1,S(˜Dβ1β1)⟩+⟨˜Hβ2β2,τI|β2|⟩+⟨S(˜Hβ3β3),˜Dβ3β3⟩=⟨˜H,˜D⟩=⟨˜Hβ1β1,S(˜Dβ1β1)⟩+⟨˜Hβ2β2,τI|β2|⟩+⟨˜Hβ3β3,S(˜Dβ3β3)⟩≤(λ(S(˜Dβ1β1)))Tλ(˜Hβ1β1)+τtrace(˜Hβ2β2)+(λ(S(˜Dβ3β3)))Tλ(˜Hβ3β3)=(−λ(S(˜Dβ1β1)))T(−λ(˜Hβ1β1))+τtrace(˜Hβ2β2)+(λ(S(˜Dβ3β3)))Tλ(˜Hβ3β3)≤τtrace(˜Hβ1β1)+τtrace(˜Hβ2β2)+τtrace(˜Hβ3β3)=τtrace(˜Hββ)=0. |
This means that D∈(Cθ∗(Q,P))∘. Through reversing the prior arguments, it is possible to prove the converse of this statement.
Case 2. σk(P)=0, |Q‖∗=k. Obviously, both (i) and (ii) are equivalent to
˜D=(S(˜D1)ααS(˜D1)αβ1000S(˜D1)β1αS(˜Dβ1β1)00000τI|β2|00000˜Dbb˜Dbc), |
where τ≥0 and satisfies
λ|β1|(S(˜Dβ1β1))≥τ≥σ1([˜Dbb˜Dbc]). |
This result can be obtained in a similar way as in part (ⅱ) of proposition 3.1. For simplicity, the detailed proof is omitted here.
Case 3. σk(P)=0, |Q‖∗<k. According to part (ⅲ) of Lemma 2.1, part (ⅲ) of Lemma 2.2, and part (ⅱ) of Lemma 2.3, we obtain that either ψ∗(P,D)(Q)=0 with D∈Cθ(P,Q) or D∈(Cθ∗(Q,P))∘ is equivalent to S(˜Dβ1β1)∈S|β1|+ and
˜D=(S(˜D1)ααS(˜D1)αβ1000S(˜D1)β1αS(˜Dβ1β1)0000000000000). |
The proof is complete.
In this section, from the dual perspective, we will establish several equivalent characterizations of the robust ICKKTM for problem (1.1). The key tools we use are Propositions 3.1 and 3.2. Indeed, when k=m, the similar results have already been provided in [23]. As a result, we will only make a brief statement here.
The Lagrangian dual of problem (1.1) is presented below:
maxy,w,S−⟨b,y⟩−δP∘(y)−h∗(w),s.t.B∗y+A∗w+S+C=0,‖S‖∗(k)≤1, | (4.1) |
where (y,w,S)∈Rl×Rd×Rm×n, B∗ and A∗ represent the adjoint of B and A, respectively. The KKT conditions for problem (1.1) and problem (4.1) are given by
{A∗∇h(AX)+C+S+B∗y=0,P∘∋y⊥BX−b∈P,S∈∂θ(X),(X,y,S)∈Rm×n×Rl×Rm×n | (4.2) |
and
{B∗y+A∗w+S+C=0,P∘∋y⊥BX−b∈P,AX∈∂h∗(w),X∈∂θ∗(S),(y,w,S,X)∈Rl×Rd×Rm×n×Rm×n, | (4.3) |
respectively. We use MP(X)⊆Rl×Rm×n to represent the set of Lagrangian multipliers with respect to X for problem (1.1), namely, MP(X):={(y,S)∈Rl×Rm×n∣(X,y,S)satisfies(4.2)}. We denote by MD(y,w,S) the set of Lagrangian multipliers with respect to (y,w,S) for problem (4.1), i.e., MD(y,w,S):={X∈Rm×n∣(y,w,S,X)satisfies(4.3)}. Let (y,z)∈Rl×Rl satisfy P∘∋y⊥z∈P. Define the critical cone of P at z for y as CP(z,y):=TP(z)∩y⊥ and the critical cone of P∘ at y for z as CP∘(y,z):=TP∘(y)∩z⊥, respectively.
Let ¯X∈Rm×n be an optimal solution of problem (1.1) and (ˉy,¯S)∈MP(¯X). The primal SOSC is said to hold at ¯X with respect to (ˉy,¯S) if
⟨AD,∇2h(A¯X)AD⟩−ψ∗(¯X,D)(¯S)>0,∀D∈C(¯X)∖{0}, | (4.4) |
where C(¯X):=CP(B¯X−b,ˉy)∩Cθ(¯X,¯S). We say that the primal SRCQ holds at ¯X with respect to (ˉy,¯S), if
(BIRm×n)Rm×n+(CP(B¯X−b,ˉy)Cθ(¯X,¯S))=(RlRm×n). | (4.5) |
Similarly, let (ˉy,ˉw,¯S)∈Rl×Rd×Rm×n be an optimal solution of problem (4.1) and ¯X∈MD(ˉy,ˉw,¯S).
We say that the dual SOSC holds at (ˉy,ˉw,¯S) with respect to ¯X, if
⟨Hw,(∇h∗)′(ˉw;Hw)⟩−φ∗(¯S,HS)(¯X)>0,∀(Hy,Hw,HS)∈C(ˉy,ˉw,¯S)∖{0}, | (4.6) |
where C(ˉy,ˉw,¯S) is the critical cone defined as
C(ˉy,ˉw,¯S):={(Dy,Dw,DS)∈Rl×Rd×Rm×n∣B∗Dy+A∗Dw+DS=0,Dy∈CP∘(ˉy,B¯X−b),DS∈Cθ∗(¯S,¯X)}. |
The dual SRCQ is said to hold at (ˉy,ˉw,¯S) with respect to ¯X if
A∗Rd+B∗CP∘(ˉy,B¯X−b)+Cθ∗(¯S,¯X)=Rm×n. | (4.7) |
The KKT conditions (4.2) for problem (1.1) can be equivalently expressed as the nonsmooth equation
G(X,y,S)=0, |
where G:Rm×n×Rl×Rm×n→Rm×n×Rl×Rm×n is defined by
G(X,y,S):=[A∗∇h(AX)+C+S+B∗yBX−b−ΠP(BX−b+y)X−Proxθ(X+S)],(X,y,S)∈Rm×n×Rl×Rm×n. |
Denote the KKT mapping for problem (1.1) by
SKKT(δ):={(X,y,S)∈Rm×n×Rl×Rm×n∣G(X,y,S)=δ}. |
The Robinson constraint qualification (RCQ) is said to hold at ¯X∈Rm×n for problem (1.1) if
BRm×n+TP(B¯X−b)=Rl. | (4.8) |
At this point, we are fully prepared to present the most crucial conclusion within this paper, namely, a series of equivalent characterizations of the ICKKTM for problem (1.1).
Theorem 4.1. Let ¯X∈Rm×n and (ˉy,ˉw,¯S)∈Rl×Rd×Rm×n be optimal solutions of problem (1.1) and (4.1), respectively. Assume that (ˉy,¯S)∈MP(¯X), ¯X∈MD(ˉy,ˉw,¯S) and the RCQ (4.8) holds at ¯X. Then the subsequent conditions are equivalent:
(a) The KKT mapping SKKT is robustly isolated calm at the origin for (¯X,ˉy,¯S).
(b) The primal SOSC (4.4) holds at ¯X for (ˉy,¯S) and the primal SRCQ (4.5) holds at ¯X for (ˉy,¯S).
(c) The primal SOSC (4.4) holds at ¯X for (ˉy,¯S) and the dual SOSC (4.6) holds at (ˉy,ˉw,¯S) for ¯X.
(d) The dual SRCQ (4.7) holds at (ˉy,ˉw,¯S) for ¯X and the dual SOSC (4.6) holds at (ˉy,ˉw,¯S) for ¯X.
(e) The dual SRCQ (4.7) holds at (ˉy,ˉw,¯S) for ¯X and the primal SRCQ (4.5) holds at ¯X for (ˉy,¯S).
Proof. Since the Ky Fan k-norm function ‖⋅‖(k) is C2-cone reducible [15, Proposition 4.3], "(a)⟺(b)" can be directly derived from [23, Proposition 3.3]. Besides, by Proposition 3.1 and [31, Theorem 4.1], we can obtain that the primal SRCQ (4.5) and the dual SOSC (4.6) are equivalent. Then we have "(b)⟺(c)". Similarly, the equivalence of the primal SOSC (4.4) and the dual SRCQ (4.7) can also be derived from Proposition 3.2 and [23, Theorem 5.1]. It follows that "(b)⟺(e)" and "(c)⟺(d)". In summary, we can conclude that "(a)⟺(b)⟺(c)⟺(d)⟺(e)".
Remark 4.1. Although the conditions ϕ∗(Q,D)(P)=0 and D∈Cθ∗(Q,P) in Proposition 3.1 are not always equivalent to D∈(Cθ(P,Q))∘, this does not affect the equivalence between the primal SRCQ (4.5) and the dual SOSC (4.6) in the proof of the above theorem. For further details, one may refer to [31, Theorem 4.1].
This paper focuses on establishing a more comprehensive set of equivalent conditions of the robust ICKKTM for Ky Fan k-norm regularized convex matrix optimization problems. In other words, we extend the results for nuclear norm regularized convex optimization problems in [23] to Ky Fan k-norm regularized convex optimization problems. Among the newly obtained equivalent conditions, the SRCQ is more easily verified than the SOSC. Therefore, the results presented in this paper not only enrich the stability analysis theory of the Ky Fan k-norm regularized convex optimization problems but also enhance the feasibility of designing related algorithms. The most crucial point is the establishment of Propositions 3.1 and 3.2. Therefore, exploring what kind of θ, that is, seeking the axiomatic properties of θ satisfying the results in Propositions 3.1 and 3.2, is a topic of our future interest. Additionally, integrating these theoretical results into algorithm design to efficiently solve optimization problems remains one of our future research directions.
Ziran Yin: Conceptualization, investigation, writing-review and editing, funding acquisition; Chongyang Liu: Methodology and supervision; Xiaoyu Chen: Original draft preparation; Jihong Zhang: Writing-review and editing; Jinlong Yuan: Validation, writing-review and editing.
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
This work was supported by the National Natural Science Foundation of China [grant numbers 12101101, 12271307]; the China Postdoctoral Science Foundation [grant number 2022M720631]; the Fundamental Research Funds for the Central Universities; the Shandong Natural Science Foundation, China [grant number ZR2023MA054]; the Nature Science Foundation of Liaoning Province of China [grant number 2024-MS-015].
All authors declare no conflicts of interest in this paper.
[1] |
K. C. Toh, L. N. Trefethen, The Chebyshev polynomials of a matrix, SIAM J. Matrix Anal. A., 20 (1998), 400–419. https://doi.org/10.1137/S0895479896303739 doi: 10.1137/S0895479896303739
![]() |
[2] |
P. Apkarian, D. Noll, Nonsmooth H∞ synthesis, IEEE T. Automat. Contr., 51 (2006), 71–86. https://doi.org/10.1109/TAC.2005.860290 doi: 10.1109/TAC.2005.860290
![]() |
[3] |
V. Bompart, D. Noll, P. Apkarian, Second-order nonsmooth optimization for H∞ synthesis, Numer. Math., 107 (2007), 433–454. https://doi.org/10.1007/s00211-007-0095-9 doi: 10.1007/s00211-007-0095-9
![]() |
[4] | J. L. Yuan, D. Y. Yang, D. Y. Xun, K. L. Teo, C. Z. Wu, A. Li, et al., Sparse optimal control of cyber-physical systems via PQA approach, Pac. J. Optim., 2024. |
[5] | A. Johansson, N. Engsner, C. Strannegård, P. Mostad, Improved spectral norm regularization for neural networks, In: Modeling Decisions for Artificial Intelligence, Cham: Springer, 2023,181–201. https://doi.org/10.1007/978-3-031-33498-6_13 |
[6] | R. D. Gao, W. K. Yang, X. Sun, Defying forgetting in continual relation extraction via batch spectral norm regularization, In: 2024 International Joint Conference on Neural Networks (IJCNN), 2024, 1–8. https://doi.org/10.1109/IJCNN60899.2024.10651110 |
[7] |
E. J. Candès, B. Recht, Exact matrix completion via convex optimization, Found. Comput. Math., 9 (2009), 717–772. https://doi.org/10.1007/s10208-009-9045-5 doi: 10.1007/s10208-009-9045-5
![]() |
[8] |
B. Recht, M. Fazel, P. A. Parrilo, Guaranteed minimum rank solutions to linear matrix equations via nuclear norm minimization, SIAM Rev., 52 (2010), 471–501. https://doi.org/10.1137/070697835 doi: 10.1137/070697835
![]() |
[9] |
G. A. Watson, On matrix approximation problems with Ky Fan k norms, Numer. Algor., 5 (1993), 263–272. https://doi.org/10.1007/BF02210386 doi: 10.1007/BF02210386
![]() |
[10] |
S. Boyd, P. diaconis, P. A. Parrilo, L. Xiao, Fastest mixing Markov chain on graphs with symmetries, SIAM J. Optim., 20 (2009), 792–819. https://doi.org/10.1137/070689413 doi: 10.1137/070689413
![]() |
[11] |
Z. K. Yao, F. Y. Xu, G. P. Jiang, J. Y. Yao, Data-driven control of hydraulic manipulators by reinforcement learning, IEEE-ASME T. Mech., 29 (2024), 2673–2684. https://doi.org/10.1109/TMECH.2023.3336070 doi: 10.1109/TMECH.2023.3336070
![]() |
[12] |
X. T. Yang, Q. X. Zhu, H. Wang, Exponential stabilization of stochastic systems via novel event-triggered switching controls, IEEE T. Automat. Contr., 69 (2024), 7948–7955. https://doi.org/10.1109/TAC.2024.3406668 doi: 10.1109/TAC.2024.3406668
![]() |
[13] |
Q. X. Zhu, Event-triggered sampling problem for exponential stability of stochastic nonlinear delay systems driven by Lévy processes, IEEE T. Automat. Contr., 70 (2025), 1176–1183. https://doi.org/10.1109/TAC.2024.3448128 doi: 10.1109/TAC.2024.3448128
![]() |
[14] |
C. Ding, Variational analysis of the Ky Fan k-norm, Set-Valued Var. Anal., 25 (2017), 265–296. https://doi.org/10.1007/s11228-016-0378-3 doi: 10.1007/s11228-016-0378-3
![]() |
[15] | C. Ding, An introduction to a class of matrix optimization problems, Singapore: National University of Singapore, 2012. |
[16] | Y. L. Liu, S. H. Pan, Locally upper Lipschitz of the perturbed KKT system of Ky Fan k-norm matrix conic optimization problems, 2015. https://doi.org/10.48550/arXiv.1509.00681 |
[17] |
D. R. Han, D. F. Sun, L. W. Zhang, Linear rate convergence of the alternating direction method of multipliers for convex composite programming, Math. Oper. Res., 43 (2018), 622–637. https://doi.org/10.1287/moor.2017.0875 doi: 10.1287/moor.2017.0875
![]() |
[18] |
R. T. Rockafellar, Augmented Lagrangians and applications of the proximal point algorithm in convex programming, Math. Oper. Res., 1 (1976), 97–116. https://doi.org/10.1287/moor.1.2.97 doi: 10.1287/moor.1.2.97
![]() |
[19] |
Y. L. Zhang, L. W. Zhang, On the upper Lipschitz property of the KKT mapping for nonlinear semidefinite optimization, Oper. Res. Lett., 44 (2016), 474–478. https://doi.org/10.1016/j.orl.2016.04.012 doi: 10.1016/j.orl.2016.04.012
![]() |
[20] |
Y. Zhang, L. W. Zhang, J. Wu, K. Wang, Characterizations of local upper Lipschitz property of perturbed solutions to nonlinear second-order cone programs, Optimization, 66 (2017), 1079–1103. https://doi.org/10.1080/02331934.2017.1325887 doi: 10.1080/02331934.2017.1325887
![]() |
[21] |
C. Ding, D. F. Sun, L. W. Zhang, Characterization of the robust isolated calmness for a class of conic programming problems, SIAM J. Optim., 5 (2017), 67–90. https://doi.org/10.1137/16M1058753 doi: 10.1137/16M1058753
![]() |
[22] | D. R. Han, D. F. Sun, L. W. Zhang, Linear rate convergence of the alternating direction method of multipliers for convex composite quadratic and semi-definite programming, 2015. https://doi.org/10.48550/arXiv.1508.02134 |
[23] |
Y. Cui, D. F. Sun, A complete characterization of the robust isolated calmness of nuclear norm regularized convex optimization problems, J. Comput. Math., 36 (2018), 441–458. https://doi.org/10.4208/jcm.1709-m2017-0034 doi: 10.4208/jcm.1709-m2017-0034
![]() |
[24] | R. T. Rockafellar, R. J. B. Wets, Variational analysis, Berlin: Heidelberg, 1998. https://doi.org/10.1007/978-3-642-02431-3 |
[25] | J. F. Bonnans, A. Shapiro, Perturbation analysis of optimization problems, New York: Springer, 2000. https://doi.org/10.1007/978-1-4612-1394-9 |
[26] | R. T. Rockafellar, Convex analysis, Princeton university press, 1997. |
[27] | A. Beck, First-order methods in optimization, Society for Industrial and Applied Mathematics, 2017. |
[28] | J. von Neumann, Some matrix-inequalities and metrization of matric-space, 1937. |
[29] |
K. Fan, On a theorem of Weyl concerning eigenvalues of linear transformations I, P. Nat. Acad. Sci. USA, 35 (1949), 652–655. https://doi.org/10.1073/pnas.35.11.652 doi: 10.1073/pnas.35.11.652
![]() |
[30] |
L. W. Zhang, N. Zhang, X. T. Xiao, On the second-order directional derivatives of singular values of matrices and symmetric matrix-valued functions, Set-Valued Var., 21 (2013), 557–586. https://doi.org/557-586.10.1007/s11228-013-0237-4 doi: 10.1007/s11228-013-0237-4
![]() |
[31] | Z. R. Yin, X. Y. Chen, J. H. Zhang, The robust isolated calmness of spectral norm regularized convex matrix optimization problems, 2024. https://doi.org/10.48550/arXiv.2410.16697 |