1. Introduction
Since Lim [1] and Qi [2] presented theories of eigenvalues and eigenvectors for higher order tensors, the related research has received considerable attention (see [3,4,5,6,7,8,9,10,11,12,13], etc.). The eigenvalues of symmetric tensors have been applied in blind source separation [2], hypergraph theory [9], statistical data analysis [14], high order Markov chains [15], etc. Moreover, various definitions of eigenvalues and eigenvectors for tensors have been introduced [2,10,16].
There are many works on computing the eigenvalues of tensors, especially Z-eigenvalues. Qi et al. [17] proposed an elimination method for finding all Z-eigenvalues, which is specific to third-order tensors. Kolda and Mayo [6] presented a shifted power method (SPM) for computing Z-eigenvalues, in which the choice of the shift parameter is crucial. Han [18] provided an unconstrained optimization approach for even order symmetric tensors. Hao, Cui and Dai [4] found the extreme Z-eigenvalues and corresponding Z-eigenvectors by a sequential subspace projection method; under certain assumptions, global convergence and linear convergence were established for symmetric tensors. Hao, Cui and Dai [19] proposed a feasible trust region method for finding the extreme Z-eigenvalues of symmetric tensors, for which global convergence and local quadratic convergence were established.
Inspired by the improved conjugate parameters proposed in related works and by the application of optimization methods to tensor eigenvalue computation, our main work is to transform the Z-eigenvalue problem of symmetric tensors into unconstrained optimization problems and to propose a new algorithm. The contributions of this article are listed as follows:
Depending on the critical point obtained, we transform the Z-eigenvalues of symmetric tensors into different unconstrained optimization problems, including a shifted problem.
We propose a new conjugate gradient method with a new conjugate gradient parameter and an accelerated parameter, which converges to a critical point. A nonzero critical point is a Z-eigenvector associated with a Z-eigenvalue of the symmetric tensor. When the zero critical point is obtained, a shifted problem is solved to find a Z-eigenvalue.
The global convergence of the new method is established. We compare our method with the conjugate gradient methods proposed in [20,21] for computing the Z-eigenvalues of symmetric tensors. The numerical results show that the proposed method is competitive.
The rest of this paper is organized as follows. In Section 2, we transform the Z-eigenvalue problem into unconstrained optimization problems. In Section 3, we propose an accelerated conjugate gradient method for solving them and establish its global convergence. Numerical experiments are reported in Section 4.
2. New method for the Z-eigenvalues of symmetric tensors
Let R be the real field, m, n be positive integers and A = (ai1i2⋯im), with ai1i2⋯im∈R for 1≤i1,i2,⋯,im≤n, be an mth-order n-dimensional real tensor. The set of mth-order n-dimensional real tensors is denoted by R[m,n]. A tensor A is symmetric if its entries are invariant under any permutation of their indices. The set of mth-order n-dimensional real symmetric tensors is denoted by S[m,n].
If A∈R[m,n], then an mth-degree homogeneous polynomial function with real coefficients is uniquely determined by Axm = ∑ ai1i2⋯im xi1xi2⋯xim, where the sum is taken over 1≤i1,i2,⋯,im≤n.
For x=(x1,⋯,xn)T∈Rn, Axm−1 denotes an n-dimensional column vector whose ith component is (Axm−1)i = ∑ aii2⋯im xi2⋯xim, where the sum is taken over 1≤i2,⋯,im≤n, for i=1,⋯,n.
If A is symmetric, then the gradient of Axm satisfies ∇(Axm) = mAxm−1
for all x∈Rn. In our work, we consider the Z-eigenvalues of symmetric tensors, defined as follows.
Definition 1. [2] Let A∈R[m,n]. If there exist λ∈R and a vector x∈Rn∖{0} satisfying
Axm−1 = λx, xTx = 1, (2.1)
then λ is called a Z-eigenvalue of A and x is called the corresponding Z-eigenvector.
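To make the notation concrete, the following minimal numpy sketch shows one way to evaluate Axm and the vector Axm−1 for a symmetric tensor stored as a full m-way array, and to check the residuals of the Z-eigenvalue equation in Definition 1. The helper names (tensor_apply, z_residual) are illustrative and not from the paper.

```python
import numpy as np

def tensor_apply(A, x, k):
    """Contract the m-way array A with the vector x along its last k modes."""
    T = A
    for _ in range(k):
        T = T @ x          # each product contracts the current last axis with x
    return T               # an (m - k)-way array (a scalar when k = m)

def z_residual(A, lam, x):
    """Residuals of the Z-eigenvalue equation A x^{m-1} = lambda*x with x^T x = 1."""
    m = A.ndim
    Axm1 = tensor_apply(A, x, m - 1)                     # the vector A x^{m-1}
    return np.linalg.norm(Axm1 - lam * x), abs(x @ x - 1.0)

# small illustration with a random symmetric third-order tensor (m = 3, n = 4)
n = 4
B = np.random.rand(n, n, n)
A = (B + B.transpose(0, 2, 1) + B.transpose(1, 0, 2) + B.transpose(1, 2, 0)
       + B.transpose(2, 0, 1) + B.transpose(2, 1, 0)) / 6.0   # symmetrize B
x = np.random.rand(n)
x /= np.linalg.norm(x)
lam = tensor_apply(A, x, 3)        # A x^m for the unit vector x
print(z_residual(A, lam, x))       # nonzero in general: a random x is not a Z-eigenvector
```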
Motivated by the work of Auchmuty [22], we generalize the unconstrained variational principles to Z-eigenvalues of A. Consider the following unconstrained optimization problem
min f(x) = (1/(2m))(xTx)m − (1/m)Axm, x∈Rn. (2.2)
The gradient and Hessian of f(x) are given by
∇f(x) = g(x) = (xTx)m−1 x − Axm−1, (2.3)
∇2f(x) = G(x), (2.4)
where
G(x) = (xTx)m−1 I + 2(m−1)(xTx)m−2 xxT − (m−1)Axm−2.
Obviously, G(x) is a symmetric matrix. In order to study the properties of f(x) in (2.2), we recall the following definition and a useful property.
Definition 2. [23] A continuous function h:Rn→R is called coercive if lim‖x‖→+∞ h(x) = +∞. If x satisfies ∇h(x)=0, then it is called a critical point of h(x).
Theorem 1. [23] Let h:Rn→R be continuous. If h is coercive, then h has at least one global minimizer. In addition, if the first partial derivatives exist on Rn, then h attains its global minimizers at its critical points.
Based on a similar argument, for the Z-eigenvalues of tensors, we have the following result.
Theorem 2. Let A∈R[m,n] be a symmetric tensor. Assume that λmax is the largest Z-eigenvalue of A. Denote the Z-spectrum of A by σZ(A):={λ : λ is a Z-eigenvalue of A}. We have
(ⅰ) f(x) is coercive on Rn.
(ⅱ) The critical points of f(x) are at x=0 and any Z-eigenvector x≠0 associated with a Z-eigenvalue λ>0 of A satisfying λ=(xTx)m−1.
(ⅲ) If λmax>0, then f(x) attains its global minimal value −(1/(2m))(λmax)^{m/(m−1)} at any Z-eigenvector associated with the Z-eigenvalue λmax such that λmax=(xTx)m−1.
(ⅳ) If λmax≤0, then x=0 is the unique critical point of f(x). Moreover, it is the unique global minimizer of f(x) on Rn.
Proof. (ⅰ) Since f(x) = (1/(2m))(xTx)m − (1/m)Axm and Axm is an mth-degree homogeneous polynomial function with real coefficients, there exists a constant C>0 such that |Axm| ≤ C‖x‖^m for all x∈Rn, and hence f(x) ≥ (1/(2m))‖x‖^{2m} − (C/m)‖x‖^m → +∞ as ‖x‖→+∞. That is, f(x) is coercive on Rn.
(ⅱ) From the definition of a critical point of f(x), we have
g(x) = (xTx)m−1 x − Axm−1 = 0. (2.5)
It is obvious that x=0 is a critical point of f(x) as g(0)=0. The point x∈Rn∖{0} satisfying (2.5) is a Z-eigenvector corresponding to the Z-eigenvalue λ=(xTx)m−1>0, which is also a critical point of f(x).
(ⅲ) At a critical point x∈Rn∖{0}, a Z-eigenvector x associated with a Z-eigenvalue λ satisfies λ=(xTx)m−1 and Axm=λxTx. Moreover,
f(x) = (1/(2m))(xTx)m − (1/m)Axm = (1/(2m))λ^{m/(m−1)} − (1/m)λ^{m/(m−1)} = −(1/(2m))λ^{m/(m−1)}.
By Theorem 1 and conclusion (ⅱ), we get the global minimum value −(1/(2m))(λmax)^{m/(m−1)} at any Z-eigenvector x corresponding to the Z-eigenvalue λmax such that λmax=(xTx)m−1.
(ⅳ) Since λmax≤0 implies that λ≤0 for any λ∈σZ(A), the relation λ=(xTx)m−1 cannot hold for any Z-eigenvector x associated with a Z-eigenvalue λ, as (xTx)m−1>0 for any x∈Rn∖{0}. Therefore, from Theorem 1, x=0 is the unique critical point and the unique global minimizer of f(x). The proof is completed. □
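For intuition, consider the matrix case m=2: then f(x) = (1/4)(xTx)2 − (1/2)xTAx is exactly Auchmuty's function [22], its nonzero critical points satisfy Ax = (xTx)x, so λ = xTx is an eigenvalue of A, and f(x) = (1/4)λ2 − (1/2)λ2 = −(1/4)λ2, which agrees with conclusions (ⅱ) and (ⅲ).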
Note that, if λmax≤0, then x=0 is the unique critical point, but it does not yield a Z-eigenvalue. In this case, we solve the following shifted problem
where t>0 is a shift parameter. It is obvious that, when t is sufficiently large, we have ft(x)<0 for any x≠0. Since ft(0)=0, x=0 is the unique maximizer of ft(x). Denote the gradient of ft(x) as
Obviously, x=0 is also a critical point of ft(x). A nonzero critical point xt of problem (2.6) is a Z-eigenvector corresponding to the Z-eigenvalue λt=(xtTxt)m−1−t. In this case, a suitable descent algorithm for solving the shifted problem (2.6) should converge to a nonzero critical point. Therefore, we can obtain Z-eigenvalues and their associated Z-eigenvectors by solving problem (2.2) or (2.6). The algorithm is described as follows.
Algorithm 1:
Step 0: Given A∈S[m,n], t≥1, ρ̄>1 and a tolerance ε>0.
Step 1: Solve problem (2.2) by a descent algorithm to obtain xk and compute λk=(xkTxk)m−1. If ‖xk‖>ε, stop and output (λk, xk); otherwise, go to Step 3.
Step 2: If ‖xk‖>ε, stop, output xk and compute λk=(xkTxk)m−1−t; otherwise, let t:=ρ̄t and go to Step 3.
Step 3: Solve problem (2.6) by the same algorithm to obtain xk, and go to Step 2.
Remark 1. There is an inner loop between Step 2 and Step 3. Since a descent algorithm for solving problem (2.6) converges to a nonzero critical point when t is sufficiently large, this inner loop terminates after finitely many iterations. Therefore, Algorithm 1 is well defined.
Remark 2. When executing the algorithm, the same unconstrained optimization method should be used to solve problems (2.2) and (2.6). In the next section we propose a new accelerated conjugate gradient method, especially suited for solving problem (2.2) or (2.6).
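As a minimal sketch of the outer loop of Algorithm 1 (assuming a generic inner descent solver; the handles solve, ft and gradt below are placeholders, not routines from the paper), one might write:

```python
import numpy as np

def algorithm1(solve, f, grad, ft, gradt, m, n,
               t=1.0, rho_bar=2.0, eps=1e-5, x0=None):
    """Sketch of the outer loop of Algorithm 1.
    solve(fun, gfun, x0) is any descent method returning a critical point
    (for instance the ACG method of Section 3); f/grad correspond to problem
    (2.2), while ft(t)/gradt(t) return the objective and gradient of the
    shifted problem (2.6) for a given shift t."""
    if x0 is None:
        x0 = np.random.rand(n)
    x = solve(f, grad, x0)                    # Step 1: solve problem (2.2)
    if np.linalg.norm(x) > eps:               # nonzero critical point found
        return (x @ x) ** (m - 1), x
    while True:                               # inner loop between Steps 2 and 3
        x = solve(ft(t), gradt(t), x0)        # Step 3: solve problem (2.6)
        if np.linalg.norm(x) > eps:           # Step 2: nonzero critical point found
            return (x @ x) ** (m - 1) - t, x
        t *= rho_bar                          # otherwise enlarge the shift and retry
```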
3. An accelerated conjugate gradient algorithm and its convergence
In this section, we present an accelerated conjugate gradient method for solving unconstrained optimization problems such as (2.2) or (2.6). Firstly, we consider the iterative formula of a nonlinear conjugate gradient algorithm,
xk+1 = xk + αkdk, k = 0, 1, ⋯. (3.1)
The stepsize αk>0 is determined by a line search and the direction dk+1 is computed by
where gk+1=g(xk+1) and sk=αkdk.
We first introduce a new conjugate parameter for our conjugate gradient method, based on the CD nonlinear conjugate gradient method [20]. The new conjugate parameter βk+1 is defined in (3.3). When the exact line search is used, βk+1 in (3.3) reduces to the CD conjugate gradient parameter. An accelerated parameter θk+1 is obtained from the quasi-Newton direction [24,25]. Let dk+1=−(Gk+1)−1gk+1, namely
where Gk+1 satisfies the secant equation Gk+1sk = yk, with yk = gk+1 − gk. (3.5)
From (3.4), we have
Pre-multiplying both sides by skT, we have
Then, combining (3.3)–(3.6), we obtain
The stepsize is generated by the strong Wolfe line search conditions
f(xk+αkdk) ≤ f(xk) + ραkgkTdk, (3.8)
|g(xk+αkdk)Tdk| ≤ −σgkTdk, (3.9)
where 0<ρ<σ<1. We prove the descent property of the direction (3.2) under (3.8) and (3.9) in the following. Multiplying both sides of (3.9) by αk and using sk=αkdk, we easily obtain (gk+1Tsk)2 ≤ σ2(gkTsk)2.
Theorem 3. If θk+1 ≥ 2σ2+1/2, then the direction determined by (3.2) satisfies the sufficient descent condition
Proof. Multiplying (3.2) by gk+1T, we obtain
Using the inequality aTb ≤ (‖a‖2+‖b‖2)/2, where a, b∈Rn, we have
Substituting (3.12) into (3.11), we have
Then, from (gk+1Tsk)2 ≤ σ2(gkTsk)2 and θk+1 ≥ 2σ2+1/2, we have
The proof is completed. □
Now, we describe an accelerated conjugate gradient algorithm (ACG).
ACG algorithm
Step 0: Given x0∈Rn, ε≥0 and 0<ρ<σ<1. Compute g0, let d0=−g0. Set k:=0.
Step 1: If ‖gk‖≤ε, stop, output xk. Otherwise, calculate αk from (3.8) and (3.9). Let xk+1=xk+αkdk and sk=αkdk.
Step 2: Compute gk+1, yk, βk+1 by (3.3) and θk+1 by (3.7). Let θk+1 := max{θk+1, 2σ2+1/2}.
Step 3: Use (3.2) to obtain dk+1. Set k:=k+1 and go to Step 1.
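The following Python skeleton illustrates the structure of the ACG iteration under the assumption that the direction takes the common accelerated form dk+1 = −θk+1gk+1 + βk+1sk; the routines beta_new and theta_new are placeholders for the parameters (3.3) and (3.7), whose displayed formulas are not reproduced here, and the strong Wolfe step (3.8)–(3.9) is delegated to scipy.optimize.line_search.

```python
import numpy as np
from scipy.optimize import line_search

def acg(f, grad, x0, beta_new, theta_new,
        eps=1e-5, rho=0.1, sigma=0.5, max_iter=500):
    """Skeleton of the ACG iteration x_{k+1} = x_k + alpha_k d_k.
    beta_new(g_new, g_old, s, y) and theta_new(g_new, g_old, s, y) are
    placeholders for the parameters (3.3) and (3.7); the direction update
    below assumes the form d_{k+1} = -theta_{k+1} g_{k+1} + beta_{k+1} s_k."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g
    for _ in range(max_iter):
        if np.linalg.norm(g) <= eps:          # Step 1: stopping test
            break
        # strong Wolfe line search with 0 < rho < sigma < 1, conditions (3.8)-(3.9)
        alpha = line_search(f, grad, x, d, gfk=g, c1=rho, c2=sigma)[0]
        if alpha is None:                     # line search failed; take a small step
            alpha = 1e-4
        s = alpha * d
        x_new = x + s
        g_new = grad(x_new)
        y = g_new - g
        beta = beta_new(g_new, g, s, y)       # Step 2: conjugate parameter (3.3)
        theta = max(theta_new(g_new, g, s, y), 2.0 * sigma**2 + 0.5)  # safeguard of Step 2
        d = -theta * g_new + beta * s         # Step 3: new direction
        x, g = x_new, g_new
    return x
```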
Now, we establish the convergence result of the ACG algorithm. Assume that the level set Ω={x∈Rn | f(x)≤f(x0)} is a bounded closed set, i.e., there exists a constant γ>0 such that ‖x‖≤γ for all x∈Ω. To facilitate the analysis, denote Ai1=(ai1i2⋯im)1≤i2,i3,⋯,im≤n and Ai1i2=(ai1i2⋯im)1≤i3,i4,⋯,im≤n.
Lemma 1. Let A∈S[m,n]. Then Axm is Lipschitz continuous on Ω.
Proof. Let p(x)=Axm; we argue by mathematical induction on m. If m=1, then p(x)=a1x1+⋯+anxn, and by the equivalence of vector norms, for all x, y∈Ω, we have
Assume that the statement holds for all orders m≤k−1. When m=k, we have
Since Ω is a bounded closed set, ‖Ai1xk−1‖ is bounded on Ω and ‖z‖≤P1. From ∑i1|xi1−yi1|≤‖x−y‖ and ∑i1|yi1|≤‖z‖, there exists a positive constant P3 such that
Namely, Axm is Lipschitz continuous on Ω. The proof is completed. □
Lemma 2. If A∈S[m,n], then Axm−1 and Axm−2 are Lipschitz continuous on Ω.
Proof. For all x, y∈Ω, using Lemma 1 and the equivalence of norms, we have
where P4 depends on the tensor A and the set Ω.
Similarly, for all x, y∈Ω, using Lemma 1 and the equivalence of norms, we have
where P5 depends on the tensor A and the set Ω.
□
Lemma 3. If A∈S[m,n], then g(x) is Lipschitz continuous on a neighbourhood N of Ω, namely
‖g(x)−g(y)‖ ≤ L‖x−y‖
holds for any x, y∈N, where L is a positive constant.
Proof. There are two cases of the gradient g(x) to consider: one is given by (2.3) and the other by (2.7).
For (2.3): since N is a bounded closed set, from Lemma 1, for all x, y∈N, we have
For (2.7): since N is a bounded closed set, from Lemma 1, for all x, y∈N, we have
The proof is completed. □
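As a small numerical illustration of Lemma 3 (a sketch only, assuming the gradient form g(x) = (xTx)m−1 x − Axm−1 stated in (2.3); the helper names are hypothetical), one can estimate a lower bound on the Lipschitz constant of g by sampling:

```python
import numpy as np

def grad_f(A, x):
    """Assumed gradient (2.3): g(x) = (x^T x)^{m-1} x - A x^{m-1}."""
    m = A.ndim
    Axm1 = A
    for _ in range(m - 1):
        Axm1 = Axm1 @ x                       # contract A with x along m-1 modes
    return (x @ x) ** (m - 1) * x - Axm1

def lipschitz_estimate(A, radius=1.0, samples=2000, seed=0):
    """Empirical lower bound on the Lipschitz constant of g over the box [-radius, radius]^n."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    best = 0.0
    for _ in range(samples):
        x = rng.uniform(-radius, radius, n)
        y = rng.uniform(-radius, radius, n)
        dist = np.linalg.norm(x - y)
        if dist > 1e-12:
            best = max(best, np.linalg.norm(grad_f(A, x) - grad_f(A, y)) / dist)
    return best
```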
Since g is continuous on the bounded closed set N, there exists a constant M>0 such that ‖g(x)‖≤M for all x∈N. The following useful lemma was essentially proved by Zoutendijk [26]. From Theorem 3, the sequence {dk} generated by the ACG algorithm satisfies the following lemmas.
Lemma 4. Let the sequences {xk} and {dk} be generated by the ACG algorithm. Then the Zoutendijk condition ∑k≥0 (gkTdk)2/‖dk‖2 < +∞ holds.
Lemma 5. Let the sequence {xk} be generated by the ACG algorithm. If
∑k≥0 1/‖dk‖2 = +∞, (3.17)
then
lim infk→∞ ‖gk‖ = 0.
From Theorem 3, (3.17) and Lemma 4, the result can be proved; the details are omitted here.
Theorem 4. Let the sequence {xk} be generated by the ACG algorithm. Then we have
lim infk→∞ ‖gk‖ = 0. (3.20)
Proof. From Theorem 3, there exists a constant c<0 satisfying gkTsk ≤ c‖gk‖‖sk‖ < 0, i.e., −gkTsk ≥ −c‖gk‖‖sk‖. Then, we have
where ξ = M(1+σ)/(−c). According to |gk+1Tdk| ≤ −σgkTdk and ‖sk‖ ≤ 2γ, we have
Since θk+1 ≥ 2σ2+1/2, we have ykTgk+1>0. Without loss of generality, let ykTgk+1 > κ > 0; then we have
Therefore,
We have
From Lemma 5, it follows that (3.20) holds. The proof is completed. □
Theorem 5. Let problems (2.2) and (2.6) be solved by the ACG algorithm. Then the ACG algorithm is well defined.
Proof. In the ACG algorithm, we obtain the Z-eigenvalues of symmetric tensors by solving problem (2.2) or (2.6). When problem (2.6) is solved, it means that solving problem (2.2) resulted in the zero critical point. Then the ACG algorithm turns to solving problem (2.6), which converges to a nonzero critical point. Moreover, since the convergence of our algorithm is guaranteed, the termination criterion is eventually satisfied. That is, the ACG algorithm is well defined. The proof is completed. □
4. Numerical experiments
In this section, we report the numerical performance of the ACG algorithm for solving problems (2.2) and (2.6). For convenience, we provide a table of abbreviations for the methods in Table 1.
We compare ACG with HS and PRP, which have been reported to be very efficient for unconstrained optimization. All experiments were performed on a PC with a 2.40 GHz CPU and 2.00 GB RAM using MATLAB R2013a. In the implementation of the ACG algorithm, we set the parameters ε=10−5, ρ=0.1, σ=0.5, t=1, ρ̄=2. In Table 2, Ex is the number of the example, n is the dimension, k is the number of iterations, CPU stands for the time cost of the algorithms (in seconds), and λ∗ stands for the Z-eigenvalue output by the algorithms. All algorithms share the same starting points and stopping criteria. In the following examples, the tensors A are originally from [27].
Example 1. Let A∈S[3,n] be defined by
Example 2. Let A∈S[3,n] be defined by
Example 3. Let A∈S[4,n] be defined by
Example 4. Let A∈S[4,3] be defined by
Example 5. Let A∈S[4,n] be defined by
Example 6. Let A∈S[4,n] be defined by
In Tables 2 and 3, we compare the numerical results of the ACG algorithm with those of the SPM algorithm (in [6]) and the PRP, HS and CD methods. Although the computed Z-eigenvectors x∗ associated with λ∗ are not shown, the values of ‖gk‖ are listed, and they satisfy the given precision. This implies that (λ∗, x∗) can be regarded as true solutions of problem (2.1). The methods reach the same Z-eigenvalues for problems of the same dimension. The ACG algorithm requires fewer iterations and less CPU time than the SPM algorithm. That is, the ACG algorithm is competitive for computing Z-eigenvalues of symmetric tensors.
The number of iterations (k) and the CPU time (CPU) are important factors in assessing the numerical performance of a given optimization method. Therefore, we employ the performance profiles introduced by Dolan and Moré [28] to analyze the efficiency of the ACG, PRP, HS [20], QN [29], ATTCG and ADL [30,31] methods, with the following conjugate gradient parameters, respectively,
Let Y and W be the set of methods and the set of test problems, and let ny and nw be the number of methods and the number of test problems, respectively. For each y∈Y and w∈W, let aw,y>0 denote the number of iterations k or the CPU time required to solve problem w by method y. The performance profile ψ:R→[0,1] of method y is then obtained by
ψy(τ) = size{w∈W : rw,y ≤ τ}/nw,
where τ>0, size{⋅} is the number of elements in a set, and rw,y is the performance ratio defined as
rw,y = aw,y / min{aw,y : y∈Y}.
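The following short Python sketch computes such Dolan–Moré profiles from a matrix of recorded costs (iterations or CPU time); the penalty convention for failed runs described below is assumed to have been applied already, and the function name is illustrative.

```python
import numpy as np

def performance_profile(a, taus):
    """Dolan-More performance profile.
    a[w, y] > 0 is the cost (iterations or CPU time) of method y on problem w;
    returns psi[t, y], the fraction of problems with ratio r[w, y] <= taus[t]."""
    a = np.asarray(a, dtype=float)
    r = a / a.min(axis=1, keepdims=True)               # performance ratios r_{w,y}
    taus = np.asarray(taus, dtype=float)
    return (r[None, :, :] <= taus[:, None, None]).mean(axis=1)

# illustrative use: 3 problems, 2 methods (a failed run already set to its penalty value)
costs = np.array([[10.0,  12.0],
                  [ 5.0, 200.0],    # method 2 failed on problem 2 (CPU penalty of 200 s)
                  [ 8.0,   7.0]])
print(performance_profile(costs, taus=np.linspace(1.0, 4.0, 4)))
```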
In a performance profile plot, the value of ψ at τ=1 gives the percentage of the test problems for which a method is the fastest (efficiency), while the value for large τ gives the percentage of the test problems that the method successfully solves (robustness); the top curve therefore corresponds to the method that solves the most problems within a given factor of the best time. If a program fails, or the number of iterations reaches more than 500, the run is regarded as a failure, and we record the number of iterations as 500 and the CPU time as 200 seconds. In this way, only the ACG algorithm can solve all test problems.
Figure 1 shows the CPU time performance of the ACG algorithm and the other algorithms. It can be seen that when τ>3 the curves of the ACG and ATTCG algorithms are similar, while when τ>3.5 the curves of the ACG and ADL algorithms become stable and coincide. Figure 2 shows that the ACG algorithm is better than the other algorithms in terms of the number of iterations; in particular, when τ>2 the curve of the ACG algorithm becomes stable, which indicates that the ACG algorithm solves the problems with fewer iterations. Therefore, Figures 1 and 2 show that the ACG algorithm proposed in this paper converges to the solution quickly.
5. Conclusions
We constructed unconstrained optimization problems with a shift parameter. Based on these shifted unconstrained optimization problems, we presented an accelerated conjugate gradient method using the quasi-Newton direction for solving them. Furthermore, we gave a global convergence analysis of the proposed algorithm. Numerical experiments demonstrated that our method has good numerical performance. We further note that the proposed algorithm can be applied in other settings, such as symmetric systems of nonlinear equations. In future work, we will consider new methods that incorporate randomization techniques.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (12061087), the Scientific and Technological Developing Scheme of Jilin Province (YDZJ202101ZYTS167, YDZJ202201ZYTS303), the Yunnan Provincial Ten Thousands Plan Young Top Talents, and the Project of the Education Department of Jilin Province (JJKH20210030KJ).
Conflict of interest
The authors declare no conflicts of interest.