Citation: Chun Wen, Qian-Ying Hu, Bing-Yuan Pu, Yu-Yun Huang, Acceleration of an adaptive generalized Arnoldi method for computing PageRank, AIMS Mathematics, 2021, 6(1): 893-907. doi: 10.3934/math.2021053
Using the hyperlink structure of web pages, Google's PageRank has become one of the most successful methods for measuring the importance of each page [1]. From the viewpoint of numerical computation, the core of the PageRank problem is computing the dominant eigenvector of the Google matrix A:
$$Ax=x,\qquad A=\alpha P+(1-\alpha)ve^{T},\qquad \|x\|_{1}=1, \qquad (1.1)$$
where x ∈ R^n is the PageRank vector, α ∈ (0,1) is the damping factor, e = [1, 1, ⋯, 1]^T ∈ R^n, v = e/n, and P ∈ R^{n×n} is a column-stochastic matrix; see [2] for details.
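As an illustration of (1.1), the following minimal Python sketch (the function name and the toy matrix are ours, not from the paper) applies A to a vector without forming the dense rank-one term: since e^T x is just the sum of the entries of x, a product with A reduces to one sparse product plus a scaled copy of v. Here P is assumed to already include the usual dangling-node correction.

```python
import numpy as np
import scipy.sparse as sp

def google_matvec(P, alpha, v, x):
    """One product with the Google matrix A = alpha*P + (1-alpha)*v*e^T
    from (1.1), without forming the dense rank-one term."""
    return alpha * (P @ x) + (1.0 - alpha) * v * x.sum()

# Example: a tiny 3-node column-stochastic P (each column sums to 1).
P = sp.csc_matrix(np.array([[0.0, 0.5, 1.0],
                            [0.5, 0.0, 0.0],
                            [0.5, 0.5, 0.0]]))
n = 3
v = np.full(n, 1.0 / n)
x = google_matvec(P, 0.85, v, v)   # one power step from x = e/n
```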
As an iterative method based on matrix-vector products, the power method is widely used for computing PageRank [1,3]. However, when the damping factor α is close to 1, the power method converges slowly, and various acceleration techniques have been developed. For example, based on the inner-outer iteration method proposed by Gleich et al. [4], Tian et al. [5] developed a general inner-outer iteration method for solving PageRank problems. Using the trace of the Google matrix A, Tan [6] introduced an extrapolation strategy and presented the power method with an extrapolation process based on the trace (PET) to improve the computation of PageRank.
On the other hand, Krylov subspace methods based on the Arnoldi process have also been applied to PageRank problems. Golub and Greif [7] proposed an Arnoldi-type method using the singular value decomposition (SVD), in which the known largest eigenvalue 1 is used as a shift so that the computation of the largest Ritz value is avoided. Wu and Wei [8] developed a Power-Arnoldi algorithm by periodically combining the power method with the thick restarted Arnoldi algorithm [9]. Hu et al. [10] proposed a variant of the Power-Arnoldi algorithm that employs the PET method.
Recently, the idea of introducing weighted inner products into the Arnoldi process has been applied successfully in many fields [11,12]. Yin et al. [13] proposed an adaptive generalized Arnoldi (GArnoldi) method for computing PageRank by applying a weighted inner product to an Arnoldi-type method. Wen et al. [14] developed an adaptive Power-GArnoldi algorithm by using the power method and the adaptive GArnoldi method together. Motivated by these works, and with the aim of accelerating the adaptive GArnoldi method, we propose a new method that periodically knits the PET method with the adaptive GArnoldi method for PageRank problems; we call it the GArnoldi-PET method. The convergence of the proposed method is studied in detail, and numerical results demonstrate its feasibility and effectiveness.
The remainder of this paper is organized as follows. In Section 2, we briefly introduce the PET method and the adaptive GArnoldi method for PageRank problems. In Section 3, we propose the GArnoldi-PET method and discuss its convergence. In Section 4, numerical results and comparisons are reported. Finally, conclusions are given in Section 5.
In this section, we briefly introduce the PET method and the adaptive GArnoldi method for computing PageRank.
Here we first give the algorithmic version of the PET method for PageRank problems; see [6] for more details.
Algorithm 1. The PET method
Input: an initial guess x^(0), a prescribed tolerance tol, a positive integer m_1; set r = 1 and k = 0.
Output: PageRank vector x.
1. Compute the number of dangling nodes l and μ = 1 + α(l/n − 1).
2. Run the power iteration m_1 steps to obtain x^(m_1−1) and x^(m_1):
 2.1. for i = 1 : m_1
 2.2.  x^(i) = A x^(i−1);
 2.3.  r = ‖x^(i) − x^(i−1)‖_2;
 2.4.  x^(i) = x^(i)/‖x^(i)‖_1;
 2.5.  if r ≤ tol, break; endif
 2.6. end
3. Apply the extrapolation scheme based on x^(m_1−1), x^(m_1) and μ:
 3.1. x^(0) = x^(m_1) − (μ − 1) x^(m_1−1);
 3.2. x^(0) = x^(0)/‖x^(0)‖_1;
 3.3. r = ‖x^(0) − x^(m_1)‖_2;
 3.4. if r ≤ tol, stop; else, goto step 2; endif
Now, some remarks on Algorithm 1 are given as follows.
● In step 1, the parameter μ is the trace of the Google matrix A.
● In step 2, the power method is run for m_1 steps; that is, the extrapolation technique is not applied at every power iteration, but once every m_1 power iterations.
● In step 3, the extrapolation strategy based on x^(m_1−1), x^(m_1) and μ is easy to implement, as given in step 3.1.
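For concreteness, the following is a minimal Python/NumPy sketch of Algorithm 1; the helper name `matvec` and the default parameter values are our assumptions, not the authors' code. `matvec` applies the Google matrix A, e.g., via the `google_matvec` sketch above.

```python
import numpy as np

def pet(matvec, n, alpha, l, m1=40, tol=1e-8, max_sweeps=10**4):
    """Sketch of the PET method (Algorithm 1). `matvec` applies the
    Google matrix A, l is the number of dangling nodes, and m1 is the
    extrapolation period."""
    mu = 1.0 + alpha * (l / n - 1.0)      # trace of the Google matrix A
    x = np.full(n, 1.0 / n)               # initial guess x^(0) = e/n
    for _ in range(max_sweeps):
        # Step 2: m1 power iterations.
        for _ in range(m1):
            x_prev = x
            x = matvec(x_prev)
            r = np.linalg.norm(x - x_prev, 2)
            x /= np.linalg.norm(x, 1)
            if r <= tol:
                return x
        # Step 3: trace-based extrapolation using x^(m1-1) and x^(m1).
        x_new = x - (mu - 1.0) * x_prev
        x_new /= np.linalg.norm(x_new, 1)
        if np.linalg.norm(x_new - x, 2) <= tol:
            return x_new
        x = x_new
    return x

# Usage with the toy P from the sketch above (no dangling nodes, l = 0):
# x = pet(lambda y: google_matvec(P, 0.85, v, y), n=3, alpha=0.85, l=0)
```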
As described in [13], let G = (g_{ij}) be an n×n symmetric positive definite (SPD) matrix. The GArnoldi process based on the weighted inner product is presented as Algorithm 2.
Algorithm 2. The GArnoldi process
Input: an initial vector v_1, the number of steps m of the GArnoldi process, and an SPD matrix G.
Output: V_m, H_m.
1. Compute ṽ_1 = v_1/‖v_1‖_G.
2. for j = 1, 2, ⋯, m
3.  q = A ṽ_j;
4.  for i = 1, 2, ⋯, j
5.   h_{i,j} = (q, ṽ_i)_G; q = q − h_{i,j} ṽ_i;
6.  end
7.  h_{j+1,j} = ‖q‖_G;
8.  if h_{j+1,j} = 0, break; endif
9.  ṽ_{j+1} = q/h_{j+1,j};
10. end
In Algorithm 2, (⋅,⋅)_G is the G-inner product defined by (x, y)_G = x^T G y for all x, y ∈ R^n, and ‖⋅‖_G is the G-norm defined by

$$\|x\|_G=\sqrt{(x,x)_G}=\sqrt{x^TGx}=\|D^{1/2}Qx\|_2, \qquad (2.1)$$

where Q ∈ R^{n×n} is an orthogonal matrix, D = diag{d_1, d_2, ⋯, d_n} is an n×n diagonal matrix with d_i > 0, i = 1, 2, ⋯, n, and G = Q^T D Q is a diagonalized decomposition of G. Let e_m ∈ R^m be the m-th coordinate vector; then the GArnoldi process satisfies the following relations [13]:
$$AV_m=V_mH_m+h_{m+1,m}\tilde v_{m+1}e_m^T=V_{m+1}H_{m+1,m},\qquad V_m^TGAV_m=H_m,\qquad H_{m+1,m}=\begin{pmatrix}H_m\\ h_{m+1,m}e_m^T\end{pmatrix},$$

where the matrix V_k = [ṽ_1, ṽ_2, ⋯, ṽ_k] (k = m, m+1) is an n×k G-orthogonal matrix and H_m = (h_{ij}) is an m×m upper Hessenberg matrix.
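A minimal Python sketch of Algorithm 2 for the diagonal case G = diag(d) used below (function and variable names are ours):

```python
import numpy as np

def garnoldi(matvec, v1, m, d):
    """Sketch of the GArnoldi process (Algorithm 2) for G = diag(d),
    so that (x, y)_G = x @ (d * y). Returns V_{m+1} (n x (m+1), with
    G-orthonormal columns) and H_{m+1,m} ((m+1) x m)."""
    n = v1.shape[0]
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    g_norm = lambda x: np.sqrt(x @ (d * x))     # ||x||_G for diagonal G
    V[:, 0] = v1 / g_norm(v1)
    for j in range(m):
        q = matvec(V[:, j])
        for i in range(j + 1):                  # Gram-Schmidt in the G-inner product
            H[i, j] = q @ (d * V[:, i])
            q = q - H[i, j] * V[:, i]
        H[j + 1, j] = g_norm(q)
        if H[j + 1, j] == 0.0:                  # breakdown: invariant subspace found
            break
        V[:, j + 1] = q / H[j + 1, j]
    return V, H
```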
From Algorithm 2, it is clear that different SPD matrices G lead to different GArnoldi methods. Since every SPD matrix can be diagonalized, for simplicity we let G = diag{d_1, d_2, ⋯, d_n} with d_i > 0, i = 1, 2, ⋯, n. In each outer iteration of the GArnoldi method, we seek a vector v from the Krylov subspace K_m(A, v_1) = span(v_1, Av_1, ⋯, A^{m−1}v_1) that minimizes ‖Av − v‖_G. Writing r = Av − v ≜ [r_1, r_2, ⋯, r_n]^T, we have $\min\|Av-v\|_G=\min\sqrt{\sum_{i=1}^nd_ir_i^2}$, a weighted least-squares problem in which d_i is the weight of the i-th residual component r_i, i = 1, 2, ⋯, n. To speed up the computation of PageRank, Yin et al. [13] change the weights adaptively according to the residual of the current approximate PageRank vector. One such choice of G is
$$G=\mathrm{diag}\{d_1,d_2,\cdots,d_n\},\qquad d_i=|r_i|/\|r\|_1,\quad i=1,2,\cdots,n,$$

where r is the residual vector from the previous cycle, so that $\sum_{i=1}^nd_i=1$. The algorithmic version of the adaptive GArnoldi method for computing PageRank is presented as Algorithm 3.
Algorithm 3. The adaptive GArnoldi method
Input: an initial vector x^(0), the number of steps m of the GArnoldi process, a prescribed tolerance tol.
Output: PageRank vector x.
1. Set G = I.
2. for i = 1, 2, ⋯, until convergence
3.  Run Algorithm 2 to compute V_m, V_{m+1} and H_{m+1,m}.
4.  Compute the singular value decomposition UΣS^T = H_{m+1,m} − [I; 0].
5.  Compute x = V_m s_m and r = σ_m V_{m+1} u_m.
6.  if ‖r‖_2 ≤ tol, break; endif
7.  Set G = diag{|r|/‖r‖_1}.
8. end
Note that in step 5 of Algorithm 3, s_m and u_m denote the right and left singular vectors of H_{m+1,m} − [I; 0] associated with its smallest singular value σ_m.
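The following Python sketch of Algorithm 3 reuses the `garnoldi` sketch above; it is an illustration under our naming assumptions, not the authors' implementation. The starting vector x^(0) = e/n is assumed.

```python
import numpy as np

def adaptive_garnoldi(matvec, n, m=5, tol=1e-8, max_cycles=500):
    """Sketch of the adaptive GArnoldi method (Algorithm 3)."""
    x = np.full(n, 1.0 / n)
    d = np.ones(n)                                # step 1: G = I
    for _ in range(max_cycles):
        V, H = garnoldi(matvec, x, m, d)          # step 3
        # Step 4: SVD of H_{m+1,m} - [I; 0].
        E = np.vstack([np.eye(m), np.zeros((1, m))])
        U, S, Vt = np.linalg.svd(H - E, full_matrices=False)
        # Step 5: refined approximation and residual (smallest triplet).
        x = V[:, :m] @ Vt[-1, :]                  # x = V_m s_m
        r = S[-1] * (V @ U[:, -1])                # r = sigma_m V_{m+1} u_m
        if np.linalg.norm(r, 2) <= tol:           # step 6
            break
        d = np.abs(r) / np.linalg.norm(r, 1)      # step 7: new weights
    x *= np.sign(x.sum())                         # fix the SVD sign ambiguity
    return x / np.linalg.norm(x, 1)
```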
In order to accelerate the computation of PageRank problems, we develop a new method by combining the PET method with the adaptive GArnoldi method. The new method is called the GArnoldi-PET method. We first describe its construction and then discuss its convergence.
As described in subsection 2.1, an extrapolation strategy based on the trace of the Google matrix A has been presented to speed up the convergence of the power method. Numerical experiments in [6] illustrate that the PET method converges faster than the power method when the damping factor α is close to 1. On the other hand, since an Arnoldi step is more computationally intensive than the same number of power iterations [8], it is natural to use the trace-based extrapolation strategy and the adaptive GArnoldi method together.
Similar to the construction of the Power-Arnoldi algorithm [8], the mechanism of our proposed method is as follows. Given a unit positive vector x^(0), an approximate PageRank vector is obtained by iterating the adaptive GArnoldi method (Algorithm 3) a few times (e.g., 2–3 times). If this approximation does not satisfy the prescribed tolerance, we run the PET method with the resulting vector as the initial guess to obtain another approximate vector. If this approximation still does not reach the required accuracy, we return to Algorithm 3 with the new approximation as the starting vector. This procedure is repeated until the prescribed accuracy is achieved.
A remaining question is how to control the switching between the PET method and the adaptive GArnoldi method. Many strategies have been developed for this purpose. Here, as in [8], three parameters β, restart and maxit are used to control the procedure. Let τ^(curr) be the residual norm of the current iteration and τ^(prev) that of the previous iteration, and compute ratio = τ^(curr)/τ^(prev). If ratio > β, then restart = restart + 1; if restart ≥ maxit, the PET method is terminated and the adaptive GArnoldi method is triggered. The specific implementation of the GArnoldi-PET method is given as Algorithm 4.
Algorithm 4. The GArnoldi-PET method
Input: an initial guess x^(0), the dimension m of the Krylov subspace, a prescribed tolerance tol, the parameters β, maxit and m_1. Set k = 1, restart = 0, τ = 1, τ_0 = τ, τ_1 = τ.
Output: PageRank vector x.
1. Compute the number of dangling nodes l and μ = 1 + α(l/n − 1).
2. Run Algorithm 3 a few times (2–3 times): iterate all steps of Algorithm 3 on the first run and steps 2–8 otherwise. If the approximation is satisfactory, stop; else continue.
3. Run the modified PET method with the resulting vector x̃_1 from the adaptive GArnoldi method as the initial guess:
 3.1. restart = 0;
 3.2. while restart < maxit & τ > tol
 3.3.  ratio = 0; τ_0 = τ; τ_1 = τ;
 3.4.  while ratio < β & τ > tol
 3.5.   x^(k) = A x^(k−1); x^(k) = x^(k)/‖x^(k)‖_1;
 3.6.   r = x^(k) − x^(k−1); τ = ‖r‖_2;
 3.7.   if mod(k, m_1) = 0
 3.8.    x^(0) = x^(k) − (μ − 1) x^(k−1); x^(0) = x^(0)/‖x^(0)‖_1;
 3.9.    r = x^(0) − x^(k); τ = ‖r‖_2; x^(k) = x^(0);
 3.10.    if τ ≤ tol, break; endif
 3.11.   end
 3.12.   ratio = τ/τ_0; τ_0 = τ; k = k + 1;
 3.13.  end
 3.14.  if τ/τ_1 > β, restart = restart + 1; endif
 3.15. end
 3.16. if τ ≤ tol, stop; else set G = diag{|r|/‖r‖_1} and goto step 2.
Now, some remarks about the GArnoldi-PET method are given as follows.
● As shown in step 3.16, the matrix G is adaptively updated according to the current residual.
● By construction, the PET method can naturally be viewed as an acceleration technique for the adaptive GArnoldi method.
● In each cycle of the GArnoldi-PET method, the storage requirement is approximately m+1 length-n vectors in the adaptive GArnoldi method plus two such vectors in the PET method. The main computational cost consists of m matrix-vector products and m(m+1)/2 inner products in the adaptive GArnoldi method, plus one matrix-vector product per PET iteration.
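Putting the pieces together, the following sketch illustrates the overall driver of Algorithm 4, reusing the `garnoldi` sketch above; the default parameters follow Example 1 of Section 4 and β = α − 0.1 as in the experiments there. It is an illustration under our naming assumptions, not the authors' MATLAB code.

```python
import numpy as np

def garnoldi_pet(matvec, n, alpha, l, m=5, m1=40, maxit=6,
                 tol=1e-8, max_outer=100):
    """Sketch of the GArnoldi-PET method (Algorithm 4)."""
    beta = alpha - 0.1
    mu = 1.0 + alpha * (l / n - 1.0)
    x = np.full(n, 1.0 / n)
    d = np.ones(n)
    for _ in range(max_outer):
        # Step 2: run the adaptive GArnoldi method a few times.
        for _ in range(2):
            V, H = garnoldi(matvec, x, m, d)
            E = np.vstack([np.eye(m), np.zeros((1, m))])
            U, S, Vt = np.linalg.svd(H - E, full_matrices=False)
            x = V[:, :m] @ Vt[-1, :]
            r = S[-1] * (V @ U[:, -1])
            if np.linalg.norm(r, 2) <= tol:
                return np.abs(x) / np.linalg.norm(x, 1)
            d = np.abs(r) / np.linalg.norm(r, 1)
        # Step 3: the modified PET iteration with restart control.
        x = np.abs(x) / np.linalg.norm(x, 1)   # |x| resolves the SVD sign ambiguity
        tau, restart, k = 1.0, 0, 1
        while restart < maxit and tau > tol:
            ratio, tau0, tau1 = 0.0, tau, tau
            while ratio < beta and tau > tol:
                x_prev = x
                x = matvec(x_prev)
                x /= np.linalg.norm(x, 1)
                r = x - x_prev
                tau = np.linalg.norm(r, 2)
                if k % m1 == 0:                # extrapolate every m1 steps
                    x_new = x - (mu - 1.0) * x_prev
                    x_new /= np.linalg.norm(x_new, 1)
                    r = x_new - x
                    tau = np.linalg.norm(r, 2)
                    x = x_new
                    if tau <= tol:
                        break
                ratio = tau / tau0
                tau0 = tau
                k += 1
            if tau / tau1 > beta:
                restart += 1
        if tau <= tol:
            return x
        d = np.abs(r) / np.linalg.norm(r, 1)   # weights for the next GArnoldi cycle
    return x
```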
Here we analyze the convergence of the GArnoldi-PET algorithm. In particular, we focus on the stage where the iteration turns from the PET method to the adaptive GArnoldi method.
Let σ(A) denote the set of eigenvalues of the Google matrix A, arranged as 1 = |λ_1| > |λ_2| ≥ ⋯ ≥ |λ_n|. Let L_{m−1} be the set of polynomials of degree not exceeding m−1, and let K_m(A, v_1) be a Krylov subspace. If (λ_i, φ_i), i = 1, 2, ⋯, n, are the eigenpairs of A and (λ̃_j, ỹ_j), j = 1, 2, ⋯, m, are the eigenpairs of H_m, then in the standard Arnoldi method λ̃_j is used to approximate λ_j and φ̃_j = V_m ỹ_j is used to approximate φ_j. However, instead of using the Ritz vectors φ̃_j as approximate eigenvectors, Jia [15] proposed a new strategy in which, for each λ̃_j, a unit-norm vector ũ_j ∈ K_m(A, v_1) satisfying

$$\|(A-\tilde\lambda_jI)\tilde u_j\|_2=\min_{u\in K_m(A,v_1)}\|(A-\tilde\lambda_jI)u\|_2 \qquad (3.1)$$

is used to approximate φ_j. Here ũ_j is called a refined approximate eigenvector corresponding to λ_j. The convergence of the refined Arnoldi method is given as follows.
Theorem 1 [15]. Assume that $v_1=\sum_{i=1}^n\gamma_ix_i$ with respect to the eigenbasis $\{x_i\}_{i=1,2,\cdots,n}$, in which ‖x_i‖_2 = 1, i = 1, 2, ⋯, n, and γ_i ≠ 0. Let S = [x_1, x_2, ⋯, x_n] and

$$\xi_j=\sum_{i\neq j}\frac{|\lambda_i-\tilde\lambda_j|\cdot|\gamma_i|}{|\gamma_j|}.$$

Then

$$\|(A-\tilde\lambda_jI)\tilde u_j\|_2\le\frac{\sigma_{\max}(S)}{\sigma_{\min}(S)}\left(|\lambda_j-\tilde\lambda_j|+\xi_j\min_{p\in L_{m-1},\,p(\lambda_j)=1}\max_{i\neq j}|p(\lambda_i)|\right),$$

where ũ_j is the refined approximate eigenvector above, and σ_max(S) and σ_min(S) are the largest and smallest singular values of the matrix S, respectively.
Before analyzing the convergence of the GArnoldi-PET algorithm, we introduce some useful results.
Lemma 1 [14]. Let G = diag{d_1, d_2, ⋯, d_n}, d_i > 0, 1 ≤ i ≤ n, be a diagonal matrix. Then for any vector x ∈ R^n, by the definitions of the G-norm and the 2-norm,

$$\min_{1\le i\le n}d_i\cdot\|x\|_2^2\le\|x\|_G^2\le\max_{1\le i\le n}d_i\cdot\|x\|_2^2. \qquad (3.2)$$
Lemma 2 [10]. Let v_1 be the initial vector for the PET method, obtained from the previous adaptive GArnoldi cycle. Then the PET iteration in Algorithm 4 produces the vector

$$v_1^{new}=\eta T^kv_1,\qquad T=A^{m_1-1}[A-(\mu-1)I], \qquad (3.3)$$

where k ≥ maxit, η is a normalizing factor, μ is the trace of the matrix A, m_1 is a given number, T is called the iteration matrix, and I is the n×n identity matrix.
Theorem 2 [16]. Assume that the spectrum of the column-stochastic matrix P is {1, λ_2, ⋯, λ_n}. Then the spectrum of the matrix A = αP + (1−α)ve^T is {1, αλ_2, ⋯, αλ_n}, where 0 < α < 1 and v is a vector with nonnegative elements such that e^Tv = 1.
Theorem 3 [17]. Let P be an n×n column-stochastic matrix, and let α be a real number with 0 < α < 1. Let E = ve^T be the n×n rank-one column-stochastic matrix, where e is the n-vector of all ones and v is an n-vector whose elements are nonnegative and sum to 1. Then the column-stochastic matrix A = αP + (1−α)E has dominant eigenvalue λ_1 = 1 and |λ_2| ≤ α.
In the next cycle of the GArnoldi-PET method, v_1^{new} is used as the initial vector of an m-step GArnoldi process, so that the new Krylov subspace

$$K_m(A,v_1^{new})=\mathrm{span}(v_1^{new},Av_1^{new},\cdots,A^{m-1}v_1^{new})$$
is constructed. The following theorem shows the convergence of the GArnoldi-PET method.
Theorem 4. Assume that $v_1=\sum_{i=1}^n\gamma_ix_i$ with respect to the eigenbasis $\{x_i\}_{i=1,2,\cdots,n}$, in which ‖x_i‖_2 = 1, i = 1, 2, ⋯, n, and γ_1 ≠ 0. Let G = diag{d_1, d_2, ⋯, d_n}, d_i > 0, i = 1, 2, ⋯, n, S = [x_1, x_2, ⋯, x_n], and

$$\xi=\sum_{i=2}^n\frac{|\lambda_i-1|\,|\gamma_i|}{|\gamma_1|},\qquad\zeta=\sqrt{\frac{\max_{1\le i\le n}d_i}{\min_{1\le i\le n}d_i}}.$$

Then

$$\|(A-I)u\|_G\le\frac{\xi\cdot\zeta}{\sigma_{\min}(S)}\cdot\left(\frac{\alpha^{m_1-1}(\alpha-\mu+1)}{2-\mu}\right)^k\cdot\min_{p\in L_{m-1},\,p(\lambda_1)=1}\ \max_{\lambda\in\sigma(A)\setminus\{\lambda_1\}}|p(\lambda)|,$$

where u is taken from the Krylov subspace K_m(A, v_1^{new}), μ = 1 + α(l/n − 1) with l the number of dangling nodes, k ≥ maxit, and σ_min(S) is the smallest singular value of the matrix S.
Proof. Let {1, π_2, ⋯, π_n} be the eigenvalue set of the matrix P; then {1, λ_2 = απ_2, ⋯, λ_n = απ_n} is the eigenvalue set of the matrix A by Theorem 2. According to (3.3), we have

$$Tx_i=\begin{cases}(2-\mu)x_i,& i=1,\\ \lambda_i^{m_1-1}(\lambda_i-\mu+1)x_i,& i=2,3,\cdots,n.\end{cases}$$
Let φ_i (i = 1, 2, ⋯, n) be the eigenvalues of the iteration matrix T. Then

$$\varphi_i=\begin{cases}2-\mu,& i=1,\\ \lambda_i^{m_1-1}(\lambda_i-\mu+1),& i=2,3,\cdots,n.\end{cases} \qquad (3.4)$$
Since μ = 1 + α(l/n − 1) with l the number of dangling nodes, we have 1 − μ = α(1 − l/n) ≥ 0. On the other hand, by Theorem 3 and 1 = |λ_1| > |λ_2| ≥ ⋯ ≥ |λ_n|, we have |λ_i| ≤ α and

$$-\alpha+\mu-1\le-\alpha+1-\mu\le\lambda_i+1-\mu\le\alpha+1-\mu,\qquad i=2,\cdots,n.$$

Hence we obtain |λ_i − μ + 1| ≤ α − μ + 1, i = 2, ⋯, n, and

$$|\varphi_i|\le\alpha^{m_1-1}(\alpha-\mu+1)<2-\mu,\qquad i=2,3,\cdots,n. \qquad (3.5)$$
For any u ∈ K_m(A, v_1^{new}), there exists q(x) ∈ L_{m−1} such that

$$\begin{aligned}\|(A-I)u\|_G&=\min_{q\in L_{m-1}}\frac{\|(A-I)q(A)v_1^{new}\|_G}{\|q(A)v_1^{new}\|_G}=\min_{q\in L_{m-1}}\frac{\|(A-I)q(A)\eta T^kv_1\|_G}{\|q(A)\eta T^kv_1\|_G}\\[1mm]&=\min_{q\in L_{m-1}}\frac{\left\|(A-I)q(A)T^k\gamma_1x_1+\sum_{i=2}^n(A-I)q(A)T^k\gamma_ix_i\right\|_G}{\left\|\sum_{i=1}^nq(A)T^k\gamma_ix_i\right\|_G}\\[1mm]&=\min_{q\in L_{m-1}}\frac{\left\|\sum_{i=2}^n(\lambda_i-1)q(\lambda_i)\varphi_i^k\gamma_ix_i\right\|_G}{\left\|\sum_{i=1}^nq(\lambda_i)\varphi_i^k\gamma_ix_i\right\|_G},\end{aligned} \qquad (3.6)$$
where we used the facts that λ_1 = 1 and Ax_i = λ_ix_i, Tx_i = φ_ix_i, i = 1, 2, ⋯, n. According to (3.2) and (3.5), the numerator of (3.6) satisfies

$$\begin{aligned}\left\|\sum_{i=2}^n(\lambda_i-1)q(\lambda_i)\varphi_i^k\gamma_ix_i\right\|_G&\le\sqrt{\max_{1\le i\le n}d_i}\cdot\left\|\sum_{i=2}^n(\lambda_i-1)q(\lambda_i)\varphi_i^k\gamma_ix_i\right\|_2\\[1mm]&\le\sqrt{\max_{1\le i\le n}d_i}\cdot\sum_{i=2}^n|\lambda_i-1|\cdot|\varphi_i|^k\cdot|\gamma_i|\cdot|q(\lambda_i)|\\[1mm]&\le\sqrt{\max_{1\le i\le n}d_i}\cdot\sum_{i=2}^n\left[\alpha^{m_1-1}(\alpha-\mu+1)\right]^k\cdot|\lambda_i-1|\cdot|\gamma_i|\cdot|q(\lambda_i)|.\end{aligned} \qquad (3.7)$$
For the denominator of (3.6), we have

$$\left\|\sum_{i=1}^nq(\lambda_i)\varphi_i^k\gamma_ix_i\right\|_G^2\ge\min_{1\le i\le n}d_i\cdot\left\|\sum_{i=1}^nq(\lambda_i)\varphi_i^k\gamma_ix_i\right\|_2^2\ge\min_{1\le i\le n}d_i\cdot\sigma_{\min}^2(S)\cdot\sum_{i=1}^n|\varphi_i|^{2k}\cdot|\gamma_i|^2\cdot|q(\lambda_i)|^2. \qquad (3.8)$$
Combining (3.5), (3.7) and (3.8) with (3.6), we have

$$\begin{aligned}\|(A-I)u\|_G&\le\min_{q\in L_{m-1}}\frac{\sqrt{\max_{1\le i\le n}d_i}\cdot\sum_{i=2}^n\left[\alpha^{m_1-1}(\alpha-\mu+1)\right]^k\cdot|\lambda_i-1|\cdot|\gamma_i|\cdot|q(\lambda_i)|}{\sqrt{\min_{1\le i\le n}d_i}\cdot\sigma_{\min}(S)\cdot\sqrt{\sum_{i=1}^n|\varphi_i|^{2k}\cdot|\gamma_i|^2\cdot|q(\lambda_i)|^2}}\\[1mm]&\le\frac{1}{\sigma_{\min}(S)}\cdot\sqrt{\frac{\max_{1\le i\le n}d_i}{\min_{1\le i\le n}d_i}}\cdot\min_{q\in L_{m-1}}\frac{\sum_{i=2}^n\left[\alpha^{m_1-1}(\alpha-\mu+1)\right]^k\cdot|\lambda_i-1|\cdot|\gamma_i|\cdot|q(\lambda_i)|}{(2-\mu)^k\cdot|\gamma_1|\cdot|q(\lambda_1)|}\\[1mm]&\le\frac{1}{\sigma_{\min}(S)}\cdot\sqrt{\frac{\max_{1\le i\le n}d_i}{\min_{1\le i\le n}d_i}}\cdot\left(\frac{\alpha^{m_1-1}(\alpha-\mu+1)}{2-\mu}\right)^k\cdot\min_{q\in L_{m-1}}\sum_{i=2}^n\frac{|\lambda_i-1|\,|\gamma_i|}{|\gamma_1|}\cdot\frac{|q(\lambda_i)|}{|q(\lambda_1)|},\end{aligned}$$

where $\frac{\alpha^{m_1-1}(\alpha-\mu+1)}{2-\mu}<1$. Let p(λ) = q(λ)/q(1), so that p(1) = 1; then we get

$$\|(A-I)u\|_G\le\frac{\xi\cdot\zeta}{\sigma_{\min}(S)}\cdot\left(\frac{\alpha^{m_1-1}(\alpha-\mu+1)}{2-\mu}\right)^k\cdot\min_{p\in L_{m-1},\,p(\lambda_1)=1}\ \max_{\lambda\in\sigma(A)\setminus\{\lambda_1\}}|p(\lambda)|.$$
In this section, we test the effectiveness of the GArnoldi-PET method and compare it with the PET method (Algorithm 1) [6], the Power-Arnoldi algorithm (denoted PA) [8] and the adaptive GArnoldi method (denoted A-Arnoldi) [13] in terms of the number of matrix-vector products (Mv) and the computing time in seconds (CPU). All numerical results were obtained with MATLAB 2018b on the Windows 10 (64-bit) operating system with a 1.7 GHz Intel(R) Core(TM) i5 CPU and 4.00 GB of RAM.
In Table 1, we list the characteristics of the test matrices, including the matrix size (n), the number of nonzero elements (nnz), the number of dangling nodes (numd) and the density (den), defined as den = nnz/(n×n) × 100, expressed as a percentage. For example, for the wb-cs-stanford matrix, den = 36,854/9,914² × 100 ≈ 0.375×10⁻¹.
| Name | n | nnz | numd | den (%) |
| --- | --- | --- | --- | --- |
| wb-cs-stanford | 9,914 | 36,854 | 2,861 | 0.375×10⁻¹ |
| web-Stanford | 281,903 | 2,312,497 | 172 | 0.291×10⁻² |
| wikipedia-20051105 | 1,634,989 | 19,753,078 | 72,556 | 0.739×10⁻³ |
For a fair comparison, all algorithms use the same initial guess x^(0) = v = e/n with e = [1, 1, ⋯, 1]^T. The tolerance is tol = 10⁻⁸. The damping factor α takes the values 0.990, 0.993, 0.995 and 0.997, and the parameter β = α − 0.1. In the Power-Arnoldi method, the thick restarted Arnoldi procedure, with p approximate eigenpairs, is run two times per cycle; similarly, in the GArnoldi-PET method the adaptive GArnoldi procedure is run two times per cycle. In addition, to quantify the efficiency of the GArnoldi-PET method, we define

$$\mathrm{speedup}=\frac{\mathrm{CPU}_{\mathrm{PA}}-\mathrm{CPU}_{\mathrm{GArnoldi\text{-}PET}}}{\mathrm{CPU}_{\mathrm{PA}}}\times100\%.$$
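For instance, for the wb-cs-stanford matrix with α = 0.997 (Table 2), speedup = (0.1473 − 0.1038)/0.1473 × 100% ≈ 29.53%.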
Example 1. The first test matrix is the wb-cs-stanford matrix, which contains 9,914 pages, 36,854 links and 2,861 dangling nodes. It is available from https://sparse.tamu.edu/Gleich/wb-cs-stanford. In this example, we set the parameters m = 5, p = 3, maxit = 6 and m_1 = 40. Numerical results of the PET method, the adaptive GArnoldi method, the Power-Arnoldi algorithm and the GArnoldi-PET algorithm are reported in Table 2. Figure 1 plots the convergence history of the four methods for different values of α.
| α | Metric | PET | A-Arnoldi | PA | GArnoldi-PET |
| --- | --- | --- | --- | --- | --- |
| 0.990 | Mv | 712 | 290 | 169 | 158 |
| | CPU (s) | 0.1805 | 0.1589 | 0.0727 | 0.0692 |
| | speedup | | | | 4.81% |
| 0.993 | Mv | 960 | 350 | 238 | 194 |
| | CPU (s) | 0.2045 | 0.1862 | 0.0961 | 0.0801 |
| | speedup | | | | 16.65% |
| 0.995 | Mv | 1253 | 400 | 305 | 211 |
| | CPU (s) | 0.2916 | 0.1965 | 0.1325 | 0.0945 |
| | speedup | | | | 28.68% |
| 0.997 | Mv | 1804 | 530 | 362 | 255 |
| | CPU (s) | 0.3515 | 0.2778 | 0.1473 | 0.1038 |
| | speedup | | | | 29.53% |
From Table 2, we can see that the Power-Arnoldi algorithm works better than the PET method and the adaptive GArnoldi method in terms of the number of matrix-vector products and the computing time. However, the GArnoldi-PET algorithm performs the best. For example, when α=0.997, the Power-Arnoldi algorithm needs 0.1473 seconds to reach the desired accuracy, while the GArnoldi-PET algorithm only uses 0.1038 seconds, and the speedup is 29.53%.
From Figure 1, it is easy to see that the GArnoldi-PET algorithm converges faster than the PET method and the Power-Arnoldi algorithm, even though its iteration counts are slightly higher than those of the adaptive GArnoldi method. Clearly, the iteration count alone does not tell the whole story.
Example 2. The second test matrix is the web-Stanford matrix, which contains 281,903 nodes, 2,312,497 links and 172 dangling nodes. It is available from https://sparse.tamu.edu/SNAP/web-Stanford. In this example, we choose the parameters m = 5, p = 3, maxit = 12 and m_1 = 35. Numerical results of the PET method, the adaptive GArnoldi method, the Power-Arnoldi algorithm and the GArnoldi-PET algorithm are given in Table 3. Figure 2 depicts the convergence of the four methods for different values of α.
| α | Metric | PET | A-Arnoldi | PA | GArnoldi-PET |
| --- | --- | --- | --- | --- | --- |
| 0.990 | Mv | 717 | 715 | 303 | 273 |
| | CPU (s) | 11.4552 | 24.1209 | 7.4254 | 6.5659 |
| | speedup | | | | 11.58% |
| 0.993 | Mv | 966 | 905 | 395 | 349 |
| | CPU (s) | 14.6271 | 29.3734 | 9.0469 | 8.4525 |
| | speedup | | | | 6.57% |
| 0.995 | Mv | 1281 | 1185 | 464 | 397 |
| | CPU (s) | 19.6235 | 38.7536 | 11.6585 | 9.6601 |
| | speedup | | | | 17.14% |
| 0.997 | Mv | 1937 | 1795 | 601 | 481 |
| | CPU (s) | 29.5376 | 58.5102 | 13.7626 | 10.7224 |
| | speedup | | | | 22.09% |
From Table 3, we observe that the GArnoldi-PET algorithm outperforms the other three methods in terms of both the number of matrix-vector products and the computing time. The speedup relative to the Power-Arnoldi algorithm is only 6.57% when α = 0.993; however, as α increases, e.g., α = 0.997, the Power-Arnoldi algorithm needs 13.7626 seconds to reach the desired accuracy while the GArnoldi-PET algorithm takes only 10.7224 seconds, a speedup of 22.09%.
Figure 2 shows that the GArnoldi-PET algorithm converges faster than the PET method and the Power-Arnoldi algorithm. When α is close to one, e.g., α = 0.995 and α = 0.997, the iteration counts of the GArnoldi-PET algorithm are lower than those of the adaptive GArnoldi method. This suggests that our new algorithm has considerable potential.
Example 3. The last test matrix is the wikipedia-20051105 matrix, which contains 1,634,989 nodes, 19,753,078 links and 72,556 dangling nodes. It is available from https://sparse.tamu.edu/Gleich/wikipedia-20051105. In this example, we set the parameters m = 5, p = 3, maxit = 8 and m_1 = 50. Numerical results of the PET method, the adaptive GArnoldi method, the Power-Arnoldi algorithm and the GArnoldi-PET algorithm are listed in Table 4. Figure 3 shows the convergence curves of the four methods for different values of α.
| α | Metric | PET | A-Arnoldi | PA | GArnoldi-PET |
| --- | --- | --- | --- | --- | --- |
| 0.990 | Mv | 555 | 380 | 124 | 104 |
| | CPU (s) | 110.2113 | 113.5793 | 30.1254 | 25.1583 |
| | speedup | | | | 16.49% |
| 0.993 | Mv | 793 | 475 | 162 | 121 |
| | CPU (s) | 156.2311 | 139.5610 | 39.0841 | 29.0992 |
| | speedup | | | | 25.55% |
| 0.995 | Mv | 1111 | 595 | 171 | 141 |
| | CPU (s) | 221.5114 | 177.1960 | 43.4769 | 33.0091 |
| | speedup | | | | 24.08% |
| 0.997 | Mv | 1849 | 830 | 239 | 172 |
| | CPU (s) | 356.8225 | 247.7716 | 56.5771 | 41.9286 |
| | speedup | | | | 25.89% |
From Table 4, we again see that the GArnoldi-PET algorithm improves considerably on the PET method, the adaptive GArnoldi method and the Power-Arnoldi algorithm in terms of both the number of matrix-vector products and the computing time. For a large damping factor such as α = 0.997, the Power-Arnoldi algorithm takes 56.5771 seconds to reach the desired accuracy, while the GArnoldi-PET algorithm takes 41.9286 seconds to achieve the same accuracy, a speedup of 25.89%.
From Figure 3, we again find that the GArnoldi-PET algorithm converges faster than the PET method, the adaptive GArnoldi method and the Power-Arnoldi algorithm; for every value of α, its iteration counts are the lowest.
In this paper, by combining the PET method with the adaptive GArnoldi method, we have proposed a new method, called the GArnoldi-PET method, for accelerating the computation of PageRank problems. Its construction and theoretical analysis are given in Section 3. Numerical results in Section 4 show that the proposed method is quite efficient and outperforms the existing methods, especially when the damping factor is close to 1. Several questions deserve further study, for example, the optimal choice of the parameters, or the use of preconditioning strategies such as those in [18,19,20] for the GArnoldi method.
This research is supported by Science Challenge Project (TZ2016002-TZZT2019-B1.4), the Key Fund Project of Sichuan Provincial Department of Education (17za0003) and the Applied Basic Research Project of Sichuan Province (2020YJ0007). We would like to thank the anonymous referees for their valuable comments and suggestions.
All authors declare that there is no conflict of interest in this paper.
[1] L. Page, S. Brin, R. Motwani, T. Winograd, The PageRank citation ranking: Bringing order to the web, Technical report, Computer Science Department, Stanford University, Stanford, CA, 1999.
[2] S. Kamvar, T. Haveliwala, C. Manning, G. H. Golub, Exploiting the block structure of the web for computing PageRank, Technical Report SCCM-03-02, Stanford University, 2003.
[3] G. H. Golub, C. F. Van Loan, Matrix Computations, 3rd ed., The Johns Hopkins University Press, Baltimore, London, 1996.
[4] D. Gleich, A. Gray, C. Greif, T. Lau, An inner-outer iteration for computing PageRank, SIAM J. Sci. Comput., 32 (2010), 349-371. doi: 10.1137/080727397
[5] Z. L. Tian, Y. Liu, Y. Zhang, Z. Y. Liu, M. Y. Tian, The general inner-outer iteration method based on regular splittings for the PageRank problem, Appl. Math. Comput., 356 (2019), 479-501.
[6] X. Y. Tan, A new extrapolation method for PageRank computations, J. Comput. Appl. Math., 313 (2017), 383-392. doi: 10.1016/j.cam.2016.08.034
[7] G. H. Golub, C. Greif, An Arnoldi-type algorithm for computing PageRank, BIT, 46 (2006), 759-771. doi: 10.1007/s10543-006-0091-y
[8] G. Wu, Y. Wei, A Power-Arnoldi algorithm for computing PageRank, Numer. Linear Algebra Appl., 14 (2007), 521-546. doi: 10.1002/nla.531
[9] R. Morgan, M. Zheng, A harmonic restarted Arnoldi algorithm for calculating eigenvalues and determining multiplicity, Linear Algebra Appl., 415 (2006), 96-113. doi: 10.1016/j.laa.2005.07.024
[10] Q. Y. Hu, C. Wen, T. Z. Huang, Z. L. Shen, X. M. Gu, A variant of the Power-Arnoldi algorithm for computing PageRank, J. Comput. Appl. Math., 381 (2021), 113034. doi: 10.1016/j.cam.2020.113034
[11] A. Essai, Weighted FOM and GMRES for solving nonsymmetric linear systems, Numer. Algorithms, 18 (1998), 277-292. doi: 10.1023/A:1019177600806
[12] H. S. Najafi, H. Ghazvini, Weighted restarting method in the weighted Arnoldi algorithm for computing the eigenvalues of a nonsymmetric matrix, Appl. Math. Comput., 175 (2006), 1276-1287.
[13] J. F. Yin, G. J. Yin, M. Ng, On adaptively accelerated Arnoldi method for computing PageRank, Numer. Linear Algebra Appl., 19 (2012), 73-85. doi: 10.1002/nla.789
[14] C. Wen, Q. Y. Hu, G. J. Yin, X. M. Gu, Z. L. Shen, An adaptive Power-GArnoldi algorithm for computing PageRank, J. Comput. Appl. Math., 386 (2021), 113209. doi: 10.1016/j.cam.2020.113209
[15] Z. X. Jia, Refined iterative algorithms based on Arnoldi's process for large unsymmetric eigenproblems, Linear Algebra Appl., 259 (1997), 1-23. doi: 10.1016/S0024-3795(96)00238-8
[16] A. Langville, C. Meyer, Google's PageRank and Beyond: The Science of Search Engine Rankings, Princeton University Press, 2006.
[17] T. Haveliwala, S. Kamvar, The second eigenvalue of the Google matrix, in: Proceedings of the Twelfth International World Wide Web Conference, 2003.
[18] F. Tudisco, C. Di Fiore, A preconditioning approach to the PageRank computation problem, Linear Algebra Appl., 435 (2011), 2222-2246.
[19] S. Cipolla, C. Di Fiore, F. Tudisco, Euler-Richardson method preconditioned by weakly stochastic matrix algebras: A potential contribution to PageRank computation, Electron. J. Linear Algebra, 32 (2017), 254-272. doi: 10.13001/1081-3810.3343
[20] G. Wu, Y. C. Wang, X. Q. Jin, A preconditioned and shifted GMRES algorithm for the PageRank problem with multiple damping factors, SIAM J. Sci. Comput., 34 (2012), 2558-2575. doi: 10.1137/110834585