
On maximum residual block Kaczmarz method for solving large consistent linear systems

  • In this paper, we propose two block variants of the Kaczmarz method for solving large-scale consistent linear equations $Ax=b$. The first method, named the maximum residual block Kaczmarz (denoted as MRBK) method, first partitions the coefficient matrix and then, in each block iteration, selects the block of the residual vector with the largest norm so that it is eliminated first. Simultaneously, in order to avoid the pseudo-inverse calculation of the MRBK method during block iteration and to improve the convergence speed, we further propose the maximum residual average block Kaczmarz method. This method completes each iteration by projecting the current solution vector onto each row of the selected row subset of the matrix $A$ and averaging the projections with different extrapolation step sizes. We analyze and prove the deterministic convergence of both methods. Numerical experiments validate our theory and show that our proposed methods are superior to some other block Kaczmarz methods.

    Citation: Wen-Ning Sun, Mei Qin. On maximum residual block Kaczmarz method for solving large consistent linear systems[J]. AIMS Mathematics, 2024, 9(12): 33843-33860. doi: 10.3934/math.20241614




    Consider the following large-scale system of linear equations:

    $$Ax = b, \quad \text{with } A \in \mathbb{R}^{m \times n} \text{ and } b \in \mathbb{R}^{m}, \tag{1.1}$$

    which researchers often solve with iterative methods [1,2]. A simple and effective choice is the Kaczmarz method [3], which is extensively utilized across various large-scale computing fields, including computed tomography [4,5,6,7,8], image reconstruction [9,10,11], signal processing [11,12], distributed computing [13,14], etc. As a row-action method, the classic Kaczmarz method iteratively updates the solution vector by selecting the working rows of the coefficient matrix in cyclic order and orthogonally projecting the current iterate onto the hyperplane determined by the selected row. To be more specific, let $A^{(i)}$ represent the $i$th row of $A$ and $b^{(i)}$ the $i$th entry of $b$. Given an initial approximation $x_0$, the Kaczmarz method can be expressed as follows:

    $$x_{k+1} = x_k + \frac{b^{(i_k)} - A^{(i_k)} x_k}{\|A^{(i_k)}\|_2^2}\,(A^{(i_k)})^T,$$

    where $(\cdot)^T$ and $\|\cdot\|_2$ denote the transpose and the Euclidean norm of a vector or matrix, respectively, and the working row index is $i_k = \mathrm{mod}(k, m) + 1$.
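
    As an illustration, a minimal Python/NumPy sketch of one cyclic Kaczmarz sweep might read as follows; the function name, the sweep count, and the absence of a convergence test are our simplifications, not part of the original method statement.

```python
import numpy as np

def kaczmarz(A, b, x0, num_sweeps=50):
    """Classic cyclic Kaczmarz: one orthogonal projection per row, rows taken in order."""
    m, _ = A.shape
    x = x0.astype(float).copy()
    for _ in range(num_sweeps):
        for i in range(m):                      # i_k = mod(k, m) + 1, written 0-based
            a_i = A[i, :]
            x = x + (b[i] - a_i @ x) / (a_i @ a_i) * a_i
    return x
```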

    The theory of the Kaczmarz method has seen significant development since its inception. Initially, Kaczmarz [3] demonstrated the convergence of this method. Subsequently, Galántai [15] provided an upper bound on its convergence rate, while Knight conducted an error analysis of it under finite-precision arithmetic in [16]. In recent years, Bai and Liu [17] established a new convergence theorem for the Kaczmarz method using the block Meany inequality. For further research literature, please refer to [18,19].

    When the scale of the coefficient matrix is very large, the efficiency of the classic Kaczmarz method decreases significantly because it selects the rows in a fixed cyclic order to approximate the solution of the system. In 2009, Strohmer and Vershynin [20] introduced the randomized Kaczmarz (RK) method, with an expected exponential rate of convergence, for solving overdetermined consistent linear systems, reigniting interest in Kaczmarz-type methods. They refined the row selection strategy by selecting the working row with probability $\Pr(\mathrm{row} = i_k) = \|A^{(i_k)}\|_2^2 / \|A\|_F^2$, resulting in a substantial acceleration of the convergence rate. Subsequently, Bai and Wu [21] proposed the greedy randomized Kaczmarz (GRK) method, which introduces a greedy probability criterion so that larger components (in modulus) of the residual vector are selected, and hence eliminated, first in each iteration, thus accelerating the convergence rate. It is worth noting that Ansorge [22] proposed a maximal residual Kaczmarz (MRK) method, which Popa analyzed in [23]. Similar to the working row obtained by the GRK method, this method selects the working row index $i_k$ so that the $i_k$th component of the residual has the largest absolute value among all components, i.e., $i_k = \arg\max_{1 \le i \le m} |b^{(i)} - A^{(i)} x_k|$. Based on the GRK method, Bai and Wu [24] proposed a more efficient method, named the relaxed greedy randomized Kaczmarz (RGRK) method, by introducing a relaxation factor. Zhang [25] developed a new greedy Kaczmarz method, and Bai and Wu proved a sharper convergence upper bound for the randomized Kaczmarz method in [26]. For further study of the randomized Kaczmarz method and its variants, see [27,28,29,30,31].
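
    For concreteness, the two row-selection rules just described can be sketched in Python/NumPy as follows; the function names and the random generator are our own scaffolding, not part of the cited methods.

```python
import numpy as np

rng = np.random.default_rng(0)

def rk_index(A):
    """RK rule: sample row i_k with probability ||A^(i)||_2^2 / ||A||_F^2."""
    row_norms_sq = np.sum(A * A, axis=1)
    return rng.choice(A.shape[0], p=row_norms_sq / row_norms_sq.sum())

def mrk_index(A, b, x):
    """MRK rule: pick the row whose residual entry has the largest absolute value."""
    return int(np.argmax(np.abs(b - A @ x)))
```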

    For the iterative solution of large linear systems, a natural idea for further accelerating the Kaczmarz method is to replace single-row iteration with block iteration, which leads to the block Kaczmarz methods. Bai and Liu proved the convergence of the block Kaczmarz method in [17]. Needell et al. pointed out in [32] that, provided the matrix admits a good row paving, a block strategy depending on the spectral properties of the matrix can be used; they proposed the first block Kaczmarz method with (expected) linear convergence for solving overdetermined least squares problems, which projects the current iterate onto the solution space of the selected constrained subset at each iteration step. To be specific, if the subset $J_k$ is selected at the $k$th iteration and $A_{J_k}^{\dagger}$ denotes the Moore-Penrose pseudo-inverse of $A_{J_k}$, the iteration for $x_k$ can be expressed as

    $$x_{k+1} = x_k + A_{J_k}^{\dagger}(b_{J_k} - A_{J_k} x_k),$$

    with $A_{J_k} = A(J_k, :)$ and $b_{J_k} = b(J_k)$. Needell et al. then proposed a randomized block Kaczmarz (RBK) method for solving least squares problems in [33]. As a natural follow-up to the RBK method, Liu and Gu [34] proposed the greedy randomized block Kaczmarz (GRBK) method for solving consistent linear systems. To avoid the high computational cost of pseudo-inverse calculations during block iterations, Necoara [35] proposed the randomized average block Kaczmarz (RABK) method, which projects the current iterate onto each row of the selected submatrix and averages the projections with different extrapolated step sizes. Subsequently, based on the idea of the average block method, Miao and Wu [36] extended the GRBK method to the greedy randomized average block Kaczmarz (GRABK) method, while Li et al. introduced the greedy average block Kaczmarz (GABK) method in [37]. Recently, Xiao et al. [38] combined the greedy strategy with the average block method to develop a fast greedy randomized block Kaczmarz method. For more research on the block Kaczmarz method, refer to [39,40,41,42,43].
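
    To make the block projection concrete, the following Python/NumPy sketch (our notation, not the authors' implementation) performs one such step, computing $A_{J_k}^{\dagger}(b_{J_k} - A_{J_k} x_k)$ as a minimum-norm least-squares solve rather than forming the pseudo-inverse explicitly.

```python
import numpy as np

def block_kaczmarz_step(A, b, x, J):
    """One block Kaczmarz step: x <- x + A_J^+ (b_J - A_J x) for a row index set J."""
    A_J = A[J, :]
    r_J = b[J] - A_J @ x
    # lstsq returns the minimum-norm least-squares solution, i.e. A_J^+ r_J
    y, *_ = np.linalg.lstsq(A_J, r_J, rcond=None)
    return x + y
```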

    In this paper, combining the row selection strategy of the MRK method [22] with the row partitioning strategy of the RBK method [33], we construct the maximum residual block Kaczmarz (denoted as MRBK) method, which is designed to prioritize the elimination of the largest block of the residual vector $r_k$ in every iteration. Simultaneously, in order to avoid the pseudo-inverse calculation of the MRBK method during iteration, we further develop the maximum residual average block Kaczmarz (denoted as MRABK) method, which completes each iteration by projecting the current solution $x_k$ onto each row of $A_{V_{i_k}}$ and averaging the projections with different extrapolation step sizes. We give a deterministic convergence analysis of both methods. Numerical experiments show that the MRABK method outperforms the MRBK method, while both methods are more efficient than the GRK, MRK, RBK, and GRBK methods.

    The structure of this paper is outlined as follows. We introduce the maximum residual block Kaczmarz method and prove its convergence in Section 2. In Section 3, we establish the maximum residual average block Kaczmarz method and analyze its convergence. Section 4 features numerical experiments that validate the effectiveness of the proposed methods. Lastly, Section 5 ends the paper with conclusions.

    Some basic assumptions: For a real matrix $A \in \mathbb{R}^{m \times n}$, $\|A\|_2$, $\|A\|_F$, and $A^{\dagger}$ signify the Euclidean norm, the Frobenius norm, and the Moore-Penrose pseudo-inverse of the matrix $A$, respectively. We call $A$ a standardized matrix if each of its rows has a Euclidean norm of 1, meaning $\|A^{(i)}\|_2 = 1$ for all $i = 1, 2, \ldots, m$. Similarly, for a given vector $u$, $\|u\|_2$ also represents its Euclidean norm. The notation $\sigma_{\min}(A)$ and $\sigma_{\max}(A)$ is employed for the smallest nonzero and the largest singular values of the matrix $A$, respectively. Additionally, let us define the set $[m] = \{1, 2, \ldots, m\}$, where $m$ is an arbitrary positive integer. We consider the collection $V = \{V_1, V_2, \ldots, V_t\}$ to be a partition of $[m]$ such that the index sets $V_i$ (for $i = 1, 2, \ldots, t$) satisfy $V_i \cap V_j = \emptyset$ for $i \ne j$ and $\bigcup_{i=1}^{t} V_i = [m]$. Furthermore, given a specified row index set $V_i$, we denote by $A_{V_i}$ the row submatrix of $A$ corresponding to $V_i$ and by $b_{V_i}$ the corresponding subvector of $b$. The identity matrix of appropriate dimensions is denoted by $I$. Define the randomized partition of $[m]$ as $V = \{V_1, V_2, \ldots, V_t\}$ with

    $$V_i = \left\{\pi(k) : k = \lfloor (i-1)m/t \rfloor + 1, \lfloor (i-1)m/t \rfloor + 2, \ldots, \lfloor im/t \rfloor\right\}, \tag{1.2}$$

    where $i = 1, 2, \ldots, t$ and $\pi$ denotes a random permutation of $[m]$; every row partition used elsewhere in this paper is assumed to be of the form (1.2).
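
    A possible realization of the randomized partition (1.2) is sketched below in Python/NumPy, assuming $\pi$ is a uniformly random permutation of $[m]$; the function and variable names are ours.

```python
import numpy as np

def randomized_partition(m, t, rng=None):
    """Split a random permutation of {0, ..., m-1} into t consecutive index blocks."""
    rng = np.random.default_rng() if rng is None else rng
    perm = rng.permutation(m)
    bounds = [(i * m) // t for i in range(t + 1)]   # block i gets positions bounds[i]..bounds[i+1]-1
    return [perm[bounds[i]:bounds[i + 1]] for i in range(t)]
```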

    In this section, drawing inspiration from the ideas behind the MRK [22], RBK [33], and GRBK [34] methods, we are going to construct the maximum residual block Kaczmarz (MRBK) method and analyze its convergence property.

    There are typically two approaches to accelerating the Kaczmarz method: the first focuses on selecting the working rows more effectively so as to achieve faster convergence in each iteration, while the second replaces single-row iteration with row block iteration. Building upon both approaches, we propose the MRBK method, stated below as Method 1. First, we partition the rows of the matrix $A$ to obtain the row block division $V$ of $A$, i.e., $\{A_{V_1}, A_{V_2}, \ldots, A_{V_t}\}$. Next, we denote by $r_k^{(i)}$ the $i$th block component of the residual vector $r_k$, so that $r_k^{(i)} = b_{V_i} - A_{V_i} x_k$ for $i = 1, 2, \ldots, t$. We select the working row block $V_{i_k}$ according to $i_k = \arg\max_{1 \le i \le t} \|b_{V_i} - A_{V_i} x_k\|_2^2$, ensuring that the largest block component of the residual vector $r_k$ is eliminated first in each iteration, which significantly improves the convergence rate.

    Next, we analyze the differences between the MRBK method and the other three methods, and the improvements it brings:

    (1) The MRBK method versus the MRK method: The MRBK method accelerates the MRK method naturally by utilizing row block iterations instead of single-row iterations.

    (2) The MRBK method versus the RBK and GRBK methods: The GRBK method improves on the RBK method by introducing a new greedy probability criterion to randomly select the row block index, which ensures that row blocks with large residuals are prioritized for elimination, thus accelerating the RBK method. However, the GRBK method must construct an index set of candidate row blocks and then select among them according to a probability in each iteration. In contrast, our proposed MRBK method selects the row block with the largest residual directly, enhancing the iteration efficiency. In fact, in the spirit of the "greedy" idea, our method can be regarded as "extremely greedy" in its selection of the row block index.

    Method 1 The MRBK Method
    Input: $A$, $b$, $\ell$, $x_0$.
    Output: $x_\ell$
    1: Let $\{V_1, V_2, \ldots, V_t\}$ be a partition of $[m]$
    2: for $k = 0, 1, \ldots, \ell - 1$ do
    3: Select $i_k = \arg\max_{1 \le i \le t} \|b_{V_i} - A_{V_i} x_k\|_2^2$
    4: Set $x_{k+1} = x_k + A_{V_{i_k}}^{\dagger}(b_{V_{i_k}} - A_{V_{i_k}} x_k)$
    5: end for
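
    For illustration, a minimal Python/NumPy sketch of Method 1 for a dense array $A$ could look as follows; the residual-based stopping rule and the use of a least-squares solve in place of the explicit pseudo-inverse are our additions.

```python
import numpy as np

def mrbk(A, b, blocks, x0, max_iter=1000, tol=1e-6):
    """Maximum residual block Kaczmarz (Method 1); lstsq plays the role of the pseudo-inverse."""
    x = x0.astype(float).copy()
    for _ in range(max_iter):
        # Step 3: pick the block with the largest squared residual norm.
        res_norms = [np.linalg.norm(b[V] - A[V, :] @ x) ** 2 for V in blocks]
        V = blocks[int(np.argmax(res_norms))]
        # Step 4: project x onto the solution set of the selected block.
        y, *_ = np.linalg.lstsq(A[V, :], b[V] - A[V, :] @ x, rcond=None)
        x = x + y
        if np.linalg.norm(b - A @ x) <= tol * np.linalg.norm(b):
            break
    return x
```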

    Definition 2.1. [33] (Row paving) A row paving $(t, \alpha, \beta)$ of an $m \times n$ matrix $A$ is defined as a partition $V = \{V_1, V_2, \ldots, V_t\}$ of the rows such that, for each $V_i \in V$, the following inequalities hold:

    $$\alpha \le \sigma_{\min}^2(A_{V_i}) \quad \text{and} \quad \sigma_{\max}^2(A_{V_i}) \le \beta,$$

    where t is referred to as the size of the paving, while α and β are known as the lower and upper paving bounds, respectively.
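    For a given partition, the paving bounds can be estimated numerically as the extreme squared singular values over the row blocks; the helper below is an illustrative sketch of ours, not code from [33].

```python
import numpy as np

def paving_bounds(A, blocks):
    """Return (alpha, beta): the extreme squared singular values over all row blocks."""
    alpha, beta = np.inf, 0.0
    for V in blocks:
        s = np.linalg.svd(A[V, :], compute_uv=False)
        s_nonzero = s[s > 1e-12 * s[0]]         # singular values come back in descending order
        alpha = min(alpha, s_nonzero[-1] ** 2)  # smallest nonzero singular value squared
        beta = max(beta, s[0] ** 2)             # largest singular value squared
    return alpha, beta
```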

    For the MRBK method, Theorem 2.1 is given to illustrate its convergence.

    Theorem 2.1. Assume that the linear system (1.1) is consistent and that, for a fixed partition $V = \{V_1, V_2, \ldots, V_t\}$ of $[m]$, $\sigma_{\max}^2(A_{V_i}) \le \beta$ holds for each $V_i \in V$. Then, starting from any initial vector $x_0 \in \mathcal{R}(A^T)$, the iteration sequence $\{x_k\}_{k=0}^{\infty}$ generated by the MRBK method converges to the unique least-norm solution $x_\star = A^{\dagger} b$. Moreover, for any $k \ge 0$, we have

    $$\|x_1 - x_\star\|_2^2 \le \left(1 - \frac{\sigma_{\min}^2(A)}{\beta t}\right)\|x_0 - x_\star\|_2^2 \tag{2.1}$$

    and

    $$\|x_{k+1} - x_\star\|_2^2 \le \left(1 - \frac{\sigma_{\min}^2(A)}{\beta(t-1)}\right)^{k}\left(1 - \frac{\sigma_{\min}^2(A)}{\beta t}\right)\|x_0 - x_\star\|_2^2. \tag{2.2}$$

    Proof. From the definition of the MRBK method, for a partition $V = \{V_1, V_2, \ldots, V_t\}$, $k = 0, 1, 2, \ldots$, and $i_k \in \{1, 2, \ldots, t\}$, we have

    $$x_{k+1} - x_\star = x_k - x_\star + A_{V_{i_k}}^{\dagger}(b_{V_{i_k}} - A_{V_{i_k}} x_k).$$

    Since $A_{V_{i_k}} x_\star = b_{V_{i_k}}$, it holds that

    $$x_{k+1} - x_\star = x_k - x_\star - A_{V_{i_k}}^{\dagger} A_{V_{i_k}}(x_k - x_\star).$$

    Since $A_{V_{i_k}}^{\dagger} A_{V_{i_k}}$ is an orthogonal projector, we can derive the following relation using the Pythagorean theorem:

    $$\|x_{k+1} - x_\star\|_2^2 = \|x_k - x_\star\|_2^2 - \|A_{V_{i_k}}^{\dagger} A_{V_{i_k}}(x_k - x_\star)\|_2^2. \tag{2.3}$$

    We note that

    $$\|A_{V_{i_k}}^{\dagger} A_{V_{i_k}}(x_k - x_\star)\|_2^2 \ge \sigma_{\min}^2(A_{V_{i_k}}^{\dagger})\,\|A_{V_{i_k}}(x_k - x_\star)\|_2^2 = \frac{1}{\sigma_{\max}^2(A_{V_{i_k}})}\|A_{V_{i_k}}(x_k - x_\star)\|_2^2 \ge \frac{1}{\beta}\|A_{V_{i_k}}(x_k - x_\star)\|_2^2 = \frac{1}{\beta}\|b_{V_{i_k}} - A_{V_{i_k}} x_k\|_2^2 = \frac{1}{\beta}\max_{1 \le i \le t}\|b_{V_i} - A_{V_i} x_k\|_2^2. \tag{2.4}$$

    Furthermore, from Method 1, we have

    $$\begin{aligned}
    b_{V_{i_k}} - A_{V_{i_k}} x_{k+1} &= b_{V_{i_k}} - A_{V_{i_k}}\big(x_k + A_{V_{i_k}}^{\dagger}(b_{V_{i_k}} - A_{V_{i_k}} x_k)\big) \\
    &= b_{V_{i_k}} - A_{V_{i_k}} x_k - A_{V_{i_k}} A_{V_{i_k}}^{\dagger}(b_{V_{i_k}} - A_{V_{i_k}} x_k) \\
    &= b_{V_{i_k}} - A_{V_{i_k}} x_k - A_{V_{i_k}} A_{V_{i_k}}^{\dagger} b_{V_{i_k}} + A_{V_{i_k}} A_{V_{i_k}}^{\dagger} A_{V_{i_k}} x_k \\
    &= b_{V_{i_k}} - A_{V_{i_k}} A_{V_{i_k}}^{\dagger} b_{V_{i_k}} - A_{V_{i_k}} x_k + A_{V_{i_k}} x_k \\
    &= A_{V_{i_k}} x_\star - A_{V_{i_k}} A_{V_{i_k}}^{\dagger} A_{V_{i_k}} x_\star = A_{V_{i_k}} x_\star - A_{V_{i_k}} x_\star = 0.
    \end{aligned}$$

    Thus, since the residual of the block selected at the previous step vanishes, we obtain, for $k = 1, 2, \ldots$, that

    $$\|b - A x_k\|_2^2 = \sum_{V_i \in V \setminus \{V_{i_{k-1}}\}} \|b_{V_i} - A_{V_i} x_k\|_2^2 \le (t - 1)\max_{1 \le i \le t}\|b_{V_i} - A_{V_i} x_k\|_2^2,$$

    and hence

    $$\max_{1 \le i \le t}\|b_{V_i} - A_{V_i} x_k\|_2^2 \ge \frac{1}{t - 1}\|b - A x_k\|_2^2. \tag{2.5}$$

    Combining (2.3), (2.4), and (2.5), for $k = 1, 2, \ldots$, we finally have

    $$\begin{aligned}
    \|x_{k+1} - x_\star\|_2^2 &\le \|x_k - x_\star\|_2^2 - \frac{1}{\beta}\,\frac{\|b - A x_k\|_2^2}{t - 1} = \|x_k - x_\star\|_2^2 - \frac{1}{\beta}\,\frac{\|A(x_k - x_\star)\|_2^2}{t - 1} \\
    &\le \|x_k - x_\star\|_2^2 - \frac{\sigma_{\min}^2(A)}{\beta(t - 1)}\|x_k - x_\star\|_2^2 = \left(1 - \frac{\sigma_{\min}^2(A)}{\beta(t - 1)}\right)\|x_k - x_\star\|_2^2,
    \end{aligned}$$

    where the second inequality uses the fact that $x_k - x_\star \in \mathcal{R}(A^T)$, so that $\|A(x_k - x_\star)\|_2^2 \ge \sigma_{\min}^2(A)\|x_k - x_\star\|_2^2$.

    In particular, when $k = 0$, we also obtain

    $$\|b - A x_0\|_2^2 \le t\max_{1 \le i \le t}\|b_{V_i} - A_{V_i} x_0\|_2^2.$$

    Thus, we have

    $$\|x_1 - x_\star\|_2^2 \le \left(1 - \frac{\sigma_{\min}^2(A)}{\beta t}\right)\|x_0 - x_\star\|_2^2.$$

    Subsequently, we conduct a more in-depth analysis of the convergence factors associated with the MRBK, RBK, and GRBK methods, denoted by $\rho_{\mathrm{MRBK}}$, $\rho_{\mathrm{RBK}}$, and $\rho_{\mathrm{GRBK}}$, respectively. Based on Theorem 2.1, we derive

    $$\rho_{\mathrm{MRBK}} = 1 - \frac{\sigma_{\min}^2(A)}{\beta(t - 1)}, \tag{2.6}$$

    and from Theorem 3.1 in [34], we can get

    $$\rho_{\mathrm{GRBK}} = 1 - \frac{\zeta}{2}\left(\frac{\|A\|_F^2}{\|A\|_F^2 + \zeta} + 1\right)\frac{\sigma_{\min}^2(A)}{\beta\|A\|_F^2}, \tag{2.7}$$

    where $\zeta = \min_{V_i \in V} \|A_{V_i}\|_F^2$.

    Assuming that the matrix $A$ is a standardized matrix, we can obtain from Theorem 2.1 in [33] that

    $$\rho_{\mathrm{RBK}} = 1 - \frac{\sigma_{\min}^2(A)}{\beta m}.$$

    We observe that $\rho_{\mathrm{MRBK}} < \rho_{\mathrm{RBK}}$ as long as $t - 1 < m$. In order to compare $\rho_{\mathrm{MRBK}}$ and $\rho_{\mathrm{GRBK}}$, we rewrite $\rho_{\mathrm{MRBK}}$ as

    $$\rho_{\mathrm{MRBK}} = 1 - \frac{\sigma_{\min}^2(A)}{\beta(t - 1)} = 1 - \frac{\|A\|_F^2}{t - 1}\,\frac{\sigma_{\min}^2(A)}{\beta\|A\|_F^2}. \tag{2.8}$$

    For the row paving $\{A_{V_1}, A_{V_2}, \ldots, A_{V_t}\}$ of the standardized matrix $A$, we assume that every block $V_i$ has the same cardinality, namely $\lfloor m/t \rfloor$, where $\lfloor \cdot \rfloor$ means to round down to an integer. Thus

    $$\frac{\|A\|_F^2}{t - 1} = \frac{m}{t - 1}, \tag{2.9}$$

    and

    $$\frac{\zeta}{2}\left(\frac{\|A\|_F^2}{\|A\|_F^2 + \zeta} + 1\right) = \frac{\lfloor m/t \rfloor}{2}\left(\frac{t}{t + 1} + 1\right) \le \frac{m}{2t}\left(\frac{t}{t + 1} + 1\right) = \frac{1}{2}\left(\frac{m}{t + 1} + \frac{m}{t}\right). \tag{2.10}$$

    Substituting (2.9) into (2.8) and (2.10) into (2.7), and noting that $\frac{m}{t - 1} > \frac{1}{2}\left(\frac{m}{t + 1} + \frac{m}{t}\right)$, it can be concluded that $\rho_{\mathrm{MRBK}} < \rho_{\mathrm{GRBK}}$.

    Remark 2.1. Based on the preceding analysis, it is evident that when the dimensions of the matrix $A$ are large, the convergence factors of the MRBK and GRBK methods are very close, indicating that their iteration counts are similarly aligned. However, owing to the lower computational cost of selecting the working row block $A_{V_{i_k}}$ in the MRBK method, it exhibits a faster convergence rate in practical applications. The numerical experiments in Section 4 verify this inference.

    In this section, we consider a further improvement of the MRBK method. In step 4 of Method 1, each update of $x_k$ requires applying the pseudo-inverse of $A_{V_{i_k}}$ to a vector, which is computationally expensive when the size of the matrix $A$ is very large. We develop the maximum residual average block Kaczmarz (MRABK) method by projecting $x_k$ onto each row of $A_{V_{i_k}}$ and averaging the projections with different extrapolation step sizes, thereby avoiding the computation of the pseudo-inverse and greatly reducing the computational cost of each iteration, as shown in Method 2.

    Method 2 The MRABK Method
    Input: $A$, $b$, $\ell$, $\omega \in (0, 2)$, and $x_0$.
    Output: $x_\ell$
    1: Let $\{V_1, V_2, \ldots, V_t\}$ be a partition of $[m]$
    2: for $k = 0, 1, \ldots, \ell - 1$ do
    3: Select $i_k = \arg\max_{1 \le i \le t} \|b_{V_i} - A_{V_i} x_k\|_2^2$
    4: Compute $\alpha_k = \omega \dfrac{\|b_{V_{i_k}} - A_{V_{i_k}} x_k\|_2^2\,\|A_{V_{i_k}}\|_F^2}{\|A_{V_{i_k}}^T(b_{V_{i_k}} - A_{V_{i_k}} x_k)\|_2^2}$
    5: Set $x_{k+1} = x_k + \alpha_k \dfrac{A_{V_{i_k}}^T(b_{V_{i_k}} - A_{V_{i_k}} x_k)}{\|A_{V_{i_k}}\|_F^2}$
    6: end for
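
    For illustration, a minimal Python/NumPy sketch of Method 2 for a dense array $A$ could look as follows; the stopping rule and the helper names are our additions.

```python
import numpy as np

def mrabk(A, b, blocks, x0, omega=1.0, max_iter=1000, tol=1e-6):
    """Maximum residual average block Kaczmarz (Method 2): pseudo-inverse-free update."""
    x = x0.astype(float).copy()
    fro_sq = [np.linalg.norm(A[V, :], 'fro') ** 2 for V in blocks]
    for _ in range(max_iter):
        if np.linalg.norm(b - A @ x) <= tol * np.linalg.norm(b):
            break
        # Step 3: block with the largest squared residual norm.
        res = [b[V] - A[V, :] @ x for V in blocks]
        ik = int(np.argmax([r @ r for r in res]))
        V, r = blocks[ik], res[ik]
        g = A[V, :].T @ r                                   # A_{V_ik}^T (b_{V_ik} - A_{V_ik} x_k)
        # Steps 4 and 5: extrapolated average projection, no pseudo-inverse needed.
        alpha = omega * (r @ r) * fro_sq[ik] / (g @ g)
        x = x + alpha * g / fro_sq[ik]
    return x
```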

    For the MRABK method, Theorem 3.1 is given to illustrate its convergence.

    Theorem 3.1. Assume that the linear system (1.1) is consistent, that $\omega \in (0, 2)$, and that, for a fixed partition $V = \{V_1, V_2, \ldots, V_t\}$ of $[m]$, $\sigma_{\max}^2(A_{V_i}) \le \beta$ holds for each $V_i \in V$. Then, starting from any initial vector $x_0 \in \mathcal{R}(A^T)$, the iteration sequence $\{x_k\}_{k=0}^{\infty}$ generated by the MRABK method converges to the unique least-norm solution $x_\star = A^{\dagger} b$. Moreover, for any $k \ge 0$, we have

    $$\|x_1 - x_\star\|_2^2 \le \left(1 - \frac{(2\omega - \omega^2)\sigma_{\min}^2(A)}{\beta t}\right)\|x_0 - x_\star\|_2^2 \tag{3.1}$$

    and

    $$\|x_{k+1} - x_\star\|_2^2 \le \left(1 - \frac{(2\omega - \omega^2)\sigma_{\min}^2(A)}{\beta(t - 1)}\right)^{k}\left(1 - \frac{(2\omega - \omega^2)\sigma_{\min}^2(A)}{\beta t}\right)\|x_0 - x_\star\|_2^2. \tag{3.2}$$

    Proof. From step 5 of Method 2 and $A_{V_{i_k}} x_\star = b_{V_{i_k}}$, we can get

    $$x_{k+1} - x_\star = x_k - x_\star + \alpha_k \frac{A_{V_{i_k}}^T(b_{V_{i_k}} - A_{V_{i_k}} x_k)}{\|A_{V_{i_k}}\|_F^2} = x_k - x_\star - \alpha_k \frac{A_{V_{i_k}}^T A_{V_{i_k}}(x_k - x_\star)}{\|A_{V_{i_k}}\|_F^2} = \left(I - \alpha_k \frac{A_{V_{i_k}}^T A_{V_{i_k}}}{\|A_{V_{i_k}}\|_F^2}\right)(x_k - x_\star).$$

    By squaring the Euclidean norm on both sides of the above equation, it holds that

    $$\|x_{k+1} - x_\star\|_2^2 = \left\|\left(I - \alpha_k \frac{A_{V_{i_k}}^T A_{V_{i_k}}}{\|A_{V_{i_k}}\|_F^2}\right)(x_k - x_\star)\right\|_2^2 = \|x_k - x_\star\|_2^2 - 2\alpha_k \frac{\|A_{V_{i_k}}(x_k - x_\star)\|_2^2}{\|A_{V_{i_k}}\|_F^2} + \alpha_k^2 \frac{\|A_{V_{i_k}}^T A_{V_{i_k}}(x_k - x_\star)\|_2^2}{\|A_{V_{i_k}}\|_F^4}.$$

    Substituting $\alpha_k$ into this equality, we have

    $$\begin{aligned}
    \|x_{k+1} - x_\star\|_2^2 &= \|x_k - x_\star\|_2^2 - (2\omega - \omega^2)\frac{\|b_{V_{i_k}} - A_{V_{i_k}} x_k\|_2^4}{\|A_{V_{i_k}}^T(b_{V_{i_k}} - A_{V_{i_k}} x_k)\|_2^2} \\
    &\le \|x_k - x_\star\|_2^2 - (2\omega - \omega^2)\frac{\|b_{V_{i_k}} - A_{V_{i_k}} x_k\|_2^4}{\sigma_{\max}^2(A_{V_{i_k}})\|b_{V_{i_k}} - A_{V_{i_k}} x_k\|_2^2} \\
    &\le \|x_k - x_\star\|_2^2 - (2\omega - \omega^2)\frac{\|b_{V_{i_k}} - A_{V_{i_k}} x_k\|_2^2}{\beta} \\
    &= \|x_k - x_\star\|_2^2 - \frac{2\omega - \omega^2}{\beta}\max_{1 \le i \le t}\|b_{V_i} - A_{V_i} x_k\|_2^2.
    \end{aligned}$$

    We see that the first of these inequalities is true since $2\omega - \omega^2 > 0$ for $\omega \in (0, 2)$, and

    $$\|A_{V_{i_k}}^T(b_{V_{i_k}} - A_{V_{i_k}} x_k)\|_2^2 \le \sigma_{\max}^2(A_{V_{i_k}})\|b_{V_{i_k}} - A_{V_{i_k}} x_k\|_2^2.$$

    From (2.5) in the proof of Theorem 2.1, we can obtain, for $k = 1, 2, \ldots$, that

    $$\|x_{k+1} - x_\star\|_2^2 \le \|x_k - x_\star\|_2^2 - \frac{(2\omega - \omega^2)\|b - A x_k\|_2^2}{\beta(t - 1)} \le \|x_k - x_\star\|_2^2 - \frac{(2\omega - \omega^2)\sigma_{\min}^2(A)\|x_k - x_\star\|_2^2}{\beta(t - 1)} = \left(1 - \frac{(2\omega - \omega^2)\sigma_{\min}^2(A)}{\beta(t - 1)}\right)\|x_k - x_\star\|_2^2.$$

    In particular, when $k = 0$, we have

    $$\|x_1 - x_\star\|_2^2 \le \left(1 - \frac{(2\omega - \omega^2)\sigma_{\min}^2(A)}{\beta t}\right)\|x_0 - x_\star\|_2^2.$$

    From (3.2), we can obtain

    $$\rho_{\mathrm{MRABK}} = 1 - \frac{(2\omega - \omega^2)\sigma_{\min}^2(A)}{\beta(t - 1)}, \tag{3.3}$$

    and it is not difficult to see that $\rho_{\mathrm{MRABK}}$ attains its minimum value $1 - \frac{\sigma_{\min}^2(A)}{\beta(t - 1)}$ when $\omega = 1$, in which case $\rho_{\mathrm{MRABK}} = \rho_{\mathrm{MRBK}}$. However, as can be seen from steps 4 and 5 of Method 2, the MRABK method adopts the average block strategy, which avoids the pseudo-inverse calculation required in step 4 of the MRBK method when updating $x_k$. Therefore, although the two methods share the same convergence factor for $\omega = 1$, each MRABK iteration is cheaper, so the MRABK method may converge faster than the MRBK method in terms of computing time. The numerical results in Section 4 confirm this conclusion.

    In this section, the efficiency of the MRBK and MRABK methods is verified through numerical experiments. We test and compare our two proposed methods with the GRK, MRK, RBK, RABK [35], and GRBK methods. In each iteration of the RBK, GRBK, and MRBK methods, we use CGLS [44] to solve the block least-squares subproblems. Additionally, we use the same randomized row partition $\{A_{V_1}, A_{V_2}, \ldots, A_{V_t}\}$ defined by (1.2) for the RBK, RABK, GRBK, MRBK, and MRABK methods. For the number of blocks, [36] shows that $\lceil\|A\|_2^2\rceil$ is a good choice, so we uniformly set $t = \lceil\|A\|_2^2\rceil$, where $\lceil\cdot\rceil$ means to round up to an integer. Specifically, in the MRABK method, $\omega$ is set to 1 so that the convergence factor of MRABK is minimized; similarly, $\omega$ is set to 1 in the RABK method. The effectiveness of the aforementioned methods is assessed by the number of iteration steps (referred to as "IT") and the computing time in seconds (referred to as "CPU"). Both IT and CPU indicate the arithmetic means of the iteration steps and CPU times over 20 runs of each method. To illustrate the effectiveness of our proposed methods, we also compute the speed-up values of the MRBK method in comparison to both the MRK and GRBK methods, as well as the speed-up value of the MRABK method relative to the MRBK method. The definitions of these speed-up values are provided below:

    $$\mathrm{SU1} = \frac{\text{CPU of MRK}}{\text{CPU of MRBK}}, \qquad \mathrm{SU2} = \frac{\text{CPU of GRBK}}{\text{CPU of MRBK}}, \qquad \mathrm{SU3} = \frac{\text{CPU of MRBK}}{\text{CPU of MRABK}}.$$

    All experiments are conducted on a personal computer equipped with MATLAB (version R2022a), featuring an AMD Ryzen 7 8845H w/ Radeon(TM) 780M Graphics CPU clocked at 3.80 GHz, alongside 32.00 GB of RAM and running on the Windows 11 operating system.

    For consistent linear systems (1.1), the coefficient matrix is either generated by the MATLAB function sprandn or taken from the SuiteSparse Matrix Collection [45]. For each tested coefficient matrix $A$, any zero rows are removed and the rows of $A$ are normalized so that $A$ is a standardized matrix. The right-hand side is set to $b = A x^{*}$, where the vector $x^{*}$ is generated using the MATLAB function randn. All computations start from the initial zero vector $x_0 = 0$ and stop once the relative solution error (RSE) at the current iteration satisfies $\mathrm{RSE} < 10^{-6}$ or once IT exceeds the limit of 200,000, where

    $$\mathrm{RSE} = \frac{\|x_k - x_\star\|_2^2}{\|x_\star\|_2^2},$$

    where the minimum-norm solution $x_\star$ is obtained using the MATLAB function lsqminnorm.

    Define the density of a matrix as

    $$\text{density} = \frac{\text{number of nonzeros of an } m\text{-by-}n \text{ matrix}}{mn}.$$
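
    As an illustration of this setup, the following Python/SciPy sketch builds one such random test problem; scipy.sparse.random plays the role of MATLAB's sprandn here, and the seed, variable names, and the use of svds to estimate $\|A\|_2$ are our own choices.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import svds

rng = np.random.default_rng(0)
m, n = 6000, 1000
A = sp.random(m, n, density=0.01, format='csr', data_rvs=rng.standard_normal)
A = A[np.flatnonzero(A.getnnz(axis=1)), :]                  # remove any zero rows
row_norms = np.sqrt(np.asarray(A.multiply(A).sum(axis=1)).ravel())
A = sp.diags(1.0 / row_norms) @ A                           # standardize: every row has unit 2-norm
x_true = rng.standard_normal(n)
b = A @ x_true                                              # consistent right-hand side
sigma_max = svds(A, k=1, return_singular_vectors=False)[0]
t = int(np.ceil(sigma_max ** 2))                            # number of blocks t = ceil(||A||_2^2)
```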

    For the first kind of sparse matrix $A$, we set its density parameter to 0.01, i.e., A = sprandn(m, n, 0.01). Tables 1 and 2 list the IT and CPU for the GRK, MRK, RBK, RABK, GRBK, MRBK, and MRABK methods, together with the speed-up values of several methods.

    Table 1.  Numerical results for m-by-n random matrices A with m=6000 and different n.
    m×n 6000×1000 6000×1500 6000×2000 6000×2500 6000×3000
    $\|A\|_2^2$ 12.29 9.13 7.64 6.63 5.86
    GRK IT 2325.8 5217.6 10418.6 18855.0 34941.8
    CPU 0.5867 1.8596 4.9190 12.7430 28.2539
    MRK IT 2230.0 5115.0 10297.0 18647.0 34598.0
    CPU 0.1538 0.5342 1.4035 5.2019 11.9263
    RBK IT 31.4 41.4 55.6 81.6 113.8
    CPU 0.0339 0.1514 0.1013 0.1851 0.2908
    RABK IT 55.6 69.2 87.8 123.4 177.4
    CPU 0.0118 0.0191 0.0332 0.0540 0.0912
    GRBK IT 25.2 29.8 37.0 52.4 72.8
    CPU 0.0486 0.0773 0.1340 0.2639 0.3927
    MRBK IT 22.0 29.0 36.0 51.0 69.0
    CPU 0.0239 0.0319 0.0590 0.1035 0.1391
    MRABK IT 40.0 50.0 62.0 79.0 104.0
    CPU 0.0085 0.0160 0.0260 0.0410 0.0599
    SU1 6.44 16.75 23.79 50.28 85.74
    SU2 2.04 2.42 2.27 2.55 2.82
    SU3 2.82 1.99 2.27 2.52 2.32

    Table 2.  Numerical results for m-by-n random matrices A with n=6000 and different m.
    m×n 1000×6000 1500×6000 2000×6000 2500×6000 3000×6000
    $\|A\|_2^2$ 2.01 2.24 2.50 2.69 2.93
    GRK IT 4057.8 8544.0 14833.6 27124.0 43610.2
    CPU 2.4823 6.4250 13.0586 27.2911 48.0314
    MRK IT 4051.0 8324.0 14761.0 27112.0 42903.0
    CPU 1.5622 3.6929 7.3926 14.9410 24.4223
    RBK IT 19.2 43.4 58.2 81.2 113.8
    CPU 0.0396 0.0954 0.1552 0.2670 0.2885
    RABK IT 26.2 59.4 58.6 102.2 140.0
    CPU 0.0120 0.0322 0.0452 0.0713 0.1010
    GRBK IT 10.0 22.0 29.4 40.6 55.6
    CPU 0.0303 0.0779 0.1245 0.2487 0.3010
    MRBK IT 10.0 21.0 28.0 40.0 55.0
    CPU 0.0207 0.0482 0.0748 0.1277 0.1571
    MRABK IT 19.0 32.0 40.0 55.0 73.0
    CPU 0.0104 0.0198 0.0281 0.0441 0.0638
    SU1 75.35 76.58 98.88 116.96 155.46
    SU2 1.46 1.62 1.67 1.95 1.92
    SU3 1.99 2.44 2.66 2.89 2.46


    From Table 1, we can see that when m=6000 and n=1000, 1500, 2000, 2500, or 3000, both the MRBK method and the MRABK method proposed here outperform the other methods in terms of CPU time. Focusing on the MRK, GRBK, MRBK, and MRABK methods, we observe that the MRBK method, as a block-improved version of the MRK method, is significantly more efficient in terms of both IT and CPU time: the SU1 value is at least 6.44 and at most 85.74. We analyzed the convergence factors of the GRBK and MRBK methods in Section 2 and concluded that their IT should be very close when the size of A is large. Table 1 confirms that the IT of these two methods is very close, with the IT of the MRBK method slightly smaller than that of the GRBK method; the SU2 value is at least 2.04 and at most 2.82. The MRABK method has the shortest CPU time among all the methods above, and its SU3 value relative to the MRBK method is at least 1.99 and at most 2.82.

    From Table 2, we can see that when n=6000 and m=1000, 1500, 2000, 2500, or 3000, the MRBK and MRABK methods remain superior to the other methods. Among all the methods above, the MRBK method requires the fewest IT, while the MRABK method has the shortest CPU time. We note that under these conditions the SU1 value is at least 75.35 and at most 155.46. Even though the IT of the MRBK and GRBK methods are almost the same, the CPU time of the MRBK method is better than that of the GRBK method. Compared with the MRBK method, the SU3 value of the MRABK method is at least 1.99 and at most 2.89.

    The second type of coefficient matrices, selected from the SuiteSparse Matrix Collection [45], arise from different applications, such as combinatorial problems and linear programming problems. These matrices possess some special structures and characteristics, such as being thin (m>n) (e.g., Franz9, GL7d12), fat (m<n) (e.g., p6000, lp_80bau3b), or square (m=n) (e.g., Trefethen_700); we list details about them in Table 3, including their size, density, and cond(A). For these matrices, we implement the GRK, MRK, RBK, GRBK, RABK, MRBK, and MRABK methods and list the IT and CPU time for each method in Table 3.

    Table 3.  Numerical results for matrices A from the SuiteSparse Matrix Collection.
    name Franz9 GL7d12 Trefethen_700 p6000 lp_80bau3b
    m×n 19588×4164 8899×1019 700×700 2095×7967 2262×12061
    density 0.12% 0.41% 2.58% 0.12% 0.09%
    cond(A) 1.46e+16 Inf 4.71e+03 6.50e+05 567.23
    $\|A\|_2^2$ 60.80 55.75 2.54 2.35 2.82
    GRK IT 9056.8 2394.0 1555.2 4676.8 15874.6
    CPU 5.5755 0.6003 0.0969 1.2815 6.3763
    MRK IT 2307.0 2492.0 1536.0 4387.0 15924.0
    CPU 1.6704 0.1403 0.0326 0.8933 4.2627
    RBK IT 382.6 722.6 37.2 34.6 77.8
    CPU 0.6476 0.5484 0.0180 0.0292 0.2153
    RABK IT 1351.6 877.8 221.6 35.6 202.0
    CPU 0.1695 0.1576 0.0132 0.0071 0.0467
    GRBK IT 124.8 145.3 10.0 18.0 35.0
    CPU 0.5942 0.2011 0.0055 0.0193 0.1097
    MRBK IT 129.0 121.0 10.0 18.0 34.0
    CPU 0.3076 0.0879 0.0044 0.0148 0.0921
    MRABK IT 545.0 176.0 75.0 26.0 126.0
    CPU 0.1662 0.0264 0.0040 0.0052 0.0301
    SU1 5.43 1.60 7.38 60.55 46.29
    SU2 1.93 2.29 1.25 1.31 1.19
    SU3 1.85 3.33 1.09 2.83 3.06


    From Table 3, we observe that the MRABK method still has the shortest CPU time. When the coefficient matrix is GL7d12 or lp_80bau3b, the MRBK method has the smallest IT; when the coefficient matrix is Trefethen_700 or p6000, its IT is the same as that of the GRBK method. Compared to the MRK method, the MRBK method exhibits a minimal SU1 value of 1.60 and a maximal SU1 value of 60.55. In comparison to the GRBK method, the MRBK method demonstrates a minimal SU2 value of 1.19 and a maximal SU2 value of 2.29. Furthermore, compared to the MRBK method, the MRABK method shows a minimal SU3 value of 1.09 and a maximal SU3 value of 3.33.

    To further assess the performance of the various block Kaczmarz methods, Figures 1 and 2 display the relationship between the RSE and the IT, as well as between the RSE and the CPU time, for the different block Kaczmarz methods. As we can see from Figures 1 and 2, as the IT increases, the RSE of the MRBK and MRABK methods decreases faster than that of the RBK and RABK methods, and the MRBK method shows the fastest decline in RSE among all block Kaczmarz methods. As the CPU time increases, the RSE of the MRABK method decreases the fastest, followed by the RABK and MRBK methods. Both the MRBK and MRABK methods are ahead of the RBK and GRBK methods.

    Figure 1.  (a) RSE versus IT and (b) RSE versus CPU for different block Kaczmarz methods when the coefficient matrix is the 6000×3000 matrix in Table 1.
    Figure 2.  (a) RSE versus IT and (b) RSE versus CPU for different block Kaczmarz methods when the coefficient matrix is the 3000×6000 matrix in Table 2.

    This paper presents two new block Kaczmarz methods (MRBK and MRABK) for solving consistent linear systems. Both methods use a uniform randomized partition of the rows of the matrix $A$ to construct the row submatrices $\{A_{V_1}, A_{V_2}, \ldots, A_{V_t}\}$. In each iteration, the MRBK method updates the solution vector by projecting $x_k$ onto the solution space of $A_{V_{i_k}} x = b_{V_{i_k}}$, where $V_{i_k}$ is selected according to $i_k = \arg\max_{1 \le i \le t} \|b_{V_i} - A_{V_i} x_k\|_2^2$, ensuring that the block with the largest residual is eliminated first and leading to rapid convergence. Building upon the MRBK method, the MRABK method introduces an adaptive step size, eliminating the need to apply the pseudo-inverse of the row submatrix $A_{V_{i_k}}$ of $A$ during the update of $x_k$ and thereby improving the overall computational efficiency. We provide a comprehensive analysis of the convergence theory of both methods and conduct numerical experiments to validate their effectiveness. Both the theoretical analysis and the numerical results demonstrate the superiority of our proposed methods over other Kaczmarz methods, including the GRK, MRK, RBK, and GRBK methods. In addition, some valuable topics deserve further study, such as fully exploiting the structure and properties of $A$, constructing more efficient row partitions of $A$, and finding a better step size for the MRABK method.

    Wen-Ning Sun: Conceptualization, methodology, method design and implementation, numerical experiments and visualization, writing original drafts, reviewing and editing drafts; Mei Qin: Methodology, reviewing, editing drafts. All authors have reviewed and approved the final version of the manuscript prior to its publication.

    The authors declare no competing interests.



    [1] L.-P. Sun, Y.-M. Wei, J.-Y. Zhou, On an iterative method for solving the least squares problem of rank-deficient systems, Int. J. Comput. Math., 92 (2014), 532–541. https://doi.org/10.1080/00207160.2014.900173 doi: 10.1080/00207160.2014.900173
    [2] Z.-Z. Bai, C.-H. Jin, Column-decomposed relaxation methods for the overdetermined systems of linear equations, Int. J. Appl. Math., 13 (2003), 71–82.
    [3] S. Kaczmarz, Angenäherte Auflösung von Systemen linearer Gleichungen, Bull. Int. Acad. Polon. Sci. Lett., 35 (1937), 355–357.
    [4] M. A. Brooks, A survey of algebraic algorithms in computerized tomography, Oshawa: University of Ontario Institute of Technology, 2010.
    [5] G. N. Hounsfield, Computerized transverse axial scanning (tomography): Part 1. Description of system, Brit. J. Radiol., 46 (1973), 1016–1022. https://doi.org/10.1259/0007-1285-46-552-1016 doi: 10.1259/0007-1285-46-552-1016
    [6] G. T. Herman, Fundamentals of computerized tomography: Image reconstruction from projections, 2 Eds., London: Springer, 2009. https://doi.org/10.1007/978-1-84628-723-7
    [7] F. Natterer, The mathematics of computerized tomography, Philadelphia: SIAM Publications, 2001. https://doi.org/10.1137/1.9780898719284
    [8] G. T. Herman, R. Davidi, Image reconstruction from a small number of projections, Inverse Probl., 24 (2008), 045011. https://doi.org/10.1088/0266-5611/24/4/045011 doi: 10.1088/0266-5611/24/4/045011
    [9] R. Gordon, R. Bender, G. T. Herman, Algebraic reconstruction techniques (ART) for three-dimensional electron microscopy and X-ray photography, J. Theor. Biol., 29 (1970), 471–481. https://doi.org/10.1016/0022-5193(70)90109-8 doi: 10.1016/0022-5193(70)90109-8
    [10] G. T. Herman, L. B. Meyer, Algebraic reconstruction techniques can be made computationally efficient (positron emission tomography application), IEEE. Trans. Med. Imaging, 12 (1993), 600–609. https://doi.org/10.1109/42.241889 doi: 10.1109/42.241889
    [11] C. Byrne, A unified treatment of some iterative algorithms in signal processing and image reconstruction, Inverse Probl., 20 (2004), 103. https://doi.org/10.1088/0266-5611/20/1/006 doi: 10.1088/0266-5611/20/1/006
    [12] D. A. Lorenz, S. Wenger, F. Schöpfer, M. Magnor, A sparse Kaczmarz solver and a linearized Bregman method for online compressed sensing, In: 2014 IEEE International conference on image processing, 2014, 1347–1351. https://doi.org/10.1109/ICIP.2014.7025269
    [13] F. Pasqualetti, R. Carli, F. Bullo, Distributed estimation via iterative projections with application to power network monitoring, Automatica, 48 (2012), 747–758. https://doi.org/10.1016/j.automatica.2012.02.025 doi: 10.1016/j.automatica.2012.02.025
    [14] J. M. Elble, N. V. Sahinidis, P. Vouzis, GPU computing with Kaczmarz's and other iterative algorithms for linear systems, Parallel Comput., 36 (2010), 215–231. https://doi.org/10.1016/j.parco.2009.12.003 doi: 10.1016/j.parco.2009.12.003
    [15] A. Galántai, Projectors and projection methods, New York: Springer, 2004. https://doi.org/10.1007/978-1-4419-9180-5
    [16] P. A. Knight, Error analysis of stationary iteration and associated problems, Manchester: University of Manchester, 1993.
    [17] Z.-Z. Bai, X.-G. Liu, On the Meany inequality with applications to convergence analysis of several row-action iteration methods, Numer. Math., 124 (2013), 215–236. https://doi.org/10.1007/s00211-012-0512-6 doi: 10.1007/s00211-012-0512-6
    [18] A. Ma, D. Needell, A. Ramdas, Convergence properties of the randomized extended Gauss-Seidel and Kaczmarz methods, SIAM J. Matrix Anal. Appl., 36 (2015), 1590–1604. https://doi.org/10.1137/15M1014425 doi: 10.1137/15M1014425
    [19] L. Dai, T. B. Schön, On the exponential convergence of the Kaczmarz algorithm, IEEE Signal Process. Lett., 22 (2015), 1571–1574. https://doi.org/10.1109/LSP.2015.2412253 doi: 10.1109/LSP.2015.2412253
    [20] T. Strohmer, R. Vershynin, A randomized Kaczmarz algorithm with exponential convergence, J. Fourier Anal. Appl., 15 (2009), 262–278. https://doi.org/10.1007/s00041-008-9030-4 doi: 10.1007/s00041-008-9030-4
    [21] Z.-Z. Bai, W.-T. Wu, On greedy randomized Kaczmarz method for solving large sparse linear systems, SIAM J. Sci. Comput., 40 (2018), A592–A606. https://doi.org/10.1137/17M1137747 doi: 10.1137/17M1137747
    [22] R. Ansorge, Connections between the Cimmino-method and the Kaczmarz-method for the solution of singular and regular systems of equations, Computing, 33 (1984), 367–375. https://doi.org/10.1007/bf02242280 doi: 10.1007/bf02242280
    [23] C. Popa, Convergence rates for Kaczmarz-type algorithms, Numer. Algor., 79 (2018), 1–17. https://doi.org/10.1007/s11075-017-0425-7 doi: 10.1007/s11075-017-0425-7
    [24] Z.-Z. Bai, W.-T. Wu, On relaxed greedy randomized Kaczmarz methods for solving large sparse linear systems, Appl. Math. Lett., 83 (2018), 21–26. https://doi.org/10.1016/j.aml.2018.03.008 doi: 10.1016/j.aml.2018.03.008
    [25] J.-J. Zhang, A new greedy Kaczmarz algorithm for the solution of very large linear systems, Appl. Math. Lett., 91 (2019), 207–212. https://doi.org/10.1016/j.aml.2018.12.022 doi: 10.1016/j.aml.2018.12.022
    [26] Z.-Z. Bai, W.-T. Wu, On convergence rate of the randomized Kaczmarz method, Linear Algebra Appl., 553 (2018), 252–269. https://doi.org/10.1016/j.laa.2018.05.009 doi: 10.1016/j.laa.2018.05.009
    [27] Y. Jiang, G. Wu, L. Jiang, A semi-randomized Kaczmarz method with simple random sampling for large-scale linear systems, Adv. Comput. Math., 49 (2023), 20. https://doi.org/10.1007/s10444-023-10018-2 doi: 10.1007/s10444-023-10018-2
    [28] Y. Zeng, D. Han, Y. Su, J. Xie, Randomized Kaczmarz method with adaptive stepsizes for inconsistent linear systems, Numer. Algor., 94 (2023), 1403–1420. https://doi.org/10.1007/s11075-023-01540-x doi: 10.1007/s11075-023-01540-x
    [29] Z.-Z. Bai, W.-T. Wu, Randomized Kaczmarz iteration methods: Algorithmic extensions and convergence theory, Japan J. Indust. Appl. Math., 40 (2023), 1421–1443. https://doi.org/10.1007/s13160-023-00586-7 doi: 10.1007/s13160-023-00586-7
    [30] Z.-Z. Bai, L. Wang, On convergence rates of Kaczmarz-type methods with different selection rules of working rows, Appl. Numer. Math., 186 (2023), 289–319. https://doi.org/10.1016/j.apnum.2023.01.013 doi: 10.1016/j.apnum.2023.01.013
    [31] X.-Z. Wang, M.-L. Che, Y.-M. Wei, Randomized Kaczmarz methods for tensor complementarity problems, Comput. Optim. Appl., 82 (2022), 595–615. https://doi.org/10.1007/s10589-022-00382-y doi: 10.1007/s10589-022-00382-y
    [32] D. Needell, J. A. Tropp, Paved with good intentions: Analysis of a randomized block Kaczmarz method, Linear Algebra Appl., 441 (2014), 199–221. https://doi.org/10.1016/j.laa.2012.12.022 doi: 10.1016/j.laa.2012.12.022
    [33] D. Needell, R. Zhao, A. Zouzias, Randomized block Kaczmarz method with projection for solving least squares, Linear Algebra Appl., 484 (2015), 322–343. https://doi.org/10.1016/j.laa.2015.06.027 doi: 10.1016/j.laa.2015.06.027
    [34] Y. Liu, C.-Q. Gu, On greedy randomized block Kaczmarz method for consistent linear systems, Linear Algebra Appl., 616 (2021), 178–200. https://doi.org/10.1016/j.laa.2021.01.024 doi: 10.1016/j.laa.2021.01.024
    [35] I. Necoara, Faster randomized block Kaczmarz algorithms, SIAM J. Matrix Anal. Appl., 40 (2019), 1425–1452. https://doi.org/10.1137/19M1251643 doi: 10.1137/19M1251643
    [36] C.-Q. Miao, W.-T. Wu, On greedy randomized average block Kaczmarz method for solving large linear systems, J. Comput. Appl. Math., 413 (2022), 114372. https://doi.org/10.1016/j.cam.2022.114372 doi: 10.1016/j.cam.2022.114372
    [37] W. Li, F. Yin, Y.-M. Liao, G.-X. Huang, A greedy average block Kaczmarz method for the large scaled consistent system of linear equations, AIMS Mathematics, 7 (2022), 6792–6806. https://doi.org/10.3934/math.2022378 doi: 10.3934/math.2022378
    [38] A.-Q. Xiao, J.-F. Yin, N. Zheng, On fast greedy block Kaczmarz methods for solving large consistent linear systems, Comput. Appl. Math., 42 (2023), 119. https://doi.org/10.1007/s40314-023-02232-x doi: 10.1007/s40314-023-02232-x
    [39] J. Briskman, D. Needell, Block Kaczmarz method with inequalities, J. Math. Imaging Vis., 52 (2015), 385–396. https://doi.org/10.1007/s10851-014-0539-7 doi: 10.1007/s10851-014-0539-7
    [40] Y. Zhang, H. Li, Block sampling Kaczmarz-Motzkin methods for consistent linear systems, Calcolo, 58 (2021), 39. https://doi.org/10.1007/s10092-021-00429-2 doi: 10.1007/s10092-021-00429-2
    [41] Y. Zhang, H. Li, Randomized block subsampling Kaczmarz-Motzkin method, Linear Algebra Appl., 667 (2023), 133–150. https://doi.org/10.1016/j.laa.2023.03.003 doi: 10.1016/j.laa.2023.03.003
    [42] R.-R. Li, H. Liu, On randomized partial block Kaczmarz method for solving huge linear algebraic systems, Comput. Appl. Math., 41 (2022), 278. https://doi.org/10.1007/s40314-022-01978-0 doi: 10.1007/s40314-022-01978-0
    [43] J.-Q. Chen, Z.-D. Huang, On a fast deterministic block Kaczmarz method for solving large-scale linear systems, Numer. Algor., 89 (2022), 1007–1029. https://doi.org/10.1007/s11075-021-01143-4 doi: 10.1007/s11075-021-01143-4
    [44] Å. Björck, Numerical methods for least squares problems, Philadelphia: SIAM Publications, 1996. https://doi.org/10.1137/1.9781611971484
    [45] S. P. Kolodziej, M. Aznaveh, M. Bullock, J. David, T. A. Davis, M. Henderson, et al., The SuiteSparse matrix collection website interface, J. Open Source Softw., 4 (2019), 1244. https://doi.org/10.21105/joss.01244 doi: 10.21105/joss.01244
  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)