Research article

An online conjugate gradient algorithm for large-scale data analysis in machine learning

  • In recent years, the amount of available data has been growing exponentially, and large-scale data have become ubiquitous. Machine learning is a key to deriving insight from this deluge of data. In this paper, we focus on large-scale data analysis, especially classification data, and propose an online conjugate gradient (CG) descent algorithm. Our algorithm draws on the improved Fletcher-Reeves (IFR) CG method proposed by Jiang and Jian [13], as well as a recent variance-reduction approach for stochastic gradient descent due to Johnson and Zhang [15]. In theory, we prove that the proposed online algorithm achieves a linear convergence rate under the strong Wolfe line search when the objective function is smooth and strongly convex. Comparison results on several benchmark classification datasets demonstrate that our approach is promising for solving large-scale machine learning problems, in terms of both area under the curve (AUC) values and convergence behavior.

    Citation: Wei Xue, Pengcheng Wan, Qiao Li, Ping Zhong, Gaohang Yu, Tao Tao. An online conjugate gradient algorithm for large-scale data analysis in machine learning[J]. AIMS Mathematics, 2021, 6(2): 1515-1537. doi: 10.3934/math.2021092



    The purpose of this paper is to study the global behavior of the following max-type system of difference equations of the second order with four variables and period-two parameters

    $$\begin{cases}x_n=\max\left\{A_n,\frac{z_{n-1}}{y_{n-2}}\right\},\\ y_n=\max\left\{B_n,\frac{w_{n-1}}{x_{n-2}}\right\},\\ z_n=\max\left\{C_n,\frac{x_{n-1}}{w_{n-2}}\right\},\\ w_n=\max\left\{D_n,\frac{y_{n-1}}{z_{n-2}}\right\},\end{cases}\qquad n\in\mathbb{N}_0\equiv\{0,1,2,\ldots\},\tag{1.1}$$

    where $A_n,B_n,C_n,D_n\in\mathbb{R}^+\equiv(0,+\infty)$ are periodic sequences with period 2 and the initial values $x_{-i},y_{-i},z_{-i},w_{-i}\in\mathbb{R}^+$ ($1\le i\le 2$). To do this we will use some methods and ideas which stem from [1,2]. For a more complex variant of the method, see [3]. A solution $\{(x_n,y_n,z_n,w_n)\}_{n=-2}^{+\infty}$ of (1.1) is called an eventually periodic solution with period $T$ if there exists $m\in\mathbb{N}$ such that $(x_n,y_n,z_n,w_n)=(x_{n+T},y_{n+T},z_{n+T},w_{n+T})$ holds for all $n\ge m$.
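    For readers who wish to experiment numerically, the following sketch (ours, not part of the original analysis; the helper names iterate_system and eventually_periodic and the sample parameters are illustrative assumptions) iterates (1.1) with period-two parameters and probes whether the tail of an orbit repeats with a prescribed period.

```python
# A minimal numerical sketch (ours): iterate system (1.1) with period-two
# parameters and probe eventual periodicity of the orbit.

def iterate_system(A, B, C, D, init, N):
    """A, B, C, D are period-two parameter pairs, e.g. A = (A_0, A_1);
    init = [(x_{-2}, x_{-1}), (y_{-2}, y_{-1}), (z_{-2}, z_{-1}), (w_{-2}, w_{-1})]."""
    x, y, z, w = (list(v) for v in init)
    for n in range(N):
        xn = max(A[n % 2], z[-1] / y[-2])   # x_n = max{A_n, z_{n-1}/y_{n-2}}
        yn = max(B[n % 2], w[-1] / x[-2])   # y_n = max{B_n, w_{n-1}/x_{n-2}}
        zn = max(C[n % 2], x[-1] / w[-2])   # z_n = max{C_n, x_{n-1}/w_{n-2}}
        wn = max(D[n % 2], y[-1] / z[-2])   # w_n = max{D_n, y_{n-1}/z_{n-2}}
        x.append(xn); y.append(yn); z.append(zn); w.append(wn)
    return x, y, z, w

def eventually_periodic(seq, T, tail=50, tol=1e-9):
    """Check whether the last `tail` terms of seq repeat with period T."""
    s = seq[-tail:]
    return all(abs(s[i] - s[i + T]) < tol for i in range(len(s) - T))

# Illustrative parameters and initial values (arbitrary choices).
A, B, C, D = (2.0, 1.5), (1.2, 1.8), (0.9, 1.3), (1.1, 1.4)
x, y, z, w = iterate_system(A, B, C, D, init=[(1.0, 2.0)] * 4, N=300)
print(eventually_periodic(x, T=4))
```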

    When $x_n=y_n$, $z_n=w_n$, $A_0=A_1=B_0=B_1=\alpha$ and $C_0=C_1=D_0=D_1=\beta$, (1.1) reduces to the following max-type system of difference equations

    $$\begin{cases}x_n=\max\left\{\alpha,\frac{z_{n-1}}{x_{n-2}}\right\},\\ z_n=\max\left\{\beta,\frac{x_{n-1}}{z_{n-2}}\right\},\end{cases}\qquad n\in\mathbb{N}_0.\tag{1.2}$$

    Fotiades and Papaschinopoulos in [4] investigated the global behavior of (1.2) and showed that every positive solution of (1.2) is eventually periodic.

    When $x_n=z_n$, $y_n=w_n$, $A_n=C_n$ and $B_n=D_n$, (1.1) reduces to the following max-type system of difference equations

    $$\begin{cases}x_n=\max\left\{A_n,\frac{y_{n-1}}{x_{n-2}}\right\},\\ y_n=\max\left\{B_n,\frac{x_{n-1}}{y_{n-2}}\right\},\end{cases}\qquad n\in\mathbb{N}_0.\tag{1.3}$$

    Su et al. in [5] investigated the periodicity of (1.3) and showed that every solution of (1.3) is eventually periodic.

    In 2020, Su et al. [6] studied the global behavior of positive solutions of the following max-type system of difference equations

    $$\begin{cases}x_n=\max\left\{A,\frac{y_{n-t}}{x_{n-s}}\right\},\\ y_n=\max\left\{B,\frac{x_{n-t}}{y_{n-s}}\right\},\end{cases}\qquad n\in\mathbb{N}_0,$$

    where $A,B\in\mathbb{R}^+$.

    In 2015, Yazlik et al. [7] studied the periodicity of positive solutions of the max-type system of difference equations

    $$\begin{cases}x_n=\max\left\{\frac{1}{x_{n-1}},\min\left\{1,\frac{p}{y_{n-1}}\right\}\right\},\\ y_n=\max\left\{\frac{1}{y_{n-1}},\min\left\{1,\frac{p}{x_{n-1}}\right\}\right\},\end{cases}\qquad n\in\mathbb{N}_0,\tag{1.4}$$

    where $p\in\mathbb{R}^+$, and obtained in an elegant way the general solution of (1.4).

    In 2016, Sun and Xi [8], inspired by the research in [5], studied the following more general system

    $$\begin{cases}x_n=\max\left\{\frac{1}{x_{n-m}},\min\left\{1,\frac{p}{y_{n-r}}\right\}\right\},\\ y_n=\max\left\{\frac{1}{y_{n-m}},\min\left\{1,\frac{q}{x_{n-t}}\right\}\right\},\end{cases}\qquad n\in\mathbb{N}_0,\tag{1.5}$$

    where $p,q\in\mathbb{R}^+$, $m,r,t\in\mathbb{N}\equiv\{1,2,\ldots\}$ and the initial conditions $x_{-i},y_{-i}\in\mathbb{R}^+$ ($1\le i\le s$) with $s=\max\{m,r,t\}$, and showed that every positive solution of (1.5) is eventually periodic with period $2m$.

    In [9], Stević studied the boundedness character and global attractivity of the following symmetric max-type system of difference equations

    $$\begin{cases}x_n=\max\left\{B,\frac{y_{n-1}^p}{x_{n-2}^p}\right\},\\ y_n=\max\left\{B,\frac{x_{n-1}^p}{y_{n-2}^p}\right\},\end{cases}\qquad n\in\mathbb{N}_0,$$

    where $B,p\in\mathbb{R}^+$ and the initial conditions $x_{-i},y_{-i}\in\mathbb{R}^+$ ($1\le i\le 2$).

    In 2014, motivated by the results in [9], Stević [10] further studied the behavior of the following max-type system of difference equations

    $$\begin{cases}x_n=\max\left\{B,\frac{y_{n-1}^p}{z_{n-2}^p}\right\},\\ y_n=\max\left\{B,\frac{z_{n-1}^p}{x_{n-2}^p}\right\},\\ z_n=\max\left\{B,\frac{x_{n-1}^p}{y_{n-2}^p}\right\},\end{cases}\qquad n\in\mathbb{N}_0,\tag{1.6}$$

    where $B,p\in\mathbb{R}^+$ and the initial conditions $x_{-i},y_{-i},z_{-i}\in\mathbb{R}^+$ ($1\le i\le 2$), and showed that system (1.6) is permanent when $p\in(0,4)$.

    For many more results on the global behavior, eventual periodicity and boundedness character of positive solutions of max-type difference equations and systems, we refer the reader to [11–30] and the related references therein.

    In this section, we study the global behavior of system (1.1). For any $n\ge -1$, write

    $$\begin{cases}x_{2n}=A_{2n}X_n,\quad y_{2n}=B_{2n}Y_n,\quad z_{2n}=C_{2n}Z_n,\quad w_{2n}=D_{2n}W_n,\\ x_{2n+1}=A_{2n+1}X'_n,\quad y_{2n+1}=B_{2n+1}Y'_n,\quad z_{2n+1}=C_{2n+1}Z'_n,\quad w_{2n+1}=D_{2n+1}W'_n.\end{cases}$$

    Then, (1.1) reduces to the following system

    $$\begin{cases}X_n=\max\left\{1,\frac{C_{2n-1}Z'_{n-1}}{A_{2n}B_{2n}Y_{n-1}}\right\},\\ Y_n=\max\left\{1,\frac{D_{2n-1}W'_{n-1}}{B_{2n}A_{2n}X_{n-1}}\right\},\\ Z'_n=\max\left\{1,\frac{A_{2n}X_n}{C_{2n+1}D_{2n+1}W'_{n-1}}\right\},\\ W'_n=\max\left\{1,\frac{B_{2n}Y_n}{D_{2n+1}C_{2n+1}Z'_{n-1}}\right\},\\ Z_n=\max\left\{1,\frac{A_{2n-1}X'_{n-1}}{C_{2n}D_{2n}W_{n-1}}\right\},\\ W_n=\max\left\{1,\frac{B_{2n-1}Y'_{n-1}}{D_{2n}C_{2n}Z_{n-1}}\right\},\\ X'_n=\max\left\{1,\frac{C_{2n}Z_n}{A_{2n+1}B_{2n+1}Y'_{n-1}}\right\},\\ Y'_n=\max\left\{1,\frac{D_{2n}W_n}{B_{2n+1}A_{2n+1}X'_{n-1}}\right\},\end{cases}\qquad n\in\mathbb{N}_0.\tag{2.1}$$

    From (2.1) we see that it suffices to consider the global behavior of positive solutions of the following system

    $$\begin{cases}u_n=\max\left\{1,\frac{bv_{n-1}}{aAU_{n-1}}\right\},\\ U_n=\max\left\{1,\frac{BV_{n-1}}{aAu_{n-1}}\right\},\\ v_n=\max\left\{1,\frac{au_n}{bBV_{n-1}}\right\},\\ V_n=\max\left\{1,\frac{AU_n}{bBv_{n-1}}\right\},\end{cases}\qquad n\in\mathbb{N}_0,\tag{2.2}$$

    where $a,b,A,B\in\mathbb{R}^+$ and the initial conditions $u_{-1},U_{-1},v_{-1},V_{-1}\in\mathbb{R}^+$. If $(u_n,U_n,v_n,V_n,a,A,b,B)=(X_n,Y_n,Z'_n,W'_n,A_{2n},B_{2n},C_{2n-1},D_{2n-1})$, then (2.2) consists of the first four equations of (2.1). If $(u_n,U_n,v_n,V_n,a,A,b,B)=(Z_n,W_n,X'_n,Y'_n,C_{2n},D_{2n},A_{2n-1},B_{2n-1})$, then (2.2) consists of the last four equations of (2.1). In the following, without loss of generality, we assume $a\le A$ and $b\le B$. Let $\{(u_n,U_n,v_n,V_n)\}_{n=-1}^{+\infty}$ be a positive solution of (2.2).
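    A direct iteration of the reduced system (2.2) is equally easy to sketch numerically. The helper below is ours and purely illustrative; the arguments u1, U1, v1, V1 stand for the initial values $u_{-1},U_{-1},v_{-1},V_{-1}$.

```python
# Sketch (ours): direct iteration of the reduced system (2.2).
def iterate_22(a, A, b, B, u1, U1, v1, V1, N):
    u, U, v, V = [u1], [U1], [v1], [V1]
    for _ in range(N):
        un = max(1.0, b * v[-1] / (a * A * U[-1]))   # u_n = max{1, b v_{n-1}/(aA U_{n-1})}
        Un = max(1.0, B * V[-1] / (a * A * u[-1]))   # U_n = max{1, B V_{n-1}/(aA u_{n-1})}
        vn = max(1.0, a * un / (b * B * V[-1]))      # v_n = max{1, a u_n/(bB V_{n-1})}
        Vn = max(1.0, A * Un / (b * B * v[-1]))      # V_n = max{1, A U_n/(bB v_{n-1})}
        u.append(un); U.append(Un); v.append(vn); V.append(Vn)
    return u, U, v, V
```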

    Proposition 2.1. If $ab<1$, then there exists a solution $\{(u_n,U_n,v_n,V_n)\}_{n=-1}^{+\infty}$ of (2.2) such that $u_n=v_n=1$ for any $n\ge -1$ and $\lim_{n\to\infty}U_n=\lim_{n\to\infty}V_n=+\infty$.

    Proof. Let $u_{-1}=v_{-1}=1$ and $U_{-1}=V_{-1}=\max\left\{\frac{b}{aA},\frac{aA}{B},\frac{a}{bB}\right\}+1$. Then, from (2.2) we have

    $$\begin{cases}u_0=\max\left\{1,\frac{bv_{-1}}{aAU_{-1}}\right\}=1,\\ U_0=\max\left\{1,\frac{BV_{-1}}{aAu_{-1}}\right\}=\frac{BV_{-1}}{aA},\\ v_0=\max\left\{1,\frac{au_0}{bBV_{-1}}\right\}=1,\\ V_0=\max\left\{1,\frac{AU_0}{bBv_{-1}}\right\}=\frac{V_{-1}}{ab},\end{cases}$$

    and

    $$\begin{cases}u_1=\max\left\{1,\frac{bv_0}{aAU_0}\right\}=\max\left\{1,\frac{b}{BV_{-1}}\right\}=1,\\ U_1=\max\left\{1,\frac{BV_0}{aAu_0}\right\}=\max\left\{1,\frac{BV_{-1}}{aA\,ab}\right\}=\frac{BV_{-1}}{aA\,ab},\\ v_1=\max\left\{1,\frac{au_1}{bBV_0}\right\}=\max\left\{1,\frac{a\,ab}{bBV_{-1}}\right\}=1,\\ V_1=\max\left\{1,\frac{AU_1}{bBv_0}\right\}=\max\left\{1,\frac{V_{-1}}{(ab)^2}\right\}=\frac{V_{-1}}{(ab)^2}.\end{cases}$$

    Suppose that for some $k\in\mathbb{N}$, we have

    $$u_k=1,\qquad U_k=\frac{BV_{-1}}{aA(ab)^k},\qquad v_k=1,\qquad V_k=\frac{V_{-1}}{(ab)^{k+1}}.$$

    Then,

    $$\begin{cases}u_{k+1}=\max\left\{1,\frac{bv_k}{aAU_k}\right\}=\max\left\{1,\frac{b(ab)^k}{BV_{-1}}\right\}=1,\\ U_{k+1}=\max\left\{1,\frac{BV_k}{aAu_k}\right\}=\max\left\{1,\frac{BV_{-1}}{aA(ab)^{k+1}}\right\}=\frac{BV_{-1}}{aA(ab)^{k+1}},\\ v_{k+1}=\max\left\{1,\frac{au_{k+1}}{bBV_k}\right\}=\max\left\{1,\frac{a(ab)^{k+1}}{bBV_{-1}}\right\}=1,\\ V_{k+1}=\max\left\{1,\frac{AU_{k+1}}{bBv_k}\right\}=\max\left\{1,\frac{V_{-1}}{(ab)^{k+2}}\right\}=\frac{V_{-1}}{(ab)^{k+2}}.\end{cases}$$

    By mathematical induction, we can obtain the conclusion of Proposition 2.1. The proof is complete.
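    Proposition 2.1 can be illustrated numerically with the initial values used in its proof. The sketch below is ours, reuses the iterate_22 helper from the previous sketch, and the particular parameters (with $ab<1$, $a\le A$, $b\le B$) are arbitrary.

```python
# Sketch (ours): the unbounded solution of Proposition 2.1, reusing iterate_22.
a, A, b, B = 0.5, 2.0, 1.2, 3.0                       # ab = 0.6 < 1, a <= A, b <= B
U1 = V1 = max(b / (a * A), a * A / B, a / (b * B)) + 1.0
u, U, v, V = iterate_22(a, A, b, B, 1.0, U1, 1.0, V1, 30)
print(u[-1], v[-1])           # stays at 1.0, 1.0, as the proposition asserts
print(V[-1] * (a * b) ** 30)  # ~ V_{-1}: the proof gives V_n = V_{-1}/(ab)^{n+1}
```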

    Now, we assume that $ab\ge 1$. Then, from (2.2) it follows that

    $$\begin{cases}u_n=\max\left\{1,\frac{bv_{n-1}}{aAU_{n-1}}\right\},\\ U_n=\max\left\{1,\frac{BV_{n-1}}{aAu_{n-1}}\right\},\\ v_n=\max\left\{1,\frac{a}{bBV_{n-1}},\frac{v_{n-1}}{ABU_{n-1}V_{n-1}}\right\},\\ V_n=\max\left\{1,\frac{A}{bBv_{n-1}},\frac{V_{n-1}}{abu_{n-1}v_{n-1}}\right\},\end{cases}\qquad n\in\mathbb{N}_0.\tag{2.3}$$

    Lemma 2.1. The following statements hold:

    (1) For any $n\in\mathbb{N}_0$,

    $$u_n,\ U_n,\ v_n,\ V_n\in[1,+\infty).\tag{2.4}$$

    (2) If $ab\ge 1$, then for any $k\in\mathbb{N}$ and $n\ge k+2$,

    $$\begin{cases}u_n=\max\left\{1,\frac{b}{aAU_{n-1}},\frac{bv_k}{aA(AB)^{n-k-1}U_{n-1}U_{n-2}V_{n-2}\cdots U_kV_k}\right\},\\ U_n=\max\left\{1,\frac{B}{aAu_{n-1}},\frac{BV_k}{aA(ab)^{n-k-1}u_{n-1}u_{n-2}v_{n-2}\cdots u_kv_k}\right\},\\ v_n=\max\left\{1,\frac{a}{bBV_{n-1}},\frac{v_k}{(AB)^{n-k}U_{n-1}V_{n-1}\cdots U_kV_k}\right\},\\ V_n=\max\left\{1,\frac{A}{bBv_{n-1}},\frac{V_k}{(ab)^{n-k}u_{n-1}v_{n-1}\cdots u_kv_k}\right\}.\end{cases}\tag{2.5}$$

    (3) If $ab\ge 1$, then for any $k\in\mathbb{N}$ and $n\ge k+4$,

    $$\begin{cases}1\le v_n\le v_{n-2},\qquad 1\le V_n\le\frac{A}{a}V_{n-2},\\ 1\le u_n\le\max\left\{1,\frac{b}{B}u_{n-2},\frac{bv_k}{aA(AB)^{n-k-1}}\right\},\\ 1\le U_n\le\max\left\{1,\frac{B}{b}U_{n-2},\frac{BV_k}{aA(ab)^{n-k-1}}\right\}.\end{cases}\tag{2.6}$$

    Proof. (1) It follows from (2.2).

    (2) Since $AB\ge ab\ge 1$, it follows from (2.2) and (2.3) that for any $k\in\mathbb{N}$ and $n\ge k+2$,

    $$\begin{aligned}u_n&=\max\left\{1,\frac{bv_{n-1}}{aAU_{n-1}}\right\}=\max\left\{1,\frac{b}{aAU_{n-1}}\max\left\{1,\frac{a}{bBV_{n-2}},\frac{v_{n-2}}{ABU_{n-2}V_{n-2}}\right\}\right\}\\ &=\max\left\{1,\frac{b}{aAU_{n-1}},\frac{bv_{n-2}}{aA\,AB\,U_{n-1}U_{n-2}V_{n-2}}\right\}\\ &=\max\left\{1,\frac{b}{aAU_{n-1}},\frac{b}{aA\,AB\,U_{n-1}U_{n-2}V_{n-2}}\max\left\{1,\frac{a}{bBV_{n-3}},\frac{v_{n-3}}{ABU_{n-3}V_{n-3}}\right\}\right\}\\ &=\max\left\{1,\frac{b}{aAU_{n-1}},\frac{bv_{n-3}}{aA(AB)^2U_{n-1}U_{n-2}V_{n-2}U_{n-3}V_{n-3}}\right\}\\ &=\cdots=\max\left\{1,\frac{b}{aAU_{n-1}},\frac{bv_k}{aA(AB)^{n-k-1}U_{n-1}U_{n-2}V_{n-2}\cdots U_kV_k}\right\}.\end{aligned}$$

    In a similar way, we can also obtain the other three formulas.

    (3) By (2.5), one has that for any $k\in\mathbb{N}$ and $n\ge k+2$,

    $$u_n\ge\frac{b}{aAU_{n-1}},\qquad U_n\ge\frac{B}{aAu_{n-1}},\qquad v_n\ge\frac{a}{bBV_{n-1}},\qquad V_n\ge\frac{A}{bBv_{n-1}},$$

    which, together with (2.4), implies that for any $n\ge k+4$,

    $$\begin{cases}1\le u_n\le\max\left\{1,\frac{b}{B}u_{n-2},\frac{bv_k}{aA(AB)^{n-k-1}}\right\},\\ 1\le U_n\le\max\left\{1,\frac{B}{b}U_{n-2},\frac{BV_k}{aA(ab)^{n-k-1}}\right\},\\ 1\le v_n\le\max\left\{1,\frac{a}{A}v_{n-2},v_{n-2}\right\}=v_{n-2},\\ 1\le V_n\le\max\left\{1,\frac{A}{a}V_{n-2},V_{n-2}\right\}=\frac{A}{a}V_{n-2}.\end{cases}$$

    The proof is complete.
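    The closed form (2.5) can also be spot-checked against direct iteration of (2.2). The sketch below is ours, reuses iterate_22 from an earlier sketch, and the parameters (with $ab\ge 1$) and the indices $k$, $n$ are arbitrary choices.

```python
# Sketch (ours): compare the first formula of (2.5) with direct iteration of (2.2).
from math import prod

a, A, b, B = 1.5, 2.0, 1.0, 1.4                       # ab = 1.5 >= 1, a <= A, b <= B
u, U, v, V = iterate_22(a, A, b, B, 1.3, 2.1, 1.7, 1.9, 40)

k, n = 3, 20                                          # (2.5) holds for n >= k + 2
# In these lists, index i + 1 stores the term with subscript i (index 0 is subscript -1).
P = U[n] * prod(U[j + 1] * V[j + 1] for j in range(k, n - 1))  # U_{n-1} * prod_{j=k}^{n-2} U_j V_j
rhs = max(1.0,
          b / (a * A * U[n]),
          b * v[k + 1] / (a * A * (A * B) ** (n - k - 1) * P))
print(abs(u[n + 1] - rhs))                            # expected to be ~0
```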

    Proposition 2.2. If $ab=AB=1$, then $\{(u_n,U_n,v_n,V_n)\}_{n=-1}^{+\infty}$ is eventually periodic with period 2.

    Proof. By the assumption (together with $a\le A$ and $b\le B$) we see $a=A$ and $b=B$. By (2.5) we see that for any $k\in\mathbb{N}$ and $n\ge k+2$,

    $$\begin{cases}u_n=\max\left\{1,\frac{b^3}{U_{n-1}},\frac{b^3v_k}{U_{n-1}U_{n-2}V_{n-2}\cdots U_kV_k}\right\},\\ U_n=\max\left\{1,\frac{b^3}{u_{n-1}},\frac{b^3V_k}{u_{n-1}u_{n-2}v_{n-2}\cdots u_kv_k}\right\},\\ v_n=\max\left\{1,\frac{a^3}{V_{n-1}},\frac{v_k}{U_{n-1}V_{n-1}\cdots U_kV_k}\right\},\\ V_n=\max\left\{1,\frac{a^3}{v_{n-1}},\frac{V_k}{u_{n-1}v_{n-1}\cdots u_kv_k}\right\}.\end{cases}\tag{2.7}$$

    (1) If $a=b=1$, then it follows from (2.7) and (2.4) that for any $n\ge k+4$,

    $$\begin{cases}u_n=\max\left\{1,\frac{v_k}{U_{n-1}U_{n-2}V_{n-2}\cdots U_kV_k}\right\}\le\max\left\{1,\frac{v_k}{U_{n-2}U_{n-3}V_{n-3}\cdots U_kV_k}\right\}=u_{n-1},\\ U_n=\max\left\{1,\frac{V_k}{u_{n-1}u_{n-2}v_{n-2}\cdots u_kv_k}\right\}\le U_{n-1},\\ v_n=\max\left\{1,\frac{v_k}{U_{n-1}V_{n-1}\cdots U_kV_k}\right\}\le v_{n-1},\\ V_n=\max\left\{1,\frac{V_k}{u_{n-1}v_{n-1}\cdots u_kv_k}\right\}\le V_{n-1}.\end{cases}\tag{2.8}$$

    We claim that $v_n=1$ for any $n\ge 6$ or $V_n=1$ for any $n\ge 6$. Indeed, if $v_n>1$ for some $n\ge 6$ and $V_m>1$ for some $m\ge 6$, then

    $$v_n=\frac{v_1}{U_{n-1}V_{n-1}\cdots U_1V_1}>1,\qquad V_m=\frac{V_1}{u_{m-1}v_{m-1}\cdots u_1v_1}>1,$$

    which implies

    $$1\ge\frac{v_1}{U_{n-1}V_{n-1}\cdots U_1V_1}\cdot\frac{V_1}{u_{m-1}v_{m-1}\cdots u_1v_1}=V_mv_n>1.$$

    A contradiction.

    If $v_n=1$ for any $n\ge 6$, then by (2.8) we see $u_n=1$ for any $n\ge 10$, which implies $U_n=V_n=V_{10}$.

    If $V_n=1$ for any $n\ge 6$, then by (2.8) we see $U_n=1$ for any $n\ge 10$, which implies $v_n=u_n=v_{10}$.

    Then, $\{(u_n,U_n,v_n,V_n)\}_{n=-1}^{+\infty}$ is eventually periodic with period 2.

    (2) If $a<1<b$, then it follows from (2.7) that for any $n\ge k+4$,

    $$\begin{cases}u_n=\max\left\{1,\frac{b^3}{U_{n-1}},\frac{b^3v_k}{U_{n-1}U_{n-2}V_{n-2}\cdots U_kV_k}\right\},\\ U_n=\max\left\{1,\frac{b^3}{u_{n-1}},\frac{b^3V_k}{u_{n-1}u_{n-2}v_{n-2}\cdots u_kv_k}\right\},\\ v_n=\max\left\{1,\frac{v_k}{U_{n-1}V_{n-1}\cdots U_kV_k}\right\}\le v_{n-1},\\ V_n=\max\left\{1,\frac{V_k}{u_{n-1}v_{n-1}\cdots u_kv_k}\right\}\le V_{n-1}.\end{cases}\tag{2.9}$$

    It is easy to verify that $v_n=1$ for any $n\ge 6$ or $V_n=1$ for any $n\ge 6$.

    If $V_n=v_n=1$ eventually, then by (2.9) we have

    $$\begin{cases}1\ge\frac{v_k}{U_{n-1}V_{n-1}\cdots U_kV_k}\ \text{eventually},\\ 1\ge\frac{V_k}{u_{n-1}v_{n-1}\cdots u_kv_k}\ \text{eventually}.\end{cases}$$

    Since $U_n\ge\frac{b^3}{u_{n-1}}$ and $u_n\ge\frac{b^3}{U_{n-1}}$, we see

    $$\begin{cases}u_n=\max\left\{1,\frac{b^3}{U_{n-1}},\frac{b^3v_k}{U_{n-1}U_{n-2}V_{n-2}\cdots U_kV_k}\right\}=\max\left\{1,\frac{b^3}{U_{n-1}}\right\}\le u_{n-2}\ \text{eventually},\\ U_n=\max\left\{1,\frac{b^3}{u_{n-1}},\frac{b^3V_k}{u_{n-1}u_{n-2}v_{n-2}\cdots u_kv_k}\right\}=\max\left\{1,\frac{b^3}{u_{n-1}}\right\}\le U_{n-2}\ \text{eventually},\end{cases}$$

    which implies

    $$\begin{cases}u_{n-2}\ge u_n=\max\left\{1,\frac{b^3}{U_{n-1}}\right\}\ge\max\left\{1,\frac{b^3}{U_{n-3}}\right\}=u_{n-2}\ \text{eventually},\\ U_{n-2}\ge U_n=\max\left\{1,\frac{b^3}{u_{n-1}}\right\}\ge\max\left\{1,\frac{b^3}{u_{n-3}}\right\}=U_{n-2}\ \text{eventually}.\end{cases}$$

    If $V_n>1=v_n$ eventually, then by (2.9) we have

    $$\begin{cases}1\ge\frac{v_k}{U_{n-1}V_{n-1}\cdots U_kV_k}\ \text{eventually},\\ V_n=\frac{V_k}{u_{n-1}v_{n-1}\cdots u_kv_k}>1\ \text{eventually}.\end{cases}$$

    Thus,

    $$\begin{cases}u_n=\max\left\{1,\frac{b^3}{U_{n-1}},\frac{b^3v_k}{U_{n-1}U_{n-2}V_{n-2}\cdots U_kV_k}\right\}=\max\left\{1,\frac{b^3}{U_{n-1}}\right\}\le u_{n-2}\ \text{eventually},\\ U_n=\max\left\{1,\frac{b^3}{u_{n-1}},\frac{b^3V_k}{u_{n-1}u_{n-2}v_{n-2}\cdots u_kv_k}\right\}=\max\left\{1,\frac{b^3V_k}{u_{n-1}u_{n-2}v_{n-2}\cdots u_kv_k}\right\}\le U_{n-2}\ \text{eventually},\end{cases}$$

    which implies

    $$\begin{cases}u_{n-2}\ge u_n=\max\left\{1,\frac{b^3}{U_{n-1}}\right\}\ge\max\left\{1,\frac{b^3}{U_{n-3}}\right\}=u_{n-2}\ \text{eventually},\\ U_n=1\ \text{eventually, or}\ U_n=b^3V_k\ \text{eventually}.\end{cases}$$

    If $V_n=1<v_n$ eventually, then by (2.9) we have $U_{n-2}=U_n$ eventually and $u_n=u_{n-1}$ eventually. By the above we see that $\{(u_n,U_n,v_n,V_n)\}_{n=-1}^{+\infty}$ is eventually periodic with period 2.

    (3) If $b<1<a$, then for any $k\in\mathbb{N}$ and $n\ge k+2$,

    $$\begin{cases}u_n=\max\left\{1,\frac{b^3v_k}{U_{n-1}U_{n-2}V_{n-2}\cdots U_kV_k}\right\}\le u_{n-1},\\ U_n=\max\left\{1,\frac{b^3V_k}{u_{n-1}u_{n-2}v_{n-2}\cdots u_kv_k}\right\}\le U_{n-1},\\ v_n=\max\left\{1,\frac{a^3}{V_{n-1}},\frac{v_k}{U_{n-1}V_{n-1}\cdots U_kV_k}\right\},\\ V_n=\max\left\{1,\frac{a^3}{v_{n-1}},\frac{V_k}{u_{n-1}v_{n-1}\cdots u_kv_k}\right\}.\end{cases}\tag{2.10}$$

    It is easy to verify that $u_n=1$ for any $n\ge 3$ or $U_n=1$ for any $n\ge 3$.

    If $u_n=U_n=1$ eventually, then

    $$\begin{cases}1\ge\frac{b^3v_k}{U_{n-1}U_{n-2}V_{n-2}\cdots U_kV_k}\ \text{eventually},\\ 1\ge\frac{b^3V_k}{u_{n-1}u_{n-2}v_{n-2}\cdots u_kv_k}\ \text{eventually}.\end{cases}$$

    Thus, by (2.6) we have

    $$\begin{cases}v_{n-2}\ge v_n=\max\left\{1,\frac{a^3}{V_{n-1}},\frac{v_k}{U_{n-1}V_{n-1}\cdots U_kV_k}\right\}=\max\left\{1,\frac{a^3}{V_{n-1}}\right\}\ge v_{n-2}\ \text{eventually},\\ V_{n-2}\ge V_n=\max\left\{1,\frac{a^3}{v_{n-1}},\frac{V_k}{u_{n-1}v_{n-1}\cdots u_kv_k}\right\}=\max\left\{1,\frac{a^3}{v_{n-1}}\right\}\ge V_{n-2}\ \text{eventually}.\end{cases}$$

    If $u_n=1<U_n$ eventually, then

    $$\begin{cases}1\ge\frac{b^3v_k}{U_{n-1}U_{n-2}V_{n-2}\cdots U_kV_k}\ \text{eventually},\\ 1<\frac{b^3V_k}{u_{n-1}u_{n-2}v_{n-2}\cdots u_kv_k}=U_n\ \text{eventually}.\end{cases}$$

    Thus,

    $$\begin{cases}v_{n-2}\ge v_n=\max\left\{1,\frac{a^3}{V_{n-1}},\frac{v_k}{U_{n-1}V_{n-1}\cdots U_kV_k}\right\}=\max\left\{1,\frac{a^3}{V_{n-1}}\right\}\ge v_{n-2}\ \text{eventually},\\ V_n=\max\left\{1,\frac{a^3}{v_{n-1}},\frac{V_k}{u_{n-1}v_{n-1}\cdots u_kv_k}\right\}=\max\left\{1,\frac{V_k}{u_{n-1}v_{n-1}\cdots u_kv_k}\right\}=1\ \text{eventually, or}\ V_k\ \text{eventually}.\end{cases}$$

    If $u_n>1=U_n$ eventually, then we have $V_n=V_{n-2}$ eventually and $v_n=1$ eventually or $v_n=v_k$ eventually.

    By the above we see that $\{(u_n,U_n,v_n,V_n)\}_{n=-1}^{+\infty}$ is eventually periodic with period 2.

    Proposition 2.3. If $ab=1<AB$, then $\{(u_n,U_n,v_n,V_n)\}_{n=-1}^{+\infty}$ is eventually periodic with period 2.

    Proof. Note that $U_n\ge\frac{B}{aAu_{n-1}}$ and $V_n\ge\frac{A}{bBv_{n-1}}$. By (2.5) we see that there exists $N\in\mathbb{N}$ such that for any $n\ge N$,

    $$\begin{cases}u_n=\max\left\{1,\frac{b^2}{AU_{n-1}}\right\}\le u_{n-2},\\ U_n=\max\left\{1,\frac{B}{aAu_{n-1}},\frac{BV_k}{aAu_{n-1}u_{n-2}v_{n-2}\cdots u_kv_k}\right\},\\ v_n=\max\left\{1,\frac{a^2}{BV_{n-1}}\right\}\le v_{n-2},\\ V_n=\max\left\{1,\frac{A}{bBv_{n-1}},\frac{V_k}{u_{n-1}v_{n-1}\cdots u_kv_k}\right\}.\end{cases}\tag{2.11}$$

    It is easy to verify that $u_n=1$ for any $n\ge N+1$ or $v_n=1$ for any $n\ge N+1$.

    If $u_n=v_n=1$ eventually, then by (2.11) we see that $U_n=U_{n-1}$ eventually and $V_n=V_{n-1}$ eventually.

    If $u_{M+2n}>1=v_n$ eventually for some $M\in\mathbb{N}$, then by (2.11) and (2.4) we see that

    $$\begin{cases}u_{M+2n}=\frac{b^2}{AU_{M+2n-1}}>1\ \text{eventually},\\ U_{M+2n+1}=\max\left\{1,\frac{B}{b}U_{M+2n-1},\frac{BV_k}{aAu_{M+2n}u_{M+2n-1}v_{M+2n-1}\cdots u_kv_k}\right\}\ge\frac{B}{b}U_{M+2n-1}\ \text{eventually},\\ v_n=\max\left\{1,\frac{a^2}{BV_{n-1}}\right\}=1\ \text{eventually},\\ V_n=\max\left\{1,\frac{A}{bBv_{n-1}},\frac{V_k}{u_{n-1}v_{n-1}\cdots u_kv_k}\right\}\le V_{n-1}\ \text{eventually}.\end{cases}$$

    By (2.11) we see that $U_n$ is bounded, which implies $B=b$.

    If $U_{M+2n-1}\le\frac{BV_k}{aAu_{M+2n}u_{M+2n-1}v_{M+2n-1}\cdots u_kv_k}$ eventually, then

    $$U_{M+2n+1}=\frac{BV_k}{aAu_{M+2n}u_{M+2n-1}v_{M+2n-1}\cdots u_kv_k}\ge U_{M+2n-1}\ \text{eventually}.$$

    Thus, $U_{M+2n+1}=U_{M+2n-1}$ eventually and $u_{M+2n}=u_{M+2n-2}$ eventually. Otherwise, we have $U_{M+2n+1}=U_{M+2n-1}$ eventually and $u_{M+2n}=u_{M+2n-2}$ eventually. Thus, $V_n=V_{n-1}=\max\left\{1,\frac{A}{bB}\right\}$ eventually, since $\lim_{n\to\infty}\frac{V_k}{u_{n-1}v_{n-1}\cdots u_kv_k}=0$. By (2.2) it follows that $U_{M+2n}=U_{M+2n-2}$ eventually and $u_{M+2n+1}=u_{M+2n-1}$ eventually.

    If $v_{M+2n}>1=u_n$ eventually for some $M\in\mathbb{N}$, then we may show in the same way that $\{(u_n,U_n,v_n,V_n)\}_{n=-1}^{+\infty}$ is eventually periodic with period 2. The proof is complete.

    Proposition 2.4. If $ab>1$, then $\{(u_n,U_n,v_n,V_n)\}_{n=-1}^{+\infty}$ is eventually periodic with period 2.

    Proof. By (2.5) we see that there exists $N\in\mathbb{N}$ such that for any $n\ge N$,

    $$\begin{cases}u_n=\max\left\{1,\frac{b}{aAU_{n-1}}\right\},\\ U_n=\max\left\{1,\frac{B}{aAu_{n-1}}\right\},\\ v_n=\max\left\{1,\frac{a}{bBV_{n-1}}\right\},\\ V_n=\max\left\{1,\frac{A}{bBv_{n-1}}\right\}.\end{cases}\tag{2.12}$$

    If $a<A$, then for $n\ge 2k+N$ with $k\in\mathbb{N}$,

    $$v_n=\max\left\{1,\frac{a}{bBV_{n-1}}\right\}\le\max\left\{1,\frac{a}{A}v_{n-2}\right\}\le\cdots\le\max\left\{1,\left(\frac{a}{A}\right)^kv_{n-2k}\right\},$$

    which implies $v_n=1$ eventually and $V_n=\max\left\{1,\frac{A}{bB}\right\}$ eventually.

    If $a=A$, then

    $$\begin{cases}v_n=\max\left\{1,\frac{a}{bBV_{n-1}}\right\}\le v_{n-2}\ \text{eventually},\\ V_n=\max\left\{1,\frac{A}{bBv_{n-1}}\right\}\le V_{n-2}\ \text{eventually},\end{cases}$$

    which implies

    $$\begin{cases}v_{n-2}\ge v_n=\max\left\{1,\frac{a}{bBV_{n-1}}\right\}\ge\max\left\{1,\frac{a}{bBV_{n-3}}\right\}=v_{n-2}\ \text{eventually},\\ V_{n-2}\ge V_n=\max\left\{1,\frac{A}{bBv_{n-1}}\right\}\ge\max\left\{1,\frac{A}{bBv_{n-3}}\right\}=V_{n-2}\ \text{eventually}.\end{cases}$$

    Thus, $V_n$ and $v_n$ are eventually periodic with period 2. In a similar way, we may also show that $U_n$ and $u_n$ are eventually periodic with period 2. The proof is complete.
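    Propositions 2.2–2.4 can likewise be probed numerically. The sketch below is ours, reuses iterate_22 and eventually_periodic from the earlier sketches, and the three parameter sets (covering $ab=AB=1$, $ab=1<AB$ and $ab>1$) are arbitrary examples.

```python
# Sketch (ours): probe eventual period 2 of (2.2) in the three cases ab = AB = 1,
# ab = 1 < AB, and ab > 1 (Propositions 2.2, 2.3 and 2.4).
for a, A, b, B in [(1.0, 1.0, 1.0, 1.0),
                   (0.5, 4.0, 2.0, 2.0),
                   (1.5, 2.0, 1.2, 1.4)]:
    u, U, v, V = iterate_22(a, A, b, B, 1.3, 2.1, 1.7, 1.9, 500)
    print([eventually_periodic(s, T=2) for s in (u, U, v, V)])   # expect all True
```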

    From (2.1), (2.2), Proposition 2.1, Proposition 2.2, Proposition 2.3 and Proposition 2.4 one has the following theorem.

    Theorem 2.1. (1) If $\min\{A_0C_1,B_0D_1,A_1C_0,B_1D_0\}<1$, then system (1.1) has unbounded solutions.

    (2) If $\min\{A_0C_1,B_0D_1,A_1C_0,B_1D_0\}\ge 1$, then every solution of system (1.1) is eventually periodic with period 4.
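    As a quick numerical companion to Theorem 2.1(2), the sketch below (ours; it reuses iterate_system and eventually_periodic from the first sketch, and the coefficients are arbitrary values satisfying the hypothesis) probes the eventual period-4 behavior of (1.1).

```python
# Sketch (ours): Theorem 2.1(2) predicts eventual periodicity with period 4
# whenever min{A0*C1, B0*D1, A1*C0, B1*D0} >= 1.
A, B, C, D = (2.0, 1.5), (1.2, 1.8), (0.9, 1.3), (1.1, 1.4)
assert min(A[0] * C[1], B[0] * D[1], A[1] * C[0], B[1] * D[0]) >= 1
orbit = iterate_system(A, B, C, D, init=[(0.7, 2.3)] * 4, N=600)
print([eventually_periodic(s, T=4) for s in orbit])   # expect [True, True, True, True]
```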

    In this paper, we study the eventual periodicity of the second-order max-type system of difference equations (1.1) with four variables and period-two parameters, and we obtain conditions on the coefficients that characterize whether every positive solution of (1.1) is eventually periodic. For further research, we plan to study the eventual periodicity of more general max-type systems of difference equations by the proof methods used in this paper.

    The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

    This work was supported by the NSF of Guangxi (2022GXNSFAA035552), the Guangxi First-class Discipline SCPF (2022SXZD01, 2022SXYB07), the Guangxi Key Laboratory BDFE (FED2204) and the Guangxi University of Finance and Economics LSEICIC (2022YB12).

    There is no conflict of interest in this article.



    [1] J. Barzilai, J. M. Borwein, Two-point step size gradient methods, IMA J. Numer. Anal., 8 (1988), 141-148. doi: 10.1093/imanum/8.1.141
    [2] E. Bisong, Batch vs. online learning, Building Machine Learning and Deep Learning Models on Google Cloud Platform, 2019.
    [3] L. Bottou, F. E. Curtis, J. Nocedal, Optimization methods for large-scale machine learning, SIAM Rev., 60 (2018), 223-311. doi: 10.1137/16M1080173
    [4] Y. H. Dai, Y. Yuan, Nonlinear conjugate gradient methods, Shanghai: Shanghai Scientific Technical Publishers, 2000.
    [5] D. Davis, B. Grimmer, Proximally guided stochastic subgradient method for nonsmooth, nonconvex problems, SIAM J. Optim., 29 (2019), 1908-1930. doi: 10.1137/17M1151031
    [6] R. Dehghani, N. Bidabadi, H. Fahs, M. M. Hosseini, A conjugate gradient method based on a modified secant relation for unconstrained optimization, Numer. Funct. Anal. Optim., 41 (2020), 621-634. doi: 10.1080/01630563.2019.1669641
    [7] P. Faramarzi, K. Amini, A modified spectral conjugate gradient method with global convergence, J. Optim. Theory Appl., 182 (2019), 667-690. doi: 10.1007/s10957-019-01527-6
    [8] R. Fletcher, C. M. Reeves, Function minimization by conjugate gradients, Comput. J., 7 (1964), 149-154. doi: 10.1093/comjnl/7.2.149
    [9] J. C. Gilbert, J. Nocedal, Global convergence properties of conjugate gradient methods for optimization, SIAM J. Optim., 2 (1992), 21-42. doi: 10.1137/0802003
    [10] W. W. Hager, H. Zhang, Algorithm 851: CG DESCENT, a conjugate gradient method with guaranteed descent, ACM Trans. Math. Software, 32 (2006), 113-137. doi: 10.1145/1132973.1132979
    [11] A. S. Halilu, M. Y. Waziri, Y. B. Musa, Inexact double step length method for solving systems of nonlinear equations, Stat. Optim. Inf. Comput., 8 (2020), 165-174. doi: 10.19139/soic-2310-5070-532
    [12] H. Jiang, P. Wilford, A stochastic conjugate gradient method for the approximation of functions, J. Comput. Appl. Math., 236 (2012), 2529-2544. doi: 10.1016/j.cam.2011.12.012
    [13] X. Jiang, J. Jian, Improved Fletcher-Reeves and Dai-Yuan conjugate gradient methods with the strong Wolfe line search, J. Comput. Appl. Math., 348 (2019), 525-534. doi: 10.1016/j.cam.2018.09.012
    [14] X. B. Jin, X. Y. Zhang, K. Huang, G. G. Geng, Stochastic conjugate gradient algorithm with variance reduction, IEEE Trans. Neural Networks Learn. Syst., 30 (2018), 1360-1369.
    [15] R. Johnson, T. Zhang, Accelerating stochastic gradient descent using predictive variance reduction, Advances in Neural Information Processing Systems, 2013.
    [16] X. L. Li, Preconditioned stochastic gradient descent, IEEE Trans. Neural Networks Learn. Syst., 29 (2018), 1454-1466. doi: 10.1109/TNNLS.2017.2672978
    [17] Y. Liu, X. Wang, T. Guo, A linearly convergent stochastic recursive gradient method for convex optimization, Optim. Lett., 2020. doi: 10.1007/s11590-020-01550-x
    [18] M. Lotfi, S. M. Hosseini, An efficient Dai-Liao type conjugate gradient method by reformulating the CG parameter in the search direction equation, J. Comput. Appl. Math., 371 (2020), 112708. doi: 10.1016/j.cam.2019.112708
    [19] S. Mandt, M. D. Hoffman, D. M. Blei, Stochastic gradient descent as approximate Bayesian inference, J. Mach. Learn. Res., 18 (2017), 4873-4907.
    [20] P. Moritz, R. Nishihara, M. I. Jordan, A linearly-convergent stochastic L-BFGS algorithm, Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, 2016.
    [21] L. M. Nguyen, J. Liu, K. Scheinberg, M. Takáč, SARAH: A novel method for machine learning problems using stochastic recursive gradient, Proceedings of the 34th International Conference on Machine Learning, 2017.
    [22] A. Nitanda, Accelerated stochastic gradient descent for minimizing finite sums, Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, 2016.
    [23] H. Robbins, S. Monro, A stochastic approximation method, Ann. Math. Statist., 22 (1951), 400-407. doi: 10.1214/aoms/1177729586
    [24] N. N. Schraudolph, T. Graepel, Combining conjugate direction methods with stochastic approximation of gradients, Proceedings of the 9th International Workshop on Artificial Intelligence and Statistics, 2003.
    [25] G. Shao, W. Xue, G. Yu, X. Zheng, Improved SVRG for finite sum structure optimization with application to binary classification, J. Ind. Manage. Optim., 16 (2020), 2253-2266.
    [26] C. Tan, S. Ma, Y. H. Dai, Y. Qian, Barzilai-Borwein step size for stochastic gradient descent, Advances in Neural Information Processing Systems, 2016.
    [27] P. Toulis, E. Airoldi, J. Rennie, Statistical analysis of stochastic gradient methods for generalized linear models, Proceedings of the 31st International Conference on Machine Learning, 2014.
    [28] V. Vapnik, The nature of statistical learning theory, New York: Springer, 1995.
    [29] L. Xiao, T. Zhang, A proximal stochastic gradient method with progressive variance reduction, SIAM J. Optim., 24 (2014), 2057-2075. doi: 10.1137/140961791
    [30] Z. Xu, Y. H. Dai, A stochastic approximation frame algorithm with adaptive directions, Numer. Math. Theory Methods Appl., 1 (2008), 460-474.
    [31] W. Xue, J. Ren, X. Zheng, Z. Liu, Y. Ling, A new DY conjugate gradient method and applications to image denoising, IEICE Trans. Inf. Syst., 101 (2018), 2984-2990.
    [32] Q. Zheng, X. Tian, N. Jiang, M. Yang, Layer-wise learning based stochastic gradient descent method for the optimization of deep convolutional neural network, J. Intell. Fuzzy Syst., 37 (2019), 5641-5654. doi: 10.3233/JIFS-190861
  • © 2021 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
