Research article

Convergence of online learning algorithm with a parameterized loss

  • Received: 19 July 2022; Revised: 27 August 2022; Accepted: 06 September 2022; Published: 13 September 2022
  • MSC: 41A25, 68Q32, 68T40, 90C25

  • The learning performance of machine learning algorithms is a central topic in learning theory, and the choice of loss function is one of the key factors that determines this performance. In this paper, we introduce a parameterized loss function into the online learning algorithm and investigate its performance. By applying convex analysis techniques, we prove the convergence of the learning sequence and provide its convergence rate in the expectation sense. The analysis shows that the convergence rate can be greatly improved by adjusting the parameter in the loss function. (An illustrative sketch of such an algorithm is given after the citation below.)

    Citation: Shuhua Wang. Convergence of online learning algorithm with a parameterized loss[J]. AIMS Mathematics, 2022, 7(11): 20066–20084. doi: 10.3934/math.20221098
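    The abstract does not specify the algorithm or the loss beyond "online learning with a parameterized loss," so the following Python sketch is only illustrative: it implements unregularized online gradient descent in a reproducing kernel Hilbert space (in the spirit of the Smale–Yao online learning framework) with a Huber-type loss whose parameter delta stands in for the paper's loss parameter. All names (gaussian_kernel, huber_loss_grad, OnlineKernelLearner) and the step-size schedule eta_t = eta0 / t^theta are assumptions made for this sketch, not taken from the paper.

    ```python
    import numpy as np

    def gaussian_kernel(x, y, sigma=1.0):
        """Gaussian kernel K(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
        return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))

    def huber_loss_grad(residual, delta):
        """Derivative of the Huber loss at `residual`.

        The Huber loss is quadratic for |r| <= delta and linear beyond, so
        delta controls a robustness/accuracy trade-off; it plays the role of
        the abstract's "parameter in the loss function" in this sketch.
        """
        return residual if abs(residual) <= delta else delta * np.sign(residual)

    class OnlineKernelLearner:
        """Unregularized online gradient descent in an RKHS (illustrative).

        After the t-th sample (x_t, y_t), update
            f_{t+1} = f_t - eta_t * L'(f_t(x_t) - y_t) * K(x_t, .),
        with step size eta_t = eta0 / t**theta, so f_t is a kernel
        expansion over the samples seen so far.
        """

        def __init__(self, delta=1.0, sigma=1.0, eta0=0.5, theta=0.5):
            self.delta, self.sigma = delta, sigma
            self.eta0, self.theta = eta0, theta
            self.centers = []   # past inputs x_i
            self.coefs = []     # coefficients of K(x_i, .) in f_t
            self.t = 0

        def predict(self, x):
            return sum(c * gaussian_kernel(z, x, self.sigma)
                       for c, z in zip(self.coefs, self.centers))

        def partial_fit(self, x, y):
            self.t += 1
            eta = self.eta0 / self.t ** self.theta
            grad = huber_loss_grad(self.predict(x) - y, self.delta)
            self.centers.append(np.asarray(x, dtype=float))
            self.coefs.append(-eta * grad)

    # Toy run: learn f(x) = sin(x) from a stream with heavy-tailed noise.
    rng = np.random.default_rng(0)
    learner = OnlineKernelLearner(delta=0.5)
    for _ in range(500):
        x = rng.uniform(-3.0, 3.0, size=1)
        y = float(np.sin(x[0]) + 0.1 * rng.standard_t(df=2))
        learner.partial_fit(x, y)
    print("f(1.0) =", learner.predict(np.array([1.0])))  # roughly sin(1) ~ 0.84
    ```

    In this sketch, a smaller delta makes each update less sensitive to outliers at the cost of extra bias; a trade-off governed by such a loss parameter is the kind of effect the paper's rate analysis quantifies in expectation.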

  • © 2022 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)