Research article

Coefficient-based regularized distribution regression under the moment conditions

  • Published: 09 January 2026
  • In this paper, we investigate coefficient-based regularized distribution regression for data generated by unbounded sampling processes. The algorithm adopts a two-stage sampling framework: the first-stage sample consists of probability distributions, and the second-stage samples are drawn from those distributions. We conduct a rigorous capacity-dependent convergence analysis under these more general conditions and show that the resulting performance is comparable to that of one-stage sampling learning. Regularization is imposed on the coefficients rather than on an RKHS norm, so the kernel $ K $ is permitted to be indefinite. An important feature of the algorithm is that it can alleviate the saturation effect suffered by classical kernel ridge regression (KRR). Notably, the output sample values are only assumed to satisfy a moment condition, rather than the stricter uniform boundedness constraint common in related works. We derive convergence error bounds via novel integral operator techniques and further establish minimax optimal learning rates for the algorithm, comparable to those achieved in bounded sampling settings.
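
    The two-stage estimator can be made concrete with a short sketch. The code below is a minimal illustration under stated assumptions, not the paper's exact scheme: both the input-space kernel $ k $ and the second-level kernel $ K $ are taken to be Gaussian, the empirical mean embedding of each second-stage sample is $ \hat{\mu}_i = \frac{1}{n_i}\sum_{j} k(\cdot, x_{i,j}) $, and the $ \ell^2 $ coefficient-regularized problem $ \min_{\alpha\in\mathbb{R}^m} \frac{1}{m}\|\mathbf{K}\alpha - y\|_2^2 + \lambda\|\alpha\|_2^2 $ is solved through its normal equations. All function names and the scaling of $ \lambda $ are illustrative assumptions.

        import numpy as np

        def embedding_inner_products(bags, gamma=1.0):
            """Pairwise inner products <mu_i, mu_j>_H of empirical kernel mean
            embeddings, using a Gaussian kernel k on the input space."""
            m = len(bags)
            E = np.empty((m, m))
            for i in range(m):
                for j in range(m):
                    d2 = (np.sum(bags[i]**2, axis=1)[:, None]
                          + np.sum(bags[j]**2, axis=1)[None, :]
                          - 2.0 * bags[i] @ bags[j].T)
                    # Mean of k(x, x') over both bags = <mu_i, mu_j>_H.
                    E[i, j] = np.exp(-gamma * d2).mean()
            return E

        def fit_coefficients(bags, y, lam, gamma=1.0, eta=1.0):
            """Illustrative l2 coefficient-regularized distribution regression.

            bags : list of (n_i, d) arrays of second-stage samples, one bag per
                   first-stage distribution; y : (m,) real outputs.
            Returns alpha with f(mu) = sum_i alpha_i K(mu_i, mu)."""
            m = len(bags)
            E = embedding_inner_products(bags, gamma)
            # Squared RKHS distances ||mu_i - mu_j||_H^2 between embeddings,
            # then a second-level Gaussian kernel K on the embeddings.
            D2 = np.diag(E)[:, None] + np.diag(E)[None, :] - 2.0 * E
            K = np.exp(-eta * D2)
            # Normal equations of min (1/m)||K a - y||^2 + lam ||a||^2:
            # (K^T K + lam * m * I) a = K^T y.
            return np.linalg.solve(K.T @ K + lam * m * np.eye(m), K.T @ y)

    Because only the positive semi-definite matrix $ \mathbf{K}^{\top}\mathbf{K} $ enters the inversion, the solve step remains well defined when $ K $ is indefinite; the Gaussian second-level kernel used here happens to be positive definite, but nothing in the scheme relies on that. A prediction for a new bag $ x $ is $ \sum_{i} \alpha_i K(\hat{\mu}_i, \hat{\mu}_x) $, computed from the same embedding inner products.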

    Citation: Qin Guo, Shuli Liu. Coefficient-based regularized distribution regression under the moment conditions[J]. Electronic Research Archive, 2026, 34(1): 291-317. doi: 10.3934/era.2026014

  • © 2026 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)