In this paper, we investigated coefficient-based regularized distribution regression for data generated by unbounded sampling processes. The algorithm adopts a two-stage sampling framework: the first-stage sample consists of probability distributions, from which the second-stage samples are drawn. A rigorous capacity-dependent convergence analysis was conducted under more general conditions, and the resulting performance is comparable to that of one-stage sampling learning. Regularization was imposed directly on the coefficients, so the kernel $ K $ was permitted to be indefinite. An important feature of this algorithm is that it alleviates the saturation effect suffered by classical kernel ridge regression (KRR). Notably, the output sample values were only assumed to satisfy a moment condition, rather than the stricter uniform boundedness constraint common in related works. We derived convergence error bounds via novel integral operator techniques, and further established minimax optimal learning rates for the algorithm, comparable to those achieved under bounded sampling settings.
Citation: Qin Guo, Shuli Liu. Coefficient-based regularized distribution regression under the moment conditions[J]. Electronic Research Archive, 2026, 34(1): 291-317. doi: 10.3934/era.2026014
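As a concrete illustration of the two-stage scheme described in the abstract, the following is a minimal sketch, not the paper's algorithm: it assumes a Gaussian base kernel, empirical kernel mean embeddings of the second-stage samples, and an $\ell_2$ penalty imposed on the coefficients rather than on an RKHS norm (which is what permits an indefinite kernel). All function names (`fit_coefficients`, `predict`) and parameter choices (`gamma`, `lam`) are illustrative assumptions.

```python
import numpy as np

def base_kernel(X, Y, gamma=1.0):
    # Gaussian kernel matrix between point sets X (n, d) and Y (m, d).
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def embedding_kernel(samples_i, samples_j, gamma=1.0):
    # K(D_i, D_j) approximated by the average of k(x, x') over the
    # second-stage draws: the inner product of empirical mean embeddings.
    return base_kernel(samples_i, samples_j, gamma).mean()

def fit_coefficients(samples, y, lam=1e-3, gamma=1.0):
    # Gram matrix over the first-stage distributions, each represented
    # by its second-stage sample. Symmetry holds, but positive
    # definiteness is never used below.
    m = len(samples)
    K = np.array([[embedding_kernel(samples[i], samples[j], gamma)
                   for j in range(m)] for i in range(m)])
    # Coefficient-based regularization:
    #   min_c (1/m) * ||K c - y||^2 + lam * ||c||^2,
    # with closed-form solution c = (K^T K + lam * m * I)^{-1} K^T y.
    c = np.linalg.solve(K.T @ K + lam * m * np.eye(m), K.T @ y)
    return c, K

def predict(c, train_samples, test_samples, gamma=1.0):
    # f(D) = sum_i c_i * K(D_i, D) for a new distribution D,
    # again represented by a finite second-stage sample.
    k = np.array([embedding_kernel(s, test_samples, gamma)
                  for s in train_samples])
    return c @ k
```

On a toy problem where each first-stage distribution is $N(t_i, 0.1^2)$ and the label is its mean $t_i$, the fitted predictor recovers the mean of a freshly sampled test distribution to within the sampling noise, illustrating how the coefficient-regularized solve replaces the usual KRR system $(K + \lambda m I)c = y$.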