We introduce a novel concentration inequality for the sum of independent sub-Gaussian variables with random dependent weights, in statistical settings for high-dimensional data. The random dependent weights are functions of some regularized estimators. We apply the proposed concentration inequality to obtain a high-probability bound for the stochastic Lipschitz constant of the negative binomial loss functions involved in Lasso-penalized negative binomial regression. We use this bound to study oracle inequalities for Lasso estimators. Additionally, we derive a similar concentration inequality for a randomly weighted sum of independent centred exponential family variables.
Citation: Huiming Zhang, Hengzhen Huang. Concentration for multiplier empirical processes with dependent weights[J]. AIMS Mathematics, 2023, 8(12): 28738-28752. doi: 10.3934/math.20231471