Predicting a functional response from scalar predictors is challenging, especially with complex data structures. Traditional function-on-scalar regression (FOSR) methods emphasize smoothness or sparsity, but few address group structures in functional data. To address this gap, we introduce the network function-on-scalar Lasso (NFL), an innovative FOSR model that integrates simultaneous clustering and optimization (SCO) principles. The NFL model introduces a graph-structured sum-of-norms regularization to encourage similar functional responses for related observations (e.g., neighboring regions), while also performing sparse variable selection. An efficient semi-proximal alternating direction method of multipliers (ADMM) algorithm is developed for model estimation, scaling to high-dimensional functional data. We provide theoretical guarantees for the NFL estimator under regularity conditions, ensuring model accuracy and insight into its clustering consistency. Simulations and an environmental application predicting US county-level air quality trends demonstrate the NFL's superior prediction accuracy and ability to uncover meaningful group structures compared to existing methods.
Citation: Shan Sha, Yan Li. Simultaneous clustering and optimization in function-on-scalar regression[J]. AIMS Mathematics, 2025, 10(8): 17518-17542. doi: 10.3934/math.2025783
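To make the abstract's description of the penalty more concrete, the following is a minimal sketch of what an NFL-style objective could look like on discretized curves. It is not the authors' formulation: the node-specific coefficient array `B`, the squared-error loss, the edge weights, and the names `nfl_style_objective`, `block_soft_threshold`, `lam_net`, and `lam_sparse` are all assumptions made for illustration. The sketch combines a sum-of-norms penalty over graph edges (pulling neighboring observations toward shared coefficient functions) with a group penalty on each coefficient function (encouraging sparse variable selection). It also shows block soft-thresholding, the standard proximal operator of a Euclidean norm, which ADMM-type solvers for such penalties typically use.

```python
import numpy as np


def nfl_style_objective(X, Y, B, edges, weights, lam_net, lam_sparse):
    """Evaluate an illustrative NFL-style objective (assumed form, not the paper's).

    X: (n, p) scalar predictors; Y: (n, m) responses on an m-point grid;
    B: (n, p, m) node-specific coefficient functions on the same grid;
    edges: list of (i, j) node pairs; weights: one weight per edge.
    """
    # Fitted curve for observation i: sum_k X[i, k] * B[i, k, :]
    fitted = np.einsum("ip,ipm->im", X, B)
    loss = 0.5 * np.sum((Y - fitted) ** 2)

    # Sum-of-norms over edges: Frobenius norm of coefficient differences
    net = sum(w * np.linalg.norm(B[i] - B[j]) for (i, j), w in zip(edges, weights))

    # Group penalty: Euclidean norm of each discretized coefficient function
    sparse = np.sum(np.linalg.norm(B, axis=2))

    return loss + lam_net * net + lam_sparse * sparse


def block_soft_threshold(v, tau):
    """Proximal operator of tau * ||v||_2 (shrinks the whole block toward zero)."""
    v = np.asarray(v, dtype=float)
    norm = np.linalg.norm(v)
    if norm <= tau:
        return np.zeros_like(v)
    return (1.0 - tau / norm) * v


# Tiny example: two connected observations, one predictor, two grid points
X = np.array([[1.0], [1.0]])
Y = np.zeros((2, 2))
B = np.zeros((2, 1, 2))
B[0, 0] = [3.0, 4.0]
value = nfl_style_objective(X, Y, B, edges=[(0, 1)], weights=[1.0],
                            lam_net=2.0, lam_sparse=3.0)
shrunk = block_soft_threshold([3.0, 4.0], tau=1.0)
```

In this toy setup, a large `lam_net` would make it costly for the two connected observations to keep different coefficient functions, which is the mechanism by which a sum-of-norms penalty can induce clusters.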