The beta regression model (BRM) is widely used for response data bounded within the interval (0, 1), with applications in chemistry, environmental science, medicine, and biology. Its unknown parameters are typically estimated with the maximum likelihood estimator (MLE). However, the MLE can be highly sensitive to multicollinearity and outliers, both of which can distort coefficient estimates, inflate their variance, and thereby increase the mean squared error (MSE), leading to misleading conclusions. To address these problems, this study proposes new robust estimators for the BRM that incorporate robust modified ridge-type estimators and are specifically designed to reduce the adverse effects of multicollinearity and outliers. Their performance is compared theoretically with that of the traditional MLE and robust ridge estimators, and an extensive simulation study evaluates their effectiveness across various scenarios. Both the theoretical comparisons and the simulation results demonstrate the advantages of the proposed robust estimators in handling multicollinearity and outliers. To further validate the findings, the estimators are applied to real-world data from breast cancer patients. The results confirm that the proposed estimators are more robust and reliable than the MLE and robust ridge methods. These findings highlight the practical importance of robust estimation techniques for improving the accuracy and dependability of BRMs, particularly in empirical research involving data with strong multicollinearity and outliers.
Citation: Ali T. Hammad, I. Elbatal, Ehab M. Almetwally, M. M. Abd El-Raouf, M. A. El-Qurashi, Ahmed M. Gemeay. A novel robust estimator for addressing multicollinearity and outliers in Beta regression: simulation and application[J]. AIMS Mathematics, 2025, 10(9): 21549-21580. doi: 10.3934/math.2025958
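The abstract's claim that multicollinearity inflates variance and MSE, and that ridge-type shrinkage can counter it, can be illustrated with a small Monte Carlo sketch. This is not the robust modified ridge-type estimator proposed in the paper and it does not fit a beta regression. It assumes an ordinary linear model with two correlated predictors and the classical ridge estimator, and it ignores outliers. The function name `simulate_mse`, the true coefficients, and the default values of `rho`, `n`, `k`, and `reps` are all illustrative assumptions.

```python
import numpy as np


def simulate_mse(rho=0.99, n=50, k=1.0, reps=2000, seed=0):
    """Monte Carlo MSE of OLS and classical ridge coefficient estimates.

    Illustrative linear-model setup (an assumption, not the paper's BRM):
    two standard-normal predictors with correlation rho, unit-variance noise.
    Returns (mse_ols, mse_ridge), each the average squared estimation error.
    """
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, 1.0])  # hypothetical true coefficients
    cov = np.array([[1.0, rho], [rho, 1.0]])
    chol = np.linalg.cholesky(cov)  # used to induce the target correlation

    se_ols = 0.0
    se_ridge = 0.0
    for _ in range(reps):
        X = rng.standard_normal((n, 2)) @ chol.T
        y = X @ beta + rng.standard_normal(n)
        xtx = X.T @ X
        xty = X.T @ y
        b_ols = np.linalg.solve(xtx, xty)  # least squares (MLE here)
        b_ridge = np.linalg.solve(xtx + k * np.eye(2), xty)  # ridge with penalty k
        se_ols += np.sum((b_ols - beta) ** 2)
        se_ridge += np.sum((b_ridge - beta) ** 2)
    return se_ols / reps, se_ridge / reps


if __name__ == "__main__":
    for rho in (0.0, 0.9, 0.99):
        mse_ols, mse_ridge = simulate_mse(rho=rho)
        print(f"rho={rho:.2f}  MSE(OLS)={mse_ols:.4f}  MSE(ridge)={mse_ridge:.4f}")
```

In this setup the unpenalized estimator's MSE grows sharply as `rho` approaches 1, while the ridge penalty trades a small bias for a large reduction in variance. The fixed penalty `k` is chosen only for illustration; the paper's estimators and any data-driven choice of shrinkage parameters are not reproduced here.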