Research article Special Issues

Novel robust logistic regression estimators for effectively modeling of multivariate binary data under outliers and multicollinearity: application to heavy metal contamination data in Al-Kharj landfills

  • Published: 05 June 2026
  • MSC : 62J07, 62J10, 62J12, 62P12

  • Logistic regression models are widely used for analyzing binary data, with the maximum likelihood estimator (MLE) being the standard method for estimating the coefficients. However, the MLE becomes unstable and unreliable in the presence of multicollinearity or outliers. Outliers distort parameter estimates by unduly influencing the likelihood function, leading to bias and poor prediction. Multicollinearity inflates the variance of the estimated coefficients, reducing their stability and interpretability. While biased estimators exist for multicollinearity and robust estimators exist for outliers, a unified framework that handles both problems simultaneously is still lacking. To fill this gap, we propose a class of robust ridge-type estimators that combine robust logistic estimation with shrinkage methods. A comprehensive Monte Carlo simulation study evaluates the proposed estimators under varying levels of outliers and multicollinearity. The results show that the proposed estimators consistently outperform the traditional MLE and existing estimators in terms of accuracy and robustness. Finally, we demonstrate their practical utility by analyzing heavy metal and metalloid contamination levels in landfill sites in Al-Kharj, Saudi Arabia; the empirical findings confirm that the proposed robust logistic estimators provide reliable and efficient inference when both multicollinearity and outliers are present.
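The shrinkage idea behind a ridge-type logistic estimator can be illustrated with a minimal sketch. It assumes the classical ridge logistic form (X'WX + kI)^{-1} X'WX b applied to a pilot estimate b, where b could be the MLE or a robust fit. This is not the estimator proposed in the article, whose exact definition is not given in the abstract. The function name `ridge_type_shrink`, the synthetic data, and the pilot vector `beta_pilot` are all hypothetical and used for illustration only.

```python
import numpy as np

def ridge_type_shrink(X, beta_pilot, k):
    """Shrink a pilot logistic estimate: (X'WX + kI)^{-1} X'WX beta_pilot."""
    eta = X @ beta_pilot
    p = 1.0 / (1.0 + np.exp(-eta))      # fitted probabilities at the pilot estimate
    w = p * (1.0 - p)                   # logistic weights (diagonal of W)
    A = X.T @ (w[:, None] * X)          # weighted information matrix X'WX
    return np.linalg.solve(A + k * np.eye(X.shape[1]), A @ beta_pilot)

# Synthetic data (assumption): two nearly collinear predictors plus one independent.
rng = np.random.default_rng(0)
n = 200
z = rng.normal(size=n)
X = np.column_stack([
    z + 0.01 * rng.normal(size=n),
    z + 0.01 * rng.normal(size=n),
    rng.normal(size=n),
])

# Hypothetical pilot estimate standing in for an MLE or a robust logistic fit.
beta_pilot = np.array([4.0, -3.0, 0.5])

beta_k0 = ridge_type_shrink(X, beta_pilot, k=0.0)   # no shrinkage: returns the pilot
beta_k1 = ridge_type_shrink(X, beta_pilot, k=1.0)   # shrunk toward zero
```

With k = 0 the pilot estimate is returned unchanged. For any k > 0 the Euclidean norm of the result is strictly smaller than that of the pilot, which is how the variance inflation caused by near-collinear predictors is traded for a small bias. Replacing the pilot with an outlier-resistant fit is one plausible way to combine robustness with shrinkage, in the spirit of the framework described above.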

    Citation: Eslam Hussam, Yousef Alharbi, Ahmed M. Gemeay, Samirah Alzubaidi, M. H. Harpy, Ramy Aldallal, M. S. Mohamed, Ali T. Hammad. Novel robust logistic regression estimators for effectively modeling of multivariate binary data under outliers and multicollinearity: application to heavy metal contamination data in Al-Kharj landfills[J]. AIMS Mathematics, 2026, 11(6): 16095-16128. doi: 10.3934/math.2026662

  • © 2026 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)