Research article Special Issues

A weighted hybrid discrete probability model: Mathematical framework, statistical analysis, estimation techniques, simulation-based ranking, and goodness-of-fit evaluation for over-dispersed data

  • Published: 14 April 2025
  • Data modeling plays a crucial role in many research domains because of its wide practical applications, especially when handling complex datasets. This study explored a one-parameter discrete distribution developed using the weighted combining discretization method. The statistical properties of this distribution were derived mathematically, covering moments, skewness, kurtosis, covariance, index of dispersion, order statistics, entropies, mean residual life, the residual coefficient of variation function, stress-strength models, and premium principles. These properties highlighted the model's suitability for right-skewed, heavy-tailed data, making it a useful tool for probabilistic modeling when data exhibit overdispersion and increasing failure rates. The research presented a range of estimation techniques, each explained in detail: maximum product of spacings, method of moments, Anderson-Darling, right-tail Anderson-Darling, maximum likelihood, least squares, weighted least squares, Cramér-von Mises, and percentile. A simulation study assessed the performance of these estimators, with ranking techniques used to determine the most effective estimator across various sample sizes. The study further applied the proposed model to real-world datasets, where it handled complex data scenarios and outperformed traditional models such as the geometric, Poisson, and negative binomial distributions. Overall, the results emphasized the proposed model's potential as a versatile and effective tool for modeling over-dispersed and skewed data, with promising implications for future research in diverse fields.
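
    The comparison against classical baselines described in the abstract can be illustrated with a minimal sketch. It computes the index of dispersion (variance divided by mean, where values above 1 indicate overdispersion) and compares one-parameter Poisson and geometric fits by AIC. The proposed model's probability mass function is not stated in this abstract, so it is not included; the function names and the sample counts below are hypothetical and chosen only for illustration.

```python
import math
from statistics import mean, variance


def index_of_dispersion(counts):
    """Sample variance divided by sample mean; > 1 suggests overdispersion."""
    return variance(counts) / mean(counts)


def poisson_loglik(counts):
    """Log-likelihood at the Poisson MLE (lambda = sample mean, assumed > 0)."""
    lam = mean(counts)
    return sum(x * math.log(lam) - lam - math.lgamma(x + 1) for x in counts)


def geometric_loglik(counts):
    """Log-likelihood at the geometric MLE on support {0, 1, 2, ...}.

    Uses P(X = x) = p * (1 - p)**x with p = 1 / (1 + sample mean),
    and assumes the sample mean is positive.
    """
    p = 1.0 / (1.0 + mean(counts))
    return sum(math.log(p) + x * math.log(1.0 - p) for x in counts)


def aic(loglik, n_params=1):
    """Akaike information criterion; lower values indicate a better fit."""
    return 2 * n_params - 2 * loglik


# Hypothetical count data, invented for this illustration only.
sample = [0, 0, 1, 0, 3, 7, 0, 2, 12, 1, 0, 5]

print("index of dispersion:", round(index_of_dispersion(sample), 3))
print("AIC Poisson:  ", round(aic(poisson_loglik(sample)), 3))
print("AIC geometric:", round(aic(geometric_loglik(sample)), 3))
```

    For counts like these, whose variance far exceeds the mean, the Poisson fit is penalized because it forces the variance to equal the mean. This is the kind of gap that a dedicated overdispersed model is meant to close.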

    Citation: Mahmoud El-Morshedy, Mohamed S. Eliwa, Mohamed El-Dawoody, Hend S. Shahen. A weighted hybrid discrete probability model: Mathematical framework, statistical analysis, estimation techniques, simulation-based ranking, and goodness-of-fit evaluation for over-dispersed data. Electronic Research Archive, 2025, 33(4): 2061–2091. doi: 10.3934/era.2025091

  • © 2025 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)

Figures and Tables

Figures(12)  /  Tables(16)
