In this research, we develop a new method for predicting ordered data. The approach selects the best-fitting distribution using several tests, estimates its parameters, and constructs prediction intervals that draw on both observed and previously predicted data. Predicted values are added to the observed data one at a time. At each step, we identify a suitable distribution, estimate its parameters, and apply a prediction method, such as the pivotal quantity method or the modified least squares method based on the cumulative hazard function. We implemented the method in the R programming language and compared it with several established methods on datasets covering health insurance coverage, glass fiber strength, and COVID-19 recovery rates. The new method performed better in terms of mean squared error (MSE) and coefficient of variation (CV), predicted more data, and outperformed traditional methods in most scenarios. As shown through a simulation study and real data, the method can produce a large number of predicted observations, reaching about 150% to 200% of the number of real observations.
Citation: M. H. Harpy, O. M. Khaled, Mahmoud El-Morshedy, K. S. Khalil. An improvement in predictive modeling techniques with application to pivotal quantity and least square method[J]. AIMS Mathematics, 2025, 10(11): 25639-25666. doi: 10.3934/math.20251136
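The one-at-a-time scheme described in the abstract can be illustrated with a minimal Python sketch (the authors' own implementation is in R and is not reproduced here). Everything specific in it is an assumption made for illustration: the candidate models are limited to Weibull distributions with a fixed shape of 1 (exponential) or 2 (Rayleigh), the "best fit" is chosen by AIC rather than by the tests used in the paper, the data are treated as the r smallest values of a sample of size n, and the point prediction is the median of the classical exponential pivotal quantity. The function names are hypothetical.

```python
import math

def fit_fixed_shape(data, n, k):
    """Fit a Weibull with fixed shape k to the r smallest of n values.

    Assumption: X**k is exponential with mean theta, so the MLE is closed-form.
    Returns (total, theta, aic), where total is the total time on test for X**k.
    """
    r = len(data)
    total = sum(x ** k for x in data) + (n - r) * data[-1] ** k
    theta = total / r
    # Censored log-likelihood at the MLE (constant n!/(n-r)! dropped).
    loglik = (r * math.log(k) + (k - 1) * sum(math.log(x) for x in data)
              - r * math.log(theta) - r)
    aic = 2 * 1 - 2 * loglik  # one estimated parameter
    return total, theta, aic

def pivot_quantile(p, r):
    """p-quantile of W = (n-r)(Y_{r+1} - Y_r)/T, where P(W <= w) = 1 - (1+w)**(-r)."""
    return (1.0 - p) ** (-1.0 / r) - 1.0

def predict_next(data, n, shapes=(1.0, 2.0), level=0.90):
    """Select a model by AIC, then predict the next order statistic."""
    r = len(data)
    fits = {k: fit_fixed_shape(data, n, k) for k in shapes}
    k = min(fits, key=lambda s: fits[s][2])  # smallest AIC wins
    total = fits[k][0]
    y_r = data[-1] ** k

    def back(p):
        # Quantile on the exponential scale, mapped back to the data scale.
        return (y_r + total * pivot_quantile(p, r) / (n - r)) ** (1.0 / k)

    alpha = 1.0 - level
    return {"shape": k, "point": back(0.5),
            "lower": back(alpha / 2), "upper": back(1 - alpha / 2)}

def sequential_predict(observed, n, steps):
    """Predict one value at a time, feeding each prediction back into the data."""
    data = sorted(observed)
    predictions = []
    for _ in range(steps):
        if len(data) >= n:  # nothing left to predict
            break
        pred = predict_next(data, n)
        predictions.append(pred)
        data.append(pred["point"])  # the predicted value joins the data
    return predictions

# Illustrative (made-up) data: the 6 smallest values of a sample of size 10.
for p in sequential_predict([0.12, 0.25, 0.41, 0.58, 0.80, 1.05], n=10, steps=3):
    print(p)
```

Because each point prediction is appended before the next fit, both the model choice and the parameter estimate can change from step to step, which is the feature the abstract emphasizes. In this sketch, the intervals at later steps are conditional on the earlier point predictions.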