Probability proportional to size (PPS) sampling is frequently applied in engineering and industrial surveys when auxiliary size measures are available and the population units are substantially heterogeneous. However, classical PPS estimators are highly sensitive to outliers and influential observations, which are common in engineering data because of measurement errors, extreme operating conditions, or structural variability. This paper presents a robust method for estimating the finite population mean under PPS sampling using auxiliary information. The proposed estimator incorporates robust influence functions into the design-based PPS framework, limiting the impact of extreme sample units through controlled weighting without violating the underlying sampling design. Theoretical properties of the estimator, namely design unbiasedness, consistency, and asymptotic variance, are examined and compared with those of traditional PPS estimators. Numerical results show that the robust estimator substantially reduces the mean squared error relative to existing counterparts. The practical usefulness of the proposed methodology is demonstrated on an engineering dataset of operational performance measurements. The empirical results indicate that the robust estimator delivers more stable and reliable population estimates than conventional PPS-based approaches. The results show that robustness remains critical in survey estimation for engineering problems, and that the method offers a practical analytical tool for analysts working with heterogeneous and contaminated data. The proposed method provides an effective and flexible alternative for robust population estimation wherever PPS sampling designs are applicable.
Citation: L. S. Diab, Norah D. Alshahrani, Majdah Mohammed Badr, Sohaib Ahmad. Robust population estimation under PPS sampling: An application to engineering data[J]. AIMS Mathematics, 2026, 11(5): 13008-13022. doi: 10.3934/math.2026535
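To make the idea of controlled weighting under PPS sampling concrete, the following is a minimal sketch and not the estimator proposed in the paper. It assumes PPS sampling with replacement, uses the standard Hansen-Hurwitz estimator of the mean as the classical baseline, and swaps in a Huber-type winsorization of the expanded values as one possible robust variant. The function names, the tuning constant `c = 1.5`, and the median/MAD scale are illustrative assumptions; winsorizing generally introduces some design bias, so this sketch does not reproduce the unbiasedness property discussed in the abstract.

```python
import random
import statistics


def pps_sample(sizes, n, rng):
    """Draw n unit indices with replacement, with selection
    probability proportional to the auxiliary size measure."""
    return rng.choices(range(len(sizes)), weights=sizes, k=n)


def expanded_values(y, sizes, sample):
    """Return z_i = y_i / (N * p_i) for each sampled unit,
    where p_i = x_i / sum(x) is the single-draw probability."""
    N = len(sizes)
    total = sum(sizes)
    return [y[i] / (N * sizes[i] / total) for i in sample]


def hansen_hurwitz_mean(z):
    """Classical PPS (with replacement) estimator of the population mean:
    the plain average of the expanded values."""
    return sum(z) / len(z)


def huber_winsorized_mean(z, c=1.5):
    """Illustrative robust variant (an assumption, not the paper's method):
    clip expanded values to median +/- c * scaled MAD, then average."""
    med = statistics.median(z)
    mad = statistics.median([abs(v - med) for v in z])
    scale = 1.4826 * mad  # MAD rescaled to match the normal standard deviation
    if scale == 0:
        return hansen_hurwitz_mean(z)  # no spread to clip against
    lo, hi = med - c * scale, med + c * scale
    clipped = [min(max(v, lo), hi) for v in z]
    return sum(clipped) / len(clipped)
```

A typical use would draw `sample = pps_sample(sizes, n, random.Random(seed))`, compute `z = expanded_values(y, sizes, sample)`, and compare `hansen_hurwitz_mean(z)` with `huber_winsorized_mean(z)`. When the sample contains a unit whose expanded value is far from the rest, the clipped average moves much less than the plain average, which is the behaviour the abstract attributes to robust weighting.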