On the reproducibility of survival quantile decisions beyond Greenwood-based precision

Norah D. Alshahrani

doi:10.3934/math.2026379

AIMS Mathematics

2026, Volume 11, Issue 4: 9191-9209. doi: 10.3934/math.2026379

Research article

Norah D. Alshahrani

Department of Mathematics, College of Science, University of Bisha, P.O. Box 551, Bisha 61922, Bisha, Saudi Arabia

Received: 14 February 2026; Revised: 26 March 2026; Accepted: 31 March 2026; Published: 03 April 2026
MSC: 62F03, 62F12, 62G10, 62G20, 62N01

Abstract: Greenwood-based confidence intervals are widely used to quantify uncertainty in survival quantile estimation based on the Kaplan-Meier estimator, and narrow intervals are often interpreted as evidence of stable and reliable inference. However, such numerical precision does not directly address the reproducibility of inferential conclusions under repeated sampling. This paper investigates the relationship between Greenwood-based confidence-interval precision and reproducibility in survival quantile inference. Reproducibility is quantified using the reproducibility probability (RP), defined as the probability that a survival quantile estimate is reproduced within a specified tolerance under repeated sampling, along with its decision-based analogue RP$(D)$ for two-group survival comparisons. Both measures are estimated via a nonparametric bootstrap framework under a fixed study design and sample size. Extensive simulations are conducted for single-group and two-group settings under exponential, Weibull, and lognormal survival distributions, with independent and dependent right-censoring. The results show that Greenwood-based confidence-interval width is not a reliable indicator of reproducibility: narrow intervals may coexist with low RP, whereas wider intervals may be associated with high RP, depending on the distribution, censoring mechanism, and inferential target. In two-group comparisons, decision reproducibility is driven primarily by the stability of the ordering between group-specific quantiles rather than by the numerical precision of individual estimates, and under dependent censoring, decision reproducibility can be high even when confidence intervals are wide. These findings highlight a fundamental distinction between numerical precision and inferential reproducibility in survival analysis and underscore the need to assess reproducibility alongside conventional confidence-interval reporting.
Keywords: survival analysis; Kaplan-Meier estimator; Greenwood variance; reproducibility probability; survival quantiles; decision reproducibility; censoring mechanisms
Citation: Norah D. Alshahrani. On the reproducibility of survival quantile decisions beyond Greenwood-based precision[J]. AIMS Mathematics, 2026, 11(4): 9191-9209. doi: 10.3934/math.2026379
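
As a hedged illustration of the quantities named in the abstract, the Python sketch below estimates RP and RP$(D)$ by nonparametric bootstrap. It is not the author's implementation. The abstract does not state the exact tolerance rule or decision rule, so the absolute tolerance `tol`, the ordering-based decision, the quantile convention (smallest $t$ with $\hat{S}(t) \le 1 - p$), the simulated data, and all function names (`km_quantile`, `bootstrap_rp`, `bootstrap_rp_decision`) are assumptions made for this example.

```python
import numpy as np

def km_quantile(time, event, p=0.5):
    """Kaplan-Meier p-th quantile: smallest event time t with S(t) <= 1 - p.
    Returns np.inf when the estimated curve never drops that far."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    surv = 1.0
    for t in np.unique(time[event]):          # distinct event times, ascending
        at_risk = np.sum(time >= t)
        deaths = np.sum((time == t) & event)
        surv *= 1.0 - deaths / at_risk        # product-limit update
        if surv <= 1.0 - p + 1e-12:           # small guard against rounding error
            return float(t)
    return np.inf

def bootstrap_rp(time, event, p=0.5, tol=0.1, n_boot=500, rng=None):
    """Assumed form of RP: share of bootstrap quantile estimates that fall
    within an absolute tolerance `tol` of the original estimate."""
    rng = np.random.default_rng(rng)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    n = len(time)
    q_hat = km_quantile(time, event, p)
    if not np.isfinite(q_hat):
        return float("nan")                   # quantile not estimable
    hits = 0
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)      # resample subjects with replacement
        q_star = km_quantile(time[idx], event[idx], p)
        if np.isfinite(q_star) and abs(q_star - q_hat) <= tol:
            hits += 1
    return hits / n_boot

def _ordering(q1, q2):
    """Assumed decision: +1 if q1 > q2, -1 if q1 < q2, 0 if tied."""
    return int(q1 > q2) - int(q1 < q2)

def bootstrap_rp_decision(t1, e1, t2, e2, p=0.5, n_boot=500, rng=None):
    """Assumed form of RP(D): share of bootstrap replicates in which the
    ordering of the two group quantiles matches the original ordering."""
    rng = np.random.default_rng(rng)
    t1, t2 = np.asarray(t1, dtype=float), np.asarray(t2, dtype=float)
    e1, e2 = np.asarray(e1, dtype=bool), np.asarray(e2, dtype=bool)
    d_hat = _ordering(km_quantile(t1, e1, p), km_quantile(t2, e2, p))
    same = 0
    for _ in range(n_boot):
        i1 = rng.integers(0, len(t1), size=len(t1))   # resample within group 1
        i2 = rng.integers(0, len(t2), size=len(t2))   # resample within group 2
        d_star = _ordering(km_quantile(t1[i1], e1[i1], p),
                           km_quantile(t2[i2], e2[i2], p))
        same += int(d_star == d_hat)
    return same / n_boot

# Illustrative data only: exponential survival with independent right-censoring.
gen = np.random.default_rng(0)

def simulate(scale, n=200):
    t, c = gen.exponential(scale, n), gen.exponential(3.0, n)
    return np.minimum(t, c), t <= c           # observed time, event indicator

ta, ea = simulate(1.0)
tb, eb = simulate(1.5)
print("RP   :", bootstrap_rp(ta, ea, tol=0.1, rng=1))
print("RP(D):", bootstrap_rp_decision(ta, ea, tb, eb, rng=1))
```

Resampling within each group keeps the group sizes fixed, in line with the fixed study design and sample size described in the abstract. Under these assumed definitions, RP depends on the chosen tolerance, whereas RP$(D)$ depends only on whether the ordering of the two quantiles is preserved, which is one way to see why interval width and decision reproducibility need not move together.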


© 2026 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)