### Mathematical Biosciences and Engineering

2020, Issue 5: 4544-4562. doi: 10.3934/mbe.2020251
Research article Special Issues

# Predicting disease risks by matching quantiles estimation for censored data

• Received: 29 March 2020 Accepted: 18 June 2020 Published: 29 June 2020
• In time-to-event data analysis, it is often of interest to predict quantities such as the t-year survival rate or the survival function over a continuum of time. A commonly used approach is to relate the survival time to the covariates through a semiparametric regression model and then use the fitted model for prediction, which usually amounts to direct estimation of the conditional hazard function or the conditional estimating equation. The prediction accuracy of this approach, however, relies on correct specification of the covariate-survival association, which is often difficult in practice, especially when patient populations are heterogeneous or the underlying model is complex. In this paper, from a prediction perspective, we propose a disease-risk prediction approach that matches an optimal combination of covariates with the survival time in terms of distribution quantiles. The proposed method is easy to implement and works flexibly without assuming an a priori model. The redistribution-of-mass technique is adopted to accommodate censoring. We establish theoretical properties of the proposed method. Simulation studies and a real data example are also provided to further illustrate its practical utility.

Citation: Peng Wu, Baosheng Liang, Yifan Xia, Xingwei Tong. Predicting disease risks by matching quantiles estimation for censored data[J]. Mathematical Biosciences and Engineering, 2020, 17(5): 4544-4562. doi: 10.3934/mbe.2020251
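The abstract mentions a redistribution-of-mass technique for handling censoring. As a rough illustration only, the sketch below implements the classical redistribute-to-the-right idea: each observation starts with mass 1/n, and the mass of every censored observation is shared equally among the observations to its right. It then reads quantiles of the survival time off the resulting weights. The function names `redistribute_to_right` and `weighted_quantile` and the toy data are hypothetical, and this is not the authors' implementation; the weighting scheme actually used in the paper may differ (for example, it may depend on covariates).

```python
def redistribute_to_right(times, events):
    """Assign probability mass to observations under right censoring.

    times  : observed times (event or censoring time)
    events : 1 if the event was observed, 0 if censored
    Returns the sorted times, sorted event indicators, and the mass
    attached to each sorted observation.
    """
    n = len(times)
    # Sort by time; at tied times, events are placed before censored points.
    order = sorted(range(n), key=lambda i: (times[i], -events[i]))
    t = [times[i] for i in order]
    d = [events[i] for i in order]
    mass = [1.0 / n] * n
    for i in range(n):
        # A censored point passes its mass equally to all later points.
        # If the last point is censored, its mass simply stays there.
        if d[i] == 0 and i < n - 1:
            share = mass[i] / (n - 1 - i)
            for j in range(i + 1, n):
                mass[j] += share
            mass[i] = 0.0
    return t, d, mass


def weighted_quantile(t, mass, p):
    """Smallest sorted time whose cumulative mass reaches level p."""
    cum = 0.0
    for ti, mi in zip(t, mass):
        cum += mi
        if cum >= p - 1e-12:
            return ti
    return t[-1]


# Toy data (assumed for illustration): five subjects, two censored.
t, d, mass = redistribute_to_right([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
median_time = weighted_quantile(t, mass, 0.5)
print(mass, median_time)
```

For uncensored event times, the masses produced this way coincide with the jumps of the Kaplan-Meier estimator, which is why the weighted quantiles can stand in for the quantiles of the survival-time distribution when some observations are censored.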



## Figures and Tables

Figures(3)  /  Tables(5)
