Cardiovascular disease (CVD) is a leading cause of mortality worldwide, and accurate assessment of CVD risk is essential for prevention and intervention. In recent years, machine learning has advanced considerably in the field of CVD risk prediction. In this context, we propose a novel framework, CVD-OCSCatBoost, designed to predict CVD risk precisely and to assess the contribution of various risk factors. The framework uses Lasso regression for feature selection and incorporates an optimized category boosting tree (CatBoost) model. We further propose the opposition-based learning cuckoo search (OCS) algorithm and integrate it with the CatBoost model to obtain OCSCatBoost, an enhanced classifier intended to offer improved accuracy and efficiency in predicting CVD. To validate the OCSCatBoost algorithm, we compare it extensively with popular alternatives: the particle swarm optimization (PSO) algorithm, the seagull optimization algorithm (SOA), the cuckoo search (CS) algorithm, K-nearest-neighbor classification, decision tree, logistic regression, grid-search support vector machine (SVM), grid-search XGBoost, default CatBoost, and grid-search CatBoost. The experimental results show that the OCSCatBoost model outperforms the other models, with overall accuracy, recall, and AUC values of 73.67%, 72.17%, and 0.8024, respectively. These outcomes highlight the potential of CVD-OCSCatBoost for improving cardiovascular disease risk prediction.
Citation: Zhaobin Qiu, Ying Qiao, Wanyuan Shi, Xiaoqian Liu. A robust framework for enhancing cardiovascular disease risk prediction using an optimized category boosting model[J]. Mathematical Biosciences and Engineering, 2024, 21(2): 2943-2969. doi: 10.3934/mbe.2024131
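The opposition-based learning idea named in the abstract can be illustrated with a minimal sketch. It is not the authors' OCS algorithm: it shows only the standard opposition step, in which a candidate x in [lo, hi] is paired with its opposite lo + hi - x and the better of the two is kept, applied here to initializing a population of candidate solutions ("nests"). The search box, the parameter names, and the toy objective are all assumptions made for illustration; the abstract does not state which quantities OCS searches over.

```python
import random


def opposite(point, bounds):
    """Map each coordinate x in [lo, hi] to its opposite, lo + hi - x."""
    return [lo + hi - x for x, (lo, hi) in zip(point, bounds)]


def obl_init(objective, bounds, pop_size, rng):
    """Opposition-based initialization (illustrative helper, not from the paper).

    Draw pop_size random points, add the opposite of each one, and keep the
    pop_size candidates with the lowest objective value.
    """
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    candidates = population + [opposite(p, bounds) for p in population]
    candidates.sort(key=objective)
    return candidates[:pop_size]


# Hypothetical search box: (learning rate, tree depth). Ranges are assumptions.
bounds = [(0.01, 0.3), (2.0, 10.0)]


def toy_objective(p):
    # Stand-in for a validation loss; a real run would train and score a
    # classifier at this point instead of evaluating a quadratic.
    return (p[0] - 0.1) ** 2 + (p[1] - 6.0) ** 2


rng = random.Random(0)
nests = obl_init(toy_objective, bounds, pop_size=5, rng=rng)
```

Because the candidate pool always contains the original random draws, the best nest after this step is never worse than the best purely random one; a cuckoo search loop would then continue from `nests`.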