As the number of patients with heart failure increases and pathologists remain in short supply, machine learning (ML) has garnered attention for cardiomyopathy diagnosis. However, endomyocardial biopsy specimens are often limited in sample size and therefore require techniques such as feature extraction and dimensionality reduction. This study investigated the effectiveness of texture features as a feature extraction approach for the pathological diagnosis of cardiomyopathy. It also examined which model designs improve generalization performance by applying feature selection (FS) and dimensional compression (DC) to several ML models. The usefulness of texture features was verified by visualizing the inter-class distribution differences and by conducting statistical hypothesis testing. The model designs were evaluated by comparing predictive performance across combinations of FS and DC (each applied or not) and across types of decision boundary. The results indicated that texture features may be effective for the pathological diagnosis of cardiomyopathy. Moreover, when the ratio of features to sample size was high, a multi-step process involving FS and DC improved generalization performance, with the linear-kernel support vector machine achieving the best results. This process was shown to be potentially effective for models with reduced complexity, regardless of whether the decision boundaries were linear, curved, or perpendicular or parallel to the axes. These findings are expected to facilitate the development of an effective cardiomyopathy diagnostic model and its rapid adoption in medical practice.
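The multi-step design summarized above (FS, then DC, then a linear-kernel support vector machine) can be illustrated with a minimal scikit-learn sketch. This is not the authors' implementation: the abstract does not name the FS or DC methods, so the univariate F-test selector, PCA, the values `n_selected=20` and `n_components=2`, and the synthetic data standing in for texture features are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_pipeline(n_selected=20, n_components=2):
    """FS -> DC -> linear SVM. All step choices are illustrative assumptions."""
    return Pipeline([
        ("scale", StandardScaler()),                               # put features on a common scale
        ("fs", SelectKBest(score_func=f_classif, k=n_selected)),   # assumed FS: univariate F-test
        ("dc", PCA(n_components=n_components)),                    # assumed DC: PCA
        ("clf", SVC(kernel="linear", C=1.0)),                      # linear-kernel SVM
    ])


def evaluate(X, y, n_splits=5, seed=0):
    """Stratified k-fold accuracy; FS and DC are refit inside every fold."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    return cross_val_score(build_pipeline(), X, y, cv=cv, scoring="accuracy")


# Synthetic stand-in for a small-sample, many-feature dataset (hypothetical sizes).
X, y = make_classification(
    n_samples=60, n_features=100, n_informative=10, n_redundant=10, random_state=0
)
scores = evaluate(X, y)
print("fold accuracies:", scores)
```

Wrapping FS and DC in the same `Pipeline` as the classifier means both are fitted only on the training portion of each fold, which avoids leaking held-out samples into the selected features when the sample size is small.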
Citation: Masaya Mori, Yuto Omae, Yutaka Koyama, Kazuyuki Hara, Jun Toyotani, Yasuo Okumura, Hiroyuki Hao. Cardiomyopathy diagnosis model from endomyocardial biopsy specimens: Appropriate feature space and class boundary in small sample size data[J]. AIMS Bioengineering, 2025, 12(2): 283-313. doi: 10.3934/bioeng.2025014