We discuss the new paradigm of predictive health intelligence, based on the use of modern deep learning algorithms and big biomedical data, along three dimensions: a) its potential, b) the limitations it encounters, and c) the sense it makes. We conclude by reasoning on the idea that viewing data as the sole source of health knowledge, fully abstracting from human medical reasoning, may affect the scientific credibility of health predictions.
Citation: Marco Roccetti. Predictive health intelligence: Potential, limitations and sense making[J]. Mathematical Biosciences and Engineering, 2023, 20(6): 10459-10463. doi: 10.3934/mbe.2023460
As we have learned from the Covid-19 pandemic, anticipating the dynamics of a disease, worldwide or at the level of a specific country, goes beyond monitoring infections: it means developing an analytical intelligence able to assess, compare and predict the risks of outbreaks and the threats to the global or regional health of populations [1,2,3,4]. With the term predictive intelligence, we refer to a new computing framework that draws upon a sophisticated variety of statistical, mathematical and computational methods, ranging from traditional data mining and optimization techniques up to artificial intelligence (AI) and deep learning [5,6,7,8]. Not only are these techniques useful for analyzing and interpreting historical and current data, but they often become indispensable for understanding the relationships among different criteria and factors, thus guiding decision making in almost every sector. These predictive intelligence technologies, those based on machine and deep learning in particular, have recently taken hold in the health context, especially in those cases when, given a certain emerging health phenomenon: i) huge amounts of sparse health data arrive from unreliable and heterogeneous sources, ii) quick interpretations and rapid responses are required to face the most urgent implications of that phenomenon, and iii) the human ability to make accurate health predictions is impaired by the increasing number of interacting dimensions [9,10,11]. In the following Sections of this Editorial, we discuss this notion of predictive health intelligence, based on the use of modern machine learning algorithms and of big biomedical data, along three dimensions: a) its potential, b) the limitations it encounters, and c) the sense it makes.
Based on the use of big biomedical data, an interdisciplinary subject has recently emerged that combines medicine, computational sciences, biology and mathematics. It primarily uses methods of sub-symbolic artificial intelligence, and large amounts of data, to understand the principles and the physiological mechanisms behind human diseases, providing guidance for disease prediction and for medical diagnosis as well. Deep learning is a bright exemplar in this context. It has overcome the disadvantages of other, more traditional mathematical and computational methods, to the point that it has been used to map the concepts coded in the electronic health records of patients and in clinical images, helping doctors predict outcomes such as the need for hospitalization (or re-hospitalization) and even mortality [12,13]. Moreover, the recent introduction of the so-called attention mechanisms has helped further, enabling deep learning models to focus, within the multitude of medical data, on the information that contributes most to an accurate health prediction. Finally, not only can those AI-based methods account for the health conditions of a given individual, but, by exploiting any available dataset, they are also able to incorporate various key social determinants of health as wider predictors. AI-powered algorithms, in fact, could predict future risks for particular racial, gender or ethnic groups of a population by considering social factors, like education and socioeconomic status, thus extending the reach of risk prediction, prevention and treatment well beyond the perspective of the single individual's biology [14,15].
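To make the idea of an attention mechanism more concrete, the following is a minimal sketch of scaled dot-product attention for a single query, written in plain Python. It is not taken from any of the cited works: the feature names, the key and value vectors and the query are all hypothetical toy values, chosen only to show how relevance scores become weights that let a model emphasize some inputs over others.

```python
import math

def softmax(scores):
    """Turn raw relevance scores into non-negative weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    # One relevance score per input: dot product of the query with each key.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # The context vector is the weighted average of the value vectors.
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context

# Hypothetical toy record: three input features, each with a key and a value vector.
features = ["age", "blood_glucose", "shoe_size"]
keys = [[0.2, 0.1], [1.5, 1.2], [-0.8, -0.5]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [1.0, 1.0]  # stands in for "what matters for this prediction?"

weights, context = attend(query, keys, values)
for name, weight in zip(features, weights):
    print(f"{name}: {weight:.2f}")
```

In this toy setting the feature whose key is best aligned with the query receives the largest weight. In a real model the keys, values and queries are learned from data rather than set by hand.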
The majority of health prediction technologies build on the principles of supervised learning to recognize data patterns and predict events. Unfortunately, a learning algorithm cannot predict events it has never been trained on, either because they have never occurred or because they are completely unexpected. Turning to unsupervised learning has not yet been shown to offer an effective solution to this problem, at least in the field of health predictions. Even more worrying is the case in which the prediction provides a wrong answer with erroneous confidence that it is right. Such a prediction failure may occur frequently if we deal with incomplete or unrepresentative data, data of poor quality or precision, or ambiguous or biased sources of information [16]. From this point of view, electronic medical record data may often be at the root of prediction failures, because they can be flawed for many reasons, including their limited period of validity. Finally, one should never disregard the fact that we are dealing with statistical machines (according to the definition provided by Noam Chomsky) which, after taking in enormous quantities of data and searching for common patterns in them, have become tremendously proficient at generating statistically probable outcomes [17]. Nonetheless, how the probability values in those outputs are balanced, in a way that may favor either false positives or false negatives, remains an open issue on which a misdiagnosis or an overdiagnosis can depend.
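The trade-off between false positives and false negatives can be illustrated with a small sketch. The predicted probabilities and true labels below are hypothetical, invented for illustration only, and `confusion_counts` is a helper name introduced here. The point is simply that the same model output leads to different kinds of errors depending on the decision threshold applied to it.

```python
def confusion_counts(probabilities, labels, threshold):
    """Count outcomes when every probability >= threshold is called positive."""
    tp = fp = tn = fn = 0
    for probability, label in zip(probabilities, labels):
        predicted_positive = probability >= threshold
        if predicted_positive and label == 1:
            tp += 1  # disease present and flagged
        elif predicted_positive and label == 0:
            fp += 1  # healthy but flagged: risk of overdiagnosis
        elif not predicted_positive and label == 0:
            tn += 1  # healthy and not flagged
        else:
            fn += 1  # disease present but missed: risk of misdiagnosis
    return tp, fp, tn, fn

# Hypothetical model outputs for eight patients (1 = disease present).
probabilities = [0.05, 0.20, 0.35, 0.40, 0.55, 0.60, 0.75, 0.90]
labels = [0, 0, 1, 0, 0, 1, 1, 1]

for threshold in (0.3, 0.5, 0.7):
    tp, fp, tn, fn = confusion_counts(probabilities, labels, threshold)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

With these toy values, lowering the threshold removes the missed cases at the price of more false alarms, and raising it does the opposite. Which balance is acceptable is a clinical judgment and not something the data decide on their own.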
With the term sense making we mean the process by which human beings give meaning to a collective experience, with the aim of providing a rationale for what they are doing. From this perspective, we should never forget what machine learning research actually is, especially when it deals with the health of humans. Despite the huge help that the statistics and machine learning cultures are giving towards the goal of predicting health, it would be dangerous to think that health predictions lie in the data alone. Judea Pearl, a famous Turing Award laureate, has been one of the first to argue against this data-fitting ideology, contrasting it with a so-called data-interpreting approach [18]. He warns us against the danger of idolizing the possibility of a perfect prediction obtained simply by taking as input all the data that we can collect. He informs us that a fully synthetic data-centric approach alone cannot rival human knowledge and the balance between its implicit and explicit components. In fact, if we think about the nature of health knowledge only in terms of process-generated data, while abstracting from other fundamental notions, like those of theory or cause-effect relationship, we run the empirical risks that AI mechanisms pose in terms of sampling bias in the input data, or in terms of inaccurate health predictions that arise from statistical outputs [19,20]. In contrast, restoring a balance between human reasoning and the data a given environment generates will provide the means for a better interpretation of reality, and hence for better predictions [21,22].
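One way to see why data alone may mislead is Simpson's paradox, where a comparison computed on pooled data reverses once a relevant clinical factor is taken into account. The sketch below uses illustrative counts that are not taken from this Editorial or its references. The names `outcomes`, `rate` and `pooled_rate` are introduced here only for the example.

```python
# Illustrative (recovered, treated) counts per severity group for two treatments.
outcomes = {
    "treatment_a": {"mild": (81, 87), "severe": (192, 263)},
    "treatment_b": {"mild": (234, 270), "severe": (55, 80)},
}

def rate(recovered, treated):
    """Recovery rate within a single group."""
    return recovered / treated

def pooled_rate(groups):
    """Recovery rate when severity is ignored and all patients are pooled."""
    recovered = sum(r for r, _ in groups.values())
    treated = sum(n for _, n in groups.values())
    return recovered / treated

for name, groups in outcomes.items():
    mild = rate(*groups["mild"])
    severe = rate(*groups["severe"])
    print(f"{name}: mild={mild:.3f}, severe={severe:.3f}, pooled={pooled_rate(groups):.3f}")
```

In these counts treatment A does better within each severity group, yet treatment B looks better on the pooled data, because A was given mostly to severe cases. Deciding which comparison is the meaningful one requires knowing how patients were assigned to treatments, that is, a causal understanding that the numbers alone do not supply.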
This Editorial does not contain any private information on patients. Therefore, ethical approval is not required.
Predictive health intelligence has recently made great strides in providing accurate and credible health predictions, both at the level of single individuals and at the level of groups of individuals, up to that of an entire population. These advances have been favored by the emergence of AI-powered mechanisms (e.g., deep learning) combined with the use of huge amounts of biomedical data available in different forms (electronic records, clinical images and others). In this Editorial, we have discussed this phenomenon from different perspectives. We have concluded our discussion by maintaining that viewing data as the sole source of health knowledge, without any deeper scrutiny by medical experts with clinical skills, may raise serious concerns about the validity of the resulting diagnoses and predictions, thus affecting the scientific credibility of the approach.
The author declares there is no conflict of interest.
[1] P. L. Bokonda, M. Sidibe, N. Souissi, K. Ouazzani-Touhami, Machine learning model for predicting epidemics, Computers, 12 (2023), 54. https://doi.org/10.3390/computers12030054
[2] L. Casini, M. Roccetti, A cross-regional analysis of the COVID-19 spread during the 2020 Italian vacation period: Results from three computational models are compared, Sensors, 20 (2020), 7319. https://doi.org/10.3390/s20247319
[3] R. Cappi, L. Casini, D. Tosi, M. Roccetti, Questioning the seasonality of SARS-COV-2: A Fourier spectral analysis, BMJ Open, 12 (2022), e061602. https://doi.org/10.1136/bmjopen-2022-061602
[4] M. Roccetti, Excess mortality and COVID-19 deaths in Italy: A peak comparison study, Math. Biosci. Eng., 20 (2023), 7042–7055. https://doi.org/10.3934/mbe.2023304
[5] S. Yang, F. Zhu, X. Ling, Q. Liu, P. Zhao, Intelligent health care: Applications of deep learning in computational medicine, Front. Genet., 12 (2021). https://doi.org/10.3389/fgene.2021.607471
[6] L. Ma, F. Zhang, End-to-end predictive intelligence diagnosis in brain tumor using lightweight neural network, Appl. Soft Comput., 111 (2021), 107666. https://doi.org/10.1016/j.asoc.2021.107666
[7] P. Ni, G. Li, P. C. H. Kung, V. Chang, StaResGRU-CNN with CMedLMs: A stacked residual GRU-CNN with pre-trained biomedical language models for predictive intelligence, Appl. Soft Comput., 113 (2021), 107975. https://doi.org/10.1016/j.asoc.2021.107975
[8] N. Dong, M. Zhai, L. Chang, C. Wu, A self-adaptive approach for white blood cell classification towards point-of-care testing, Appl. Soft Comput., 111 (2021), 107709. https://doi.org/10.1016/j.asoc.2021.107709
[9] K. Chong, K. Li, Z. Guo, K. Ja, E. Leung, S. Zhao, et al., Dining-out behavior as a proxy for the superspreading potential of SARS-CoV-2 infections: Modeling analysis, JMIR Public Health Surveill., 9 (2023), e44251. https://doi.org/10.2196/44251
[10] S. P. Philips, Artificial intelligence and predictive algorithms in medicine, Can. Fam. Phys., 68 (2022), 570–572. https://doi.org/10.46747/cfp.6808570
[11] S. Mirri, G. Delnevo, M. Roccetti, Is a COVID-19 second wave possible in Emilia-Romagna (Italy)? Forecasting a future outbreak with particulate pollution and machine learning, Computation, 8 (2020), 74. https://doi.org/10.3390/computation8030074
[12] L. Shen, L. Margolies, J. H. Rothstein, E. Fluder, R. McBride, et al., Deep Learning to Improve Breast Cancer Detection on Screening Mammography, Sci. Rep., 9 (2019), 12495. https://doi.org/10.1038/s41598-019-48995-4
[13] W. Lotter, A. R. Diab, B. Haslam, J. G. Kim, G. Grisot, E. Wu, et al., Robust breast cancer detection in mammography and digital breast tomosynthesis using an annotation-efficient deep learning approach, Nature Med., 27 (2021), 244–249. https://doi.org/10.1038/s41591-020-01174-9
[14] B. Seligman, S. Tuljapurkar, D. Rehkopf, Machine learning approaches to the social determinants of health in the health and retirement study, SSM Popul. Health, 4 (2018), 95–99. https://doi.org/10.1016/j.ssmph.2017.11.008
[15] C. R. Clark, M. J. Ommerborn, K. Moran, K. Brooks, J. Haas, et al., Predicting self-rated health across the life course: Health equity insights from machine learning models, J. Gen. Intern. Med., 36 (2021), 1181–1188. https://doi.org/10.1007/s11606-020-06438-1
[16] M. Roccetti, G. Delnevo, L. Casini, G. Cappiello, Is bigger always better? A controversial journey to the center of machine learning design, with uses and misuses of big data for predicting water meter failures, J. Big Data, 6 (2019). https://doi.org/10.1186/s40537-019-0235-y
[17] N. Chomsky, I. Roberts, J. Watumull, The false promise of ChatGPT, The New York Times, (2023). Available online, March 8, 2023.
[18] J. Pearl, Radical empiricism and machine learning research, J. Causal Infer., 9 (2021), 78–82. https://doi.org/10.1515/jci-2021-0006
[19] M. Roccetti, G. Delnevo, L. Casini, G. Cappiello, Modeling COVID-19 diffusion with intelligent computational techniques is not working, what are we doing wrong?, In: 4th International Conference on Human Interaction and Emerging Technologies: Future Applications, IHIET – AI (2021). https://doi.org/10.1007/978-3-030-74009-2_61
[20] G. Marcus, Hoping for the best as AI evolves, Commun. ACM, 64 (2023), 6–7. https://doi.org/10.1145/3583078
[21] F. Corradini, R. Gorrieri, M. Roccetti, Performance preorder and competitive equivalence, Acta Inform., 34 (1997), 805–835. https://doi.org/10.1007/s002360050107
[22] E. S. Davis, G. Marcus, Computational limits don't fully explain human cognitive limitations, Behav. Brain Sci., 43 (2020), E7. https://doi.org/10.1017/S0140525X19001651