Current diagnostic methods for Parkinson's disease rely primarily on clinical observation and the assessment of clinical signs. However, because early symptoms are mild and overlap with those of other diseases, and because distinct clinical markers are lacking, these early signs are often overlooked or misdiagnosed, which makes the early diagnosis of Parkinson's disease particularly challenging. This study aims to achieve non-invasive and efficient early diagnosis of Parkinson's disease by applying deep learning methods to acoustic signals, and to improve model performance through hyperparameter optimization. All datasets in this study are drawn from publicly available online resources. After multiple acoustic features are extracted, the data are merged into a unified dataset for model training. Among the four mainstream neural networks evaluated, the long short-term memory (LSTM) network performs best, achieving an accuracy of 92.87%, a precision of 95.90%, a recall of 89.57%, and an F1-score of 92.57%. After hyperparameter optimization with a combined Bayesian and chaotic optimization (BOCO) approach, the LSTM model's accuracy improves to 94.22%, precision to 97.79%, recall to 90.52%, and F1-score to 93.97%. The proposed method therefore offers a non-invasive, low-cost, and efficient approach to the diagnosis of Parkinson's disease, with broad application prospects.
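The four metrics reported above are standard functions of a binary confusion matrix. The following is a minimal sketch of how they are usually defined, not the authors' evaluation code. The `Confusion` class, the `classification_metrics` function, and the counts in the usage example are hypothetical and chosen only for illustration; the positive class is assumed to be "Parkinson's disease".

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Binary confusion-matrix counts (hypothetical container for this sketch)."""
    tp: int  # patients correctly flagged as positive
    fp: int  # healthy controls incorrectly flagged as positive
    fn: int  # patients missed by the model
    tn: int  # healthy controls correctly flagged as negative


def classification_metrics(c: Confusion) -> dict:
    """Return accuracy, precision, recall, and F1-score as fractions in [0, 1]."""
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    # Precision: of all positive predictions, the share that is correct.
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    # Recall: of all true patients, the share that the model detects.
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    # F1-score: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Illustrative counts only; these are not taken from the study.
    example = Confusion(tp=90, fp=5, fn=10, tn=95)
    for name, value in classification_metrics(example).items():
        print(f"{name}: {value:.4f}")
```

A precision that is higher than the recall, as in the reported results, corresponds to a model that produces few false positives but misses a somewhat larger share of true cases.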
Citation: Huanshuo Zhang, Lijun Pei. A deep learning approach for Parkinson's disease diagnosis based on acoustic signals: LSTM with Bayesian and chaotic optimization[J]. Big Data and Information Analytics, 2026, 10: 29-52. doi: 10.3934/bdia.2026002