Research article

Decision trees and multi-level ensemble classifiers for neurological diagnostics

  • Received: 10 December 2013; Accepted: 23 June 2014; Published: 30 June 2014
  • Cardiac autonomic neuropathy (CAN) is a well-known complication of diabetes that impairs the regulation of blood pressure and heart rate and increases the risk of cardiac-related mortality in diabetes patients. The neurological diagnostics of CAN progression is an important problem under active investigation. This paper uses data collected as part of the large and unique Diabetes Screening Complications Research Initiative (DiScRi) in Australia, which comprises numerous diabetes-related tests, to classify CAN progression. The paper presents recent experimental investigations of the effectiveness of decision trees, ensemble classifiers and multi-level ensemble classifiers for the neurological diagnostics of CAN. We present the results of experiments comparing the effectiveness of the ADTree, J48, NBTree, RandomTree, REPTree and SimpleCart decision tree classifiers. Our results show that SimpleCart was the most effective in classifying CAN for the DiScRi data set. We also investigated and compared the effectiveness of AdaBoost, Bagging, MultiBoost, Stacking, Decorate, Dagging and Grading, based on Ripple Down Rules, as examples of ensemble classifiers. Further, we investigated the effectiveness of these ensemble methods as a function of the base classifiers, and found that Random Forest performed best as a base classifier, while AdaBoost, Bagging and Decorate achieved the best outcomes as meta-classifiers in this setting. Finally, we investigated whether the best-performing meta-classifiers could enhance performance further within the framework of a multi-level classification paradigm. Experimental results show that the multi-level paradigm performed best when Bagging and Decorate were combined in the construction of a multi-level ensemble classifier.

    Citation: Herbert F. Jelinek, Jemal H. Abawajy, Andrei V. Kelarev, Morshed U. Chowdhury, Andrew Stranieri. Decision trees and multi-level ensemble classifiers for neurological diagnostics[J]. AIMS Medical Science, 2014, 1(1): 1-12. doi: 10.3934/medsci.2014.1.1
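
    As an illustration of the multi-level ensemble construction described in the abstract, the following is a minimal WEKA (Java) sketch in which one meta-classifier is used as the base learner of another: Bagging is wrapped around Decorate, which in turn uses Random Forest as its base classifier. This is a sketch only, not the authors' experimental code; the ARFF file name discri_can.arff, the position of the class attribute, and the availability of the Decorate meta-classifier in the installed WEKA version are assumptions made for the example.

    import java.util.Random;

    import weka.classifiers.Evaluation;
    import weka.classifiers.meta.Bagging;
    import weka.classifiers.meta.Decorate;
    import weka.classifiers.trees.RandomForest;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class MultiLevelEnsembleSketch {
        public static void main(String[] args) throws Exception {
            // Hypothetical ARFF export of the DiScRi data; file name and layout are assumed.
            Instances data = new DataSource("discri_can.arff").getDataSet();
            data.setClassIndex(data.numAttributes() - 1); // CAN class label assumed to be the last attribute

            // Level 1: Decorate ensemble with Random Forest as the base classifier.
            Decorate decorate = new Decorate();
            decorate.setClassifier(new RandomForest());

            // Level 2: Bagging wrapped around the Decorate ensemble,
            // i.e. the Bagging + Decorate combination reported as best in the paper.
            Bagging bagging = new Bagging();
            bagging.setClassifier(decorate);

            // Evaluate the two-level ensemble with 10-fold cross-validation.
            Evaluation eval = new Evaluation(data);
            eval.crossValidateModel(bagging, data, 10, new Random(1));
            System.out.println(eval.toSummaryString("=== Bagging(Decorate(RandomForest)) ===", false));
        }
    }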

  • © 2014 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
