Diagnosis of Heart Disease Using Data Mining Techniques: A Systematic Review of Influential Factors and Outcomes

Elahe Gozali, Sadrieh Hajesmaeel-Gohari, Kamal Khademvatani, Rahime Tajvidi Asr



Introduction: Heart disease is a major public health concern, with millions of deaths reported annually. In recent years, data mining techniques have received attention as tools that aid the diagnosis and prediction of heart disease. This systematic review examines the application of data mining methods to cardiac disease diagnosis in order to identify which types of heart-related disease are diagnosed with these techniques and which data mining methods are most successful.

Material and Methods: This study involved a systematic review of the IEEE, Science Direct, Google Scholar, Web of Science, Scopus and MEDLINE databases from 2008 until April 2023. Original papers that used data mining methods for heart disease diagnosis were included. Non-English papers, papers without full text, animal studies, and other publication types (conference abstracts and letters) were excluded. All retrieved references were screened by title and abstract according to PRISMA, after which the full texts of relevant articles were analyzed. The final sample comprised 47 articles.
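The eligibility criteria above can be read as a simple record filter. The sketch below is illustrative only: the Record fields, the is_eligible helper, and the sample records are hypothetical and do not come from the review, and the search window is assumed to run from the start of 2008 through the end of April 2023.

```python
from dataclasses import dataclass
from datetime import date


# Hypothetical record structure; the field names are assumptions for illustration.
@dataclass
class Record:
    title: str
    published: date
    language: str
    has_full_text: bool
    animal_study: bool
    paper_type: str  # e.g. "original", "conference abstract", "letter"
    uses_data_mining_for_heart_diagnosis: bool


SEARCH_START = date(2008, 1, 1)   # assumed start of the 2008 search window
SEARCH_END = date(2023, 4, 30)    # assumed inclusive end of April 2023


def is_eligible(r: Record) -> bool:
    """Apply the inclusion and exclusion criteria stated in the Methods."""
    return (
        SEARCH_START <= r.published <= SEARCH_END
        and r.language == "English"
        and r.has_full_text
        and not r.animal_study
        and r.paper_type == "original"
        and r.uses_data_mining_for_heart_diagnosis
    )


# Made-up records, used only to exercise the filter.
records = [
    Record("Example A", date(2015, 6, 1), "English", True, False, "original", True),
    Record("Example B", date(2006, 3, 1), "English", True, False, "original", True),
    Record("Example C", date(2020, 9, 1), "English", True, False, "letter", True),
    Record("Example D", date(2019, 2, 1), "English", False, False, "original", True),
]

eligible = [r for r in records if is_eligible(r)]
print([r.title for r in eligible])
```

In this sketch Example B falls outside the date window, Example C is a letter, and Example D lacks full text, so only Example A passes the filter.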

Results: Various classification methods and mining tools have been used to diagnose heart-related disease; among the techniques studied, the genetic neural network method had the highest accuracy. Predicting cardiac disease was the most commonly performed task. The attributes used for classification were identified, including demographic, bio-clinical, personal and exercise-related attributes as well as other features. The findings suggest that data mining methods hold great potential for detecting and preventing heart disease at both the individual and population levels.
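Accuracy, the measure used above to compare techniques, is conventionally the share of correct predictions among all predictions. The sketch below assumes that standard definition; the method names and confusion counts are made up for illustration and are not results from the reviewed studies, which may also differ in their validation schemes.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Return (TP + TN) / (TP + TN + FP + FN), the standard accuracy definition."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("at least one prediction is required")
    return (tp + tn) / total


# Hypothetical confusion counts (TP, TN, FP, FN) for two unnamed classifiers.
results = {
    "method_a": (40, 45, 5, 10),
    "method_b": (42, 48, 2, 8),
}

scores = {name: accuracy(*counts) for name, counts in results.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Accuracy alone can be misleading when diseased and healthy cases are unevenly represented, so comparisons of this kind are usually read alongside sensitivity and specificity.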

Conclusion: The findings have implications for the prevention and treatment of cardiac disease, especially in high-risk individuals. Data mining methods can be widely applied to detect and prevent heart disease on a population scale, and can support decisions on the most suitable treatment for individual patients, helping to prevent death and reduce treatment costs.


Keywords: Data Mining; Heart Disease; Features; Classification; Prediction


DOI: https://doi.org/10.30699/fhi.v13i0.541

