Heart disease prediction using machine learning models
Abstract
Heart disease remains one of the leading causes of death globally, and mortality continues to rise each year. Early detection is critical to reducing this burden, but conventional diagnostic methods are often costly, time-consuming, and reliant on specialist expertise. This study evaluates four machine learning (ML) algorithms, namely Decision Tree (DT), Random Forest (RF), K-Nearest Neighbors (KNN), and Support Vector Machine (SVM), for predicting heart disease from clinical datasets. The methodology comprises data preprocessing, feature selection using the Random Forest algorithm, and performance evaluation using accuracy, precision, recall, and F1-score, reported together with support (the number of samples in each class). Experimental results indicate that KNN achieved the highest accuracy after feature selection, while SVM achieved the highest recall at the cost of lower precision. RF offered the most balanced performance across the metrics, making it a reliable candidate for real-world medical applications. These findings highlight the importance of choosing appropriate algorithms and features when building predictive models. Future research should incorporate larger datasets, apply systematic hyperparameter tuning, and explore deep learning techniques to further improve prediction accuracy.
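The workflow summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the dataset, the preprocessing steps, the feature-selection threshold, or any hyperparameters, so everything below is an assumption made for illustration. A synthetic dataset from scikit-learn's make_classification stands in for the clinical data, features are kept when their Random Forest importance is at or above the median, and KNN and SVM are wrapped with standard scaling.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic stand-in for a clinical dataset (13 features, binary label).
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Feature selection: keep features whose Random Forest importance is at or
# above the median importance (the threshold is an assumption).
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=200, random_state=0),
    threshold="median",
)
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# The four classifiers named in the abstract, with illustrative default settings.
# KNN and SVM are distance/margin based, so their inputs are standardized.
models = {
    "DT": DecisionTreeClassifier(random_state=0),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}

# Support for the positive class: number of positive samples in the test set.
positive_support = int((y_test == 1).sum())

results = {}
for name, model in models.items():
    model.fit(X_train_sel, y_train)
    y_pred = model.predict(X_test_sel)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, y_pred, average="binary", zero_division=0
    )
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "support": positive_support,
    }

for name, metrics in results.items():
    print(name, {k: round(float(v), 3) for k, v in metrics.items()})
```

Scores on synthetic data say nothing about the study's results. The sketch only shows how the selection step and the metrics fit together. Fitting the selector on the training split alone keeps test data out of the feature-selection step.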

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.