Heart Disease Risk Prediction Using Machine Learning: A Data-Driven Approach for Early Diagnosis and Prevention

Irin Akter Liza; Shah Foysal Hossain; Afsana Mahjabin Saima; Sarmin Akter; Rubi Akter; Md Al Amin; Mitu Akter; Ayasha Marzan

doi:10.32996/bjns.2025.5.1.5

Research Article

Authors

Irin Akter Liza, College of Graduate and Professional Studies (CGPS), Trine University, Detroit, Michigan, USA. https://orcid.org/0009-0007-4485-8921
Shah Foysal Hossain, School of IT, Washington University of Science and Technology, Alexandria, Virginia, USA. https://orcid.org/0009-0005-5443-786X
Afsana Mahjabin Saima, Optometry (Faculty of Medicine), University of Chittagong, Chittagong, Bangladesh. https://orcid.org/0009-0007-3535-3841
Sarmin Akter, School of Business, International American University, Los Angeles, California, USA. https://orcid.org/0009-0002-7823-1151
Rubi Akter, Department of Law, Southeast University, Dhaka, Bangladesh. https://orcid.org/0009-0000-0410-5705
Md Al Amin, School of Business, International American University, Los Angeles, California, USA. https://orcid.org/0009-0000-9484-5095
Mitu Akter, Graduate School of International Studies, Ajou University, Yeongtong-gu, Suwon, Korea. https://orcid.org/0009-0005-5883-7035
Ayasha Marzan, Optometry (Faculty of Medicine), University of Chittagong, Chittagong, Bangladesh. https://orcid.org/0009-0004-0113-4630

Abstract

Cardiovascular diseases remain a leading cause of death worldwide and a major challenge to healthcare systems in both developing and developed countries. In the U.S. alone, cardiovascular diseases account for nearly a fifth of all deaths each year, imposing a heavy burden on public and economic resources. The chief aim of this work was to build and rigorously test machine learning models that predict heart disease risk across various populations, based on well-annotated datasets and well-labeled variables such as age, systolic/diastolic blood pressure, cholesterol level, type of chest pain, and electrocardiogram results. We used the publicly accessible Cleveland Heart Disease dataset for this study. The dataset consists of 303 patient records and 14 attributes relevant to cardiovascular health: age, sex, resting blood pressure, serum cholesterol, fasting blood sugar, resting electrocardiographic results, maximum heart rate achieved, exercise-induced angina, and exercise-induced ST depression, among others. The target variable, labeled in the data with five categories, marks the presence or absence of heart disease and was binarized for classification (1 = disease, 0 = no disease). To develop a strong predictive model for identifying people vulnerable to heart disease, three established supervised classification algorithms were adopted: Logistic Regression, Random Forest Classifier, and XGBoost Classifier (Extreme Gradient Boosting). To assess the accuracy and reliability of these models, we applied a set of evaluation metrics, each offering a distinct view of model performance. The XGBoost model achieved high training accuracy, closely matched by its test accuracy, which indicates good generalization to unseen data.
Deploying machine learning-based heart disease risk prediction models in preventive care represents a major step forward for the U.S. public healthcare sector. These models can readily be integrated into the electronic health record systems used in clinics, hospitals, and primary care to automatically flag high-risk individuals from real-time clinical data. Machine learning-driven heart disease prediction models also have transformative value in remote health monitoring and telemedicine, which have become major trends in the U.S., particularly in the aftermath of the COVID-19 pandemic. One of the key strengths of machine learning models is that they can provide customizable risk scores attuned to the diverse demographic profile of the United States. As machine learning and AI technologies continue to mature, there is increasing potential to extend their use beyond heart disease to associated comorbid conditions such as stroke, metabolic syndrome, chronic kidney disease, and type 2 diabetes.
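
The target binarization and evaluation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes the usual convention for the Cleveland data that label 0 means no disease and labels 1 to 4 mean disease, and it assumes accuracy, precision, recall, and F1 as the metrics, since the abstract does not name them. The function names and the sample labels and predictions are hypothetical.

```python
from collections import Counter


def binarize_target(raw_label: int) -> int:
    """Collapse a five-level label (0-4) into 1 = disease, 0 = no disease.

    Assumption: 0 means absence and 1-4 mean presence of heart disease.
    """
    if raw_label not in range(5):
        raise ValueError(f"expected a label in 0..4, got {raw_label!r}")
    return 1 if raw_label > 0 else 0


def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    counts = Counter(zip(y_true, y_pred))
    tp = counts[(1, 1)]  # disease correctly predicted
    tn = counts[(0, 0)]  # no disease correctly predicted
    fp = counts[(0, 1)]  # healthy patient flagged as diseased
    fn = counts[(1, 0)]  # diseased patient missed
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical raw labels and model predictions, for illustration only.
raw_labels = [0, 2, 0, 1, 4, 0, 3, 0]
y_true = [binarize_target(label) for label in raw_labels]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In a screening setting, recall is often watched most closely, because a missed case (a false negative) usually costs more than a false alarm. Comparing the same metrics on the training and test splits is one way to check the kind of generalization the abstract reports.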

Article information

Journal

British Journal of Nursing Studies

Volume (Issue)

5 (1)

DOI

https://doi.org/10.32996/bjns.2025.5.1.5

Pages

38-54

Published

2025-05-12

How to Cite

Liza, I. A., Hossain, S. F., Saima, A. M., Akter, S., Akter, R., Amin, M. A., Akter, M., & Marzan, A. (2025). Heart Disease Risk Prediction Using Machine Learning: A Data-Driven Approach for Early Diagnosis and Prevention. British Journal of Nursing Studies, 5(1), 38-54. https://doi.org/10.32996/bjns.2025.5.1.5

