Machine learning based clinical decision support for heart disease prediction using structured patient data

Saeed Ur Rashid; Md Ismail Hossain Siddiqui; Farhad Uddin Mahmud; Md. Soebur Rahman; Abdul Aziz Kabir; Ramisa Samin Shammah

doi:10.32996/jcsts.2024.6.1.36

Research Article

Authors

Saeed Ur Rashid, Master of Business Administration in Data Analytics, Westcliff University, Irvine, California, USA
Md Ismail Hossain Siddiqui, Master of Science in Engineering/Industrial Management, Westcliff University, Irvine, California, USA
Farhad Uddin Mahmud, Master of Business Administration in Management Information Systems, International American University, Los Angeles, California, USA
Md. Soebur Rahman, Master of Business Administration in Management Information Systems, International American University, Los Angeles, California, USA
Abdul Aziz Kabir, Master of Business Administration in Data Analytics, Westcliff University, Irvine, California, USA
Ramisa Samin Shammah, College of Technology and Engineering, Westcliff University, Irvine, USA

Abstract

Heart disease remains a leading cause of mortality worldwide, which makes reliable and efficient predictive models important for early diagnosis and clinical decision support. This study presents a machine learning framework for heart disease prediction using a structured clinical dataset of 920 patient records with diverse demographic, physiological, and diagnostic attributes. The dataset includes both numerical and categorical features, which require careful preprocessing and encoding to remain compatible with the different model architectures.

A range of classification algorithms is systematically evaluated: Logistic Regression, K-Nearest Neighbors, Decision Tree, Random Forest, Support Vector Machine, Extreme Gradient Boosting, and Light Gradient Boosting Machine. Model performance is assessed using accuracy, precision, recall, F1-score, and the area under the receiver operating characteristic curve (ROC AUC), along with five-fold cross-validation to examine stability and generalization behavior.

The experimental results show consistently high predictive performance across most models, with several approaches achieving near-perfect classification metrics and minimal variation across cross-validation folds. In contrast, K-Nearest Neighbors performs slightly worse, which highlights differences in sensitivity to local data structure. Analysis of feature distributions and pairwise relationships indicates strong separability between classes, driven in particular by clinically relevant variables such as chest pain type, exercise-induced angina, ST depression, and maximum heart rate.

Further evaluation using confusion matrices, receiver operating characteristic curves, and precision–recall curves confirms the robustness of the predictive models and their ability to distinguish reliably between diseased and non-diseased cases. Despite the strong performance, the study acknowledges that dataset-specific characteristics may influence model behavior and emphasizes the importance of external validation before clinical deployment.
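The evaluation metrics named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the function names, the 0.5 decision threshold, and the toy labels and scores are hypothetical and chosen only for illustration. The only figures taken from the abstract are the 920-record dataset size and the five folds.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for 0/1 labels (1 = disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one case of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) for k contiguous folds.
    Records are assumed to have been shuffled beforehand."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0)
             for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Hypothetical toy labels and model scores, not data from the study.
y_true = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1, 0.95, 0.35]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
folds = list(k_fold_indices(920, k=5))  # 920 records, five folds
```

In a five-fold setup, each of the five test partitions is scored with metrics like these, and the spread of the per-fold values is what indicates stability. A real study would more likely rely on an established library than on hand-written metrics; the sketch only makes the definitions concrete.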

Article information

Journal

Journal of Computer Science and Technology Studies

Volume (Issue)

6 (1)

DOI

https://doi.org/10.32996/jcsts.2024.6.1.36

Pages

340-350

Published

2024-02-25

Copyright

Open access

This work is licensed under a Creative Commons Attribution 4.0 International License.
