A Predictive AI Framework for Cardiovascular Disease Screening in the U.S.: Integrating EHR Data with Machine and Deep Learning Models
Abstract
Cardiovascular disease (CVD) is the leading cause of death worldwide, responsible for over 18 million deaths annually. Early and accurate diagnosis is essential to reduce its clinical and economic impact. This study presents an AI-driven framework for the early detection of CVD using structured data from electronic health records (EHRs). The Cleveland Heart Disease dataset was used to train and evaluate multiple supervised machine learning models: Logistic Regression, Random Forest, support vector machine (SVM), k-nearest neighbors (KNN), and XGBoost. Preprocessing included feature normalization, missing-value imputation, and one-hot encoding of categorical variables. Model performance was assessed using precision, recall, F1-score, and ROC-AUC, with XGBoost achieving the highest ROC-AUC of 0.91. To support clinical interpretability, we employed feature importance analysis, ROC curves, and confusion matrices. The results suggest that interpretable AI models can improve diagnostic accuracy, facilitate early intervention, and be integrated into clinical decision support systems for proactive healthcare delivery.
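To make the evaluation metrics named in the abstract concrete, the following is a minimal Python sketch of how precision, recall, F1-score, and ROC-AUC can be computed from binary labels and predicted scores. It is not the study's implementation: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions chosen purely for illustration, and no values here come from the Cleveland dataset.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 means disease present."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for the positive class (0.0 when undefined)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """ROC-AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half a win."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else (0.5 if p == n else 0.0)
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy, made-up labels and scores (hypothetical; not results from the study).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]
# Assumed decision threshold of 0.5 for turning scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, y_score)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f} roc_auc={auc:.4f}")
```

On this toy input, precision, recall, and F1 all equal 0.75 and the ROC-AUC is 0.9375. Precision, recall, and F1 depend on the chosen threshold, whereas ROC-AUC depends only on how the scores rank positive cases against negative ones, which is one reason it is often used to compare models.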