Using Machine Learning to Detect and Predict Insurance Gaps in U.S. Healthcare Systems
Abstract
Insurance coverage is a cornerstone of access to healthcare in the United States, yet millions of individuals remain uninsured or underinsured, which exacerbates health disparities and increases financial strain on the healthcare system. This study investigates the potential of machine learning (ML) to detect and predict insurance gaps by analyzing multi-dimensional datasets comprising socioeconomic, demographic, geographic, and healthcare utilization variables. Using classification algorithms, including Random Forest, XGBoost, and logistic regression, this research develops a predictive framework capable of identifying individuals at risk of losing or lacking coverage. The model is trained on integrated datasets from public health surveys, electronic health records (EHRs), and state-level insurance enrollment databases. To support fairness and interpretability, SHAP (SHapley Additive exPlanations) values are applied to assess feature importance and make the model's predictions more transparent. Additionally, unsupervised clustering methods, such as K-Means and DBSCAN, are employed to uncover latent population segments disproportionately affected by insurance instability. Results indicate that income volatility, employment type, geographic location, and prior healthcare access are among the most significant predictors of insurance gaps. This research contributes a novel approach to health equity by enabling policymakers, insurers, and public health professionals to identify at-risk populations early and implement data-informed interventions aimed at reducing systemic coverage disparities in the U.S. healthcare landscape.
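As a rough illustration of the classification step described above, the following minimal sketch fits a logistic regression (one of the algorithm families named in the abstract) by gradient descent. It is not the authors' implementation: the data are synthetic, and the feature names (`income_volatility`, `is_gig_worker`), the generating coefficients, and all helper functions are hypothetical choices made only for this example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic(n, seed=0):
    """Generate synthetic (features, label) rows. Illustrative only, not real data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        income_volatility = rng.random()                  # hypothetical score in [0, 1)
        is_gig_worker = 1.0 if rng.random() < 0.3 else 0.0  # hypothetical employment flag
        # Assumed data-generating process: both features raise gap risk.
        p_gap = sigmoid(-2.0 + 3.0 * income_volatility + 1.5 * is_gig_worker)
        has_gap = 1 if rng.random() < p_gap else 0
        rows.append(([income_volatility, is_gig_worker], has_gap))
    return rows


def fit_logistic(rows, lr=0.5, epochs=200):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n_features = len(rows[0][0])
    w = [0.0] * n_features
    b = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in rows:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_risk(w, b, x):
    """Return the predicted probability of an insurance gap for feature vector x."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


rows = make_synthetic(1000)
w, b = fit_logistic(rows)
high_risk = predict_risk(w, b, [0.9, 1.0])  # volatile income, gig worker
low_risk = predict_risk(w, b, [0.1, 0.0])   # stable income, not a gig worker
```

In this toy setting the fitted weights play the role that feature-importance measures play in the study: a positive weight means the feature pushes predicted gap risk upward. A real pipeline along the lines the abstract describes would typically rely on established libraries for tree ensembles and SHAP attributions rather than hand-written training loops.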
Article information
Journal
Journal of Computer Science and Technology Studies
Volume (Issue)
7 (7)
Pages
449-458
Published
Copyright
Open access

This work is licensed under a Creative Commons Attribution 4.0 International License.