Research Article

AI-Driven Data Optimization: Automating Cleaning, Feature Engineering, and Augmentation for Superior Machine Learning Performance in Digital Health Care System

Authors

  • MD RUSSEL HOSSAIN Washington University of Science and Technology, Master of Science in Information Technology, USA
  • ESRAT ZAHAN SNIGDHA Washington University of Science and Technology, Master of Science in Information Technology, USA
  • SHOHONI MAHABUB Washington University of Science and Technology, Master of Science in Information Technology, USA

Abstract

Data preparation is a foundational step in machine learning (ML) because it directly influences model accuracy and operational performance. Standard data preparation tasks, including data cleansing, feature engineering, and augmentation, demand considerable manual labor and domain expertise and are susceptible to human error. As data grows in volume and complexity, modern data science needs automated solutions that make ML systems better and easier to build. This study examines how automated artificial intelligence (AI) technologies enhance five key areas of data management, including cleaning and feature engineering, to optimize data preparation. For data cleaning, AI models that combine improved imputation techniques with unsupervised anomaly detection algorithms address noisy data and missing values more efficiently than traditional approaches. For feature engineering, ML techniques automatically select and generate important features, reducing the need for expert personnel and human intervention; AI-powered techniques such as AutoML identify optimal features without manual feature design. For data augmentation, generative models such as GANs and VAEs produce realistic synthetic data, which improves generalization and alleviates data scarcity. Through laboratory experiments and real-world case studies, the investigation demonstrates that these automated AI systems improve accuracy, robustness, and processing speed across a range of datasets. Together, these results show how automation changes data preparation and offers scalable solutions for modern ML applications that minimize the need for human intervention in data processing. The study concludes that AI provides essential approaches for building efficient and effective ML processes that boost model performance while shortening deployment time.
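
The cleaning step described in the abstract (filling missing values, then flagging anomalous readings) can be illustrated with a minimal sketch. This is not the authors' method: it swaps in plain median imputation and a median-absolute-deviation (MAD) outlier rule as simple stand-ins for the AI-based imputation and unsupervised anomaly detection the abstract refers to. The function names, the 3.5 threshold, and the heart-rate values are hypothetical and chosen only for illustration.

```python
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def flag_outliers_mad(values, threshold=3.5):
    """Flag points whose MAD-based modified z-score exceeds the threshold.

    The 3.5 cutoff is an assumed, commonly used default, not a value
    taken from the article.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All deviations are zero at the median; nothing can be flagged.
        return [False] * len(values)
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]


# Hypothetical heart-rate readings with two gaps and one implausible spike.
heart_rates = [72, 75, None, 70, 74, 190, 73, None, 71]

cleaned = impute_median(heart_rates)   # gaps filled with the observed median
outliers = flag_outliers_mad(cleaned)  # True where a reading looks anomalous
```

In this toy series both gaps are filled with 73, the median of the observed readings, and only the reading of 190 is flagged. An automated pipeline of the kind the abstract describes would replace both functions with learned models, but the order of operations (impute, then screen for anomalies) would look similar.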

Article information

Journal

Journal of Computer Science and Technology Studies

Volume (Issue)

5 (4)

Pages

218-228

Published

2023-12-28

How to Cite

Hossain, M. R., Snigdha, E. Z., & Mahabub, S. (2023). AI-Driven Data Optimization: Automating Cleaning, Feature Engineering, and Augmentation for Superior Machine Learning Performance in Digital Health Care System. Journal of Computer Science and Technology Studies, 5(4), 218-228. https://doi.org/10.32996/jcsts.2023.5.4.23

Keywords:

Data preprocessing, Missing data imputation, Synthetic data generation, Health Data Management, Automated machine learning workflows