Research Article

AI-Enhanced Data Engineering for High-Performance Big-Data Processing and Advanced Analytics Optimization

Authors

  • Md Mahmudul Hasan Department of School of Engineering, University of Bridgeport, Bridgeport, CT, United States https://orcid.org/0009-0003-2541-6109
  • Nudrat Fariha Department of School of Business, University of Bridgeport, Bridgeport, CT, United States
  • Tauhid Uddin Mahmood Department of School of Business, University of Bridgeport, Bridgeport, CT, United States
  • Abeera Rahman Department of Business Administration, Widener University, Chester, PA, United States

Abstract

The rapid growth of healthcare data presents an opportunity to derive actionable insights through AI-driven big-data engineering. This paper proposes an integrated framework that uses artificial intelligence (AI) to enhance traditional data engineering pipelines for high-performance big-data processing and advanced analytics optimization. Using the MIMIC-IV dataset together with Apache Spark, Delta Lake, and machine learning algorithms, the study demonstrates how AI augments extract-transform-load (ETL) operations, improves data quality, and accelerates analytics for clinical decision-making. Results indicate a 48% improvement in processing speed and a 31% increase in prediction accuracy for patient outcomes compared to traditional approaches. The framework has implications for predictive healthcare, hospital resource management, and real-time diagnostics.

Article information

Journal

Journal of Computer Science and Technology Studies

Volume (Issue)

7 (8)

Pages

724-732

Published

2025-08-07

How to Cite

Hasan, M. M., Nudrat Fariha, Tauhid Uddin Mahmood, & Abeera Rahman. (2025). AI-Enhanced Data Engineering for High-Performance Big-Data Processing and Advanced Analytics Optimization. Journal of Computer Science and Technology Studies, 7(8), 724-732. https://doi.org/10.32996/jcsts.2025.7.8.84

Keywords:

AI-driven Data Engineering, Big Data Optimization, Advanced Analytics, Apache Spark, Machine Learning, Predictive Healthcare, ETL Pipelines, MIMIC-IV Dataset, Real-time Processing, Data Lakehouse.