Global–Local Attention Modeling for Reliable Multiclass Kidney Disease Classification from CT Images
Abstract
Automated analysis of kidney abnormalities from computed tomography (CT) has gained increasing importance as imaging volumes grow and radiological workloads intensify. Despite recent progress, robust multiclass classification remains challenging due to overlapping visual characteristics, acquisition variability, and class imbalance across renal conditions. In this work, we present an attention-driven framework for multiclass kidney disease classification from CT images. The proposed approach is based on a Vision Transformer (ViT-B/16) architecture that explicitly models global anatomical context while preserving discriminative local renal features. A comprehensive evaluation is conducted against established and modern convolutional models, including ResNet50, DenseNet121, EfficientNetV2-S, and ConvNeXt-Tiny, using a CT kidney dataset containing 12,446 images spanning normal, cyst, stone, and tumor classes. The proposed model achieves the best overall performance, with 98.90% accuracy and a PR-AUC of 99.23%, demonstrating strong class-wise discrimination under imbalance. To promote transparency, gradient- and attention-based explainability techniques are employed to visualize lesion-relevant regions influencing predictions. The results indicate that transformer-based modeling offers an effective and interpretable solution for reliable CT-based kidney disease screening.
Article information
Journal
Journal of Medical and Health Studies
Volume (Issue)
7 (5)
Pages
36-45
Published
Copyright
Copyright (c) 2026. Licensed under CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/).
Open access

This work is licensed under a Creative Commons Attribution 4.0 International License.
