Stacking-Based Ensemble Learning for Prostate Cancer Prediction Using Tabular Clinical Data
Abstract
This study introduces ProstaEnsembleNet, a tabular learning framework that combines diverse predictors for preliminary risk stratification from epidemiological data and routinely collected clinical features. We used a public Kaggle prostate cancer prediction dataset comprising 29 predictors to benchmark classical machine learning models, including Gradient Boosting, XGBoost, LightGBM, Random Forest, Support Vector Machine (SVM), Gaussian Naïve Bayes, and k-nearest neighbors (KNN), as well as deep tabular models such as TabNet and a multilayer perceptron. Preprocessing included categorical encoding and z-score normalization, and class imbalance was addressed with SMOTE applied within each training fold to reduce resampling leakage. Performance was evaluated with stratified 10-fold cross-validation, measuring accuracy, recall, F1-score, balanced error rate, and PR-AUC. Among the individual learners, LightGBM showed strong sensitivity, with a recall of 0.9714 (±0.0051) and an F1-score of 0.9062 (±0.0025). The ProstaEnsembleNet stacking ensemble, which uses a logistic regression meta-learner, achieved the best overall performance, with an accuracy of 0.8390 (±0.0019), a recall of 0.9839 (±0.0025), an F1-score of 0.9122 (±0.0011), and a PR-AUC of 0.8592 (±0.0058). Stacking significantly outperformed voting on F1-score and recall in paired fold-wise testing (Holm-adjusted p-value = 0.008). Ablation analyses confirmed that SMOTE substantially improves minority-sensitive metrics across models and that logistic regression is a stable meta-learner, with negligible losses compared to more complex alternatives. These findings suggest that stacked ensembles are a robust decision-support approach for tabular prostate cancer risk prediction. However, external validation, calibration analysis, and prospective evaluation are crucial before clinical deployment.
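The evaluation protocol summarized above (z-score normalization, oversampling confined to the training fold, a stacking ensemble with a logistic regression meta-learner, and stratified 10-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes scikit-learn and NumPy, uses synthetic data in place of the Kaggle dataset, uses only two base learners rather than the full set, and substitutes a simplified SMOTE-style interpolation for a library SMOTE. The names `smote_like`, `make_stack`, and `cross_validate_stack` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, recall_score
from sklearn.model_selection import StratifiedKFold
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler


def smote_like(X, y, k=5, seed=0):
    """Simplified SMOTE-style oversampling (illustrative stand-in): create
    synthetic minority points by interpolating between a minority sample
    and one of its k nearest minority neighbours until classes balance."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    n_new = counts.max() - counts.min()
    if n_new == 0:
        return X, y
    X_min = X[y == minority]
    k = min(k, len(X_min) - 1)
    # Ask for k + 1 neighbours because each point is its own nearest neighbour.
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    neigh = nn.kneighbors(X_min, return_distance=False)[:, 1:]
    base = rng.integers(0, len(X_min), size=n_new)
    partner = neigh[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    X_new = X_min[base] + gap * (X_min[partner] - X_min[base])
    return np.vstack([X, X_new]), np.concatenate([y, np.full(n_new, minority)])


def make_stack(seed=0):
    """Two tree-based base learners with a logistic regression meta-learner."""
    base = [
        ("gb", GradientBoostingClassifier(n_estimators=30, random_state=seed)),
        ("rf", RandomForestClassifier(n_estimators=30, random_state=seed)),
    ]
    return StackingClassifier(estimators=base,
                              final_estimator=LogisticRegression(max_iter=1000),
                              cv=3)


def cross_validate_stack(X, y, n_splits=10, seed=0):
    """Stratified k-fold CV; scaling and oversampling see training folds only."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    recalls, f1s = [], []
    for train_idx, test_idx in skf.split(X, y):
        scaler = StandardScaler().fit(X[train_idx])       # z-score on train only
        X_tr, y_tr = smote_like(scaler.transform(X[train_idx]), y[train_idx],
                                seed=seed)                # resample train only
        model = make_stack(seed).fit(X_tr, y_tr)
        pred = model.predict(scaler.transform(X[test_idx]))  # test fold untouched
        recalls.append(recall_score(y[test_idx], pred))
        f1s.append(f1_score(y[test_idx], pred))
    return np.array(recalls), np.array(f1s)


# Synthetic, imbalanced stand-in data (not the study's dataset).
X, y = make_classification(n_samples=300, n_features=10, weights=[0.8, 0.2],
                           random_state=0)
recalls, f1s = cross_validate_stack(X, y)
print(f"recall {recalls.mean():.3f} ± {recalls.std():.3f}, "
      f"F1 {f1s.mean():.3f} ± {f1s.std():.3f}")
```

The held-out fold is never resampled or used to fit the scaler, which is the point of within-fold resampling. One simplification to be aware of: the inner cross-validation that `StackingClassifier` uses to build meta-features runs on the already resampled training data, so a stricter variant would also resample inside those inner folds.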
Article information
Journal: Journal of Medical and Health Studies
Volume (Issue): 7 (4)
Pages: 43-56
Published
Copyright: Copyright (c) 2026 https://creativecommons.org/licenses/by/4.0/
Open access

This work is licensed under a Creative Commons Attribution 4.0 International License.
