Can an Ensemble Machine Learning Approach Outperform Traditional Models While Enhancing Accuracy, Fairness, and Interpretability in Clinical Risk Prediction?

Authors: Salma Barkaoui, Head of ML/AI (Qualees) · Mohammed Bennani (Qualees) · Sena Nur Bilgin, Data Scientist (Qualees) · Jérôme Vetillard, VP R&D & CPO (Qualees)

Background and objectives

Machine learning has transformed healthcare predictive analytics, enabling improved disease diagnosis and risk assessment. However, challenges such as class imbalance, limited interpretability, and poor generalisability remain. This study focuses on ensemble learning to enhance model accuracy and robustness in healthcare predictions. The TweenMe suite provides three complementary tools: TweenMe Single for per-model hyperparameter fine-tuning, TweenMe Best for multi-algorithm evaluation and ranking, and TweenMe Ensemble for performance-weighted stacking.

Ensemble architecture and weighting

The ensemble architecture integrates multiple predictive models to reduce bias and variance. Each model is assigned a weight based on its performance, and the final prediction is obtained by weighted aggregation. For classification, the prediction is the argmax of the weighted average of the models' class probability vectors. For regression, it is the weighted average of the individual predictions. Better-performing models therefore influence the outcome in proportion to their weights.
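The aggregation rule described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the TweenMe implementation: the function names are hypothetical, and deriving the weights by normalising validation scores is an assumption, since the text only states that weights are based on performance.

```python
import numpy as np


def performance_weights(scores):
    """Normalise non-negative validation scores so the weights sum to 1.

    Assumption: weights are proportional to a single score per model.
    """
    scores = np.asarray(scores, dtype=float)
    if np.any(scores < 0) or scores.sum() == 0:
        raise ValueError("scores must be non-negative and not all zero")
    return scores / scores.sum()


def ensemble_classify(probas, weights):
    """Argmax of the weighted average of class-probability vectors.

    probas has shape (n_models, n_samples, n_classes).
    Returns the predicted labels and the averaged probabilities.
    """
    probas = np.asarray(probas, dtype=float)
    avg = np.tensordot(weights, probas, axes=1)  # (n_samples, n_classes)
    return avg.argmax(axis=1), avg


def ensemble_regress(preds, weights):
    """Weighted average of predictions of shape (n_models, n_samples)."""
    return np.asarray(weights) @ np.asarray(preds, dtype=float)


# Toy example: three models, one sample, two classes.
weights = performance_weights([0.9, 0.6, 0.5])  # 0.45, 0.30, 0.25
probas = [
    [[0.2, 0.8]],  # strongest model favours class 1
    [[0.7, 0.3]],  # the two weaker models favour class 0
    [[0.6, 0.4]],
]
labels, avg = ensemble_classify(probas, weights)
```

In this toy case two of the three models favour class 0, yet the weighted average favours class 1 because the strongest model carries the largest weight. That is the practical difference between performance-weighted aggregation and a simple majority vote.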

OptiTween: Bayesian optimisation with Optuna

OptiTween implements Bayesian optimisation (Tree-structured Parzen Estimator) to efficiently navigate the hyperparameter space, balancing exploration and exploitation to minimise a composite objective function — a weighted loss aggregating multiple evaluation metrics. The ImprovementTracker component monitors trial performance in real time, enabling adaptive guidance of the optimisation process.
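As an illustration of this kind of search, the sketch below tunes a single hyperparameter with Optuna's TPE sampler against a weighted composite loss. It is not the OptiTween code: the choice of RMSE and MAE, their equal weights, the closed-form ridge model, and the `ImprovementLog` callback (a hypothetical stand-in for a tracker that records each new best trial) are all assumptions made for the example.

```python
import numpy as np


def composite_loss(y_true, y_pred, w_rmse=0.5, w_mae=0.5):
    """Weighted sum of RMSE and MAE (metrics and weights are assumptions)."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mae = float(np.mean(np.abs(err)))
    return w_rmse * rmse + w_mae * mae


def fit_ridge(X, y, alpha):
    """Closed-form ridge regression without intercept."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)


class ImprovementLog:
    """Hypothetical tracker: records (trial number, value) for each new best."""

    def __init__(self):
        self.history = []

    def __call__(self, study, trial):
        # Optuna calls this after every finished trial.
        if study.best_trial.number == trial.number:
            self.history.append((trial.number, trial.value))


def run_study(X_train, y_train, X_val, y_val, n_trials=30, seed=0):
    """Minimise the composite validation loss over the ridge penalty."""
    import optuna  # imported lazily so the helpers above work without it

    def objective(trial):
        alpha = trial.suggest_float("alpha", 1e-4, 1e2, log=True)
        coef = fit_ridge(X_train, y_train, alpha)
        return composite_loss(y_val, X_val @ coef)

    tracker = ImprovementLog()
    study = optuna.create_study(
        direction="minimize",
        sampler=optuna.samplers.TPESampler(seed=seed),
    )
    study.optimize(objective, n_trials=n_trials, callbacks=[tracker])
    return study.best_params, study.best_value, tracker.history
```

Calling `run_study` with a train/validation split returns the best penalty, its composite loss, and the list of trials at which the best value improved. A real setup would search over several hyperparameters and models, and would likely use cross-validation rather than a single split.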

Resampling and class imbalance management

SMOTE (Synthetic Minority Over-sampling Technique) augments the minority class by interpolating between existing minority samples and their nearest neighbours, while Tomek links (pairs of mutual nearest neighbours that belong to different classes) are identified and removed to reduce overlap between classes. This resampling is integrated within the OptiTween framework and was tested on the Iraqi Diabetes dataset, which exhibits marked imbalance across three classes (non-diabetic, pre-diabetic, diabetic).
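To make the two mechanisms concrete, here is a small NumPy sketch of SMOTE-style interpolation and Tomek link detection. It is a teaching aid under simplifying assumptions (Euclidean distance, brute-force neighbour search, hypothetical function names), not the resampling code used in the study.

```python
import numpy as np


def smote_sample(X_min, k=5, n_new=10, rng=None):
    """Create n_new synthetic minority points.

    Each point lies on the segment between a random minority sample
    and one of its k nearest minority neighbours.
    """
    rng = np.random.default_rng(rng)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]
    seeds = rng.integers(0, n, size=n_new)
    picks = neighbours[seeds, rng.integers(0, k, size=n_new)]
    gaps = rng.random((n_new, 1))  # interpolation factor in [0, 1)
    return X_min[seeds] + gaps * (X_min[picks] - X_min[seeds])


def tomek_link_mask(X, y):
    """Mark samples that belong to a Tomek link.

    A Tomek link is a pair of mutual nearest neighbours with different labels.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    nn = dist.argmin(axis=1)
    mutual = nn[nn] == np.arange(len(X))
    return mutual & (y[nn] != y)


# Toy 1-D example: the points at 2.9 and 3.0 straddle the class boundary.
X = np.array([[0.0], [1.0], [2.9], [3.0], [5.0], [6.0]])
y = np.array([0, 0, 0, 1, 1, 1])
mask = tomek_link_mask(X, y)
X_clean, y_clean = X[~mask], y[~mask]  # here both members of each link are dropped
```

Whether both members of a link are removed or only the majority-class member is a design choice that the text does not specify. For real work, a maintained implementation such as `SMOTETomek` in the imbalanced-learn package is preferable to this sketch, and resampling should be applied to training folds only so that synthetic points never leak into validation data.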

Classification and regression results

For classification, TweenMe Ensemble achieves: Accuracy 99.50%, Weighted F1 Score 99.51%, Macro F1 99.09%, Macro Recall 99.80%, Macro Precision 98.41%, ROC AUC 99.97%. These results exceed those previously reported for deep learning models in the literature. For regression on the Parkinson Telemonitoring dataset (motor scores), the Ensemble approach combines the strengths of the individual models (Decision Tree, Random Forest, SVR, Ridge), reducing overfitting in this complex, high-dimensional task.
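Because the classification task is imbalanced, the macro and weighted averages reported above measure different things: macro averaging treats every class equally, while weighted averaging scales each class by its support. The sketch below computes both from a confusion matrix. The matrix is a made-up toy example chosen to show the gap; it is not data from the study.

```python
import numpy as np


def f1_summary(conf):
    """Accuracy, macro F1 and weighted F1 from a confusion matrix.

    conf[i, j] counts samples of true class i predicted as class j.
    """
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf)
    support = conf.sum(axis=1)    # true samples per class
    predicted = conf.sum(axis=0)  # predicted samples per class
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, support, out=np.zeros_like(tp), where=support > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    return {
        "accuracy": float(tp.sum() / conf.sum()),
        "macro_f1": float(f1.mean()),
        "weighted_f1": float(np.average(f1, weights=support)),
    }


# Illustrative matrix only: one large class and two small ones,
# with half of the second class misclassified as the first.
toy_conf = [
    [90, 0, 0],
    [5, 5, 0],
    [0, 0, 10],
]
summary = f1_summary(toy_conf)
```

In this toy case the weak performance on the small second class pulls macro F1 well below weighted F1, which is why reporting both, as the study does, is informative on imbalanced data.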

Conclusion

The TweenMe Digital Twin Bakery framework provides an integrated solution for systematically refining model performance, thereby improving prediction accuracy across complex clinical applications. The optimised ensemble outperforms the individually tuned models, which supports the use of optimised, ensemble-based strategies for high-dimensional and/or imbalanced healthcare datasets.

Key takeaways

TweenMe Digital Twin Bakery tool suite: TweenMe Single (hyperparameter fine-tuning), TweenMe Best (multi-algorithm evaluation and ranking), TweenMe Ensemble (performance-weighted stacking).
OptiTween with Optuna: Bayesian optimisation (Tree-structured Parzen Estimator) navigating the hyperparameter space, with an adaptive real-time ImprovementTracker.
TweenMe Ensemble achieves Accuracy 99.50%, Weighted F1 99.51%, Macro Recall 99.80%, ROC AUC 99.97%, exceeding results reported for deep learning models in the literature.
SMOTE + Tomek Links resampling for class imbalance — tested on the Iraqi Diabetes dataset (3 classes: non-diabetic, pre-diabetic, diabetic).
In regression (Parkinson Telemonitoring, motor scores), the Ensemble approach combines individual model strengths, reducing overfitting and improving accuracy.
The TweenMe framework provides an integrated solution for systematic model selection and fine-tuning in medical machine learning.