Developing a novel framework using optimized active stacking and explainable AI for heart disease prediction


Javed A., Javaid N., Saudagar A. K. J., Pamucar D.

Computer Methods and Programs in Biomedicine, vol.274, 2026 (SCI-Expanded, Scopus)

  • Publication Type: Article / Article
  • Volume: 274
  • Publication Date: 2026
  • Doi Number: 10.1016/j.cmpb.2025.109169
  • Journal Name: Computer Methods and Programs in Biomedicine
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, BIOSIS, Compendex, EMBASE, INSPEC, MEDLINE
  • Keywords: 10-fold cross validation, Bayesian optimization, Entropy-based active learning, Heart disease, Local interpretable model-agnostic explanations, Machine learning, Paired t-test, SHapley additive exPlanations, Stacking model
  • Open Archive Collection: Article
  • Azerbaijan State University of Economics (UNEC) Affiliated: Yes

Abstract

Background and Objective: Heart disease remains the leading cause of death worldwide, so accurate, interpretable, and efficient predictive systems are essential for early diagnosis and timely intervention. Despite significant efforts, existing machine learning approaches suffer from class imbalance, high-dimensional input features, complex hyperparameter tuning, low classification accuracy, and limited labeled data. This article aims to overcome these challenges with a component-based framework for robust heart disease prediction.

Methods: The proposed framework is composed of three components. First, the proximity-weighted random affine shadow sampling technique mitigates class imbalance by generating synthetic samples for the minority class. Second, principal component analysis reduces feature dimensionality while preserving essential information. Third, three novel models are developed: (i) a stacking model that combines k-nearest neighbors and naïve Bayes as base learners with logistic regression as the meta-learner; (ii) an Optimized Stacking Model with Bayesian Optimization (OSM-BO) for systematic hyperparameter tuning; and (iii) an Entropy-based Active Learning Optimized Stacking Model (EAL-OSM), which uses entropy-based sampling to select the most informative samples for annotation. A paired t-test and 10-fold cross-validation are applied for statistical evaluation. Local interpretable model-agnostic explanations (LIME) and SHapley additive exPlanations (SHAP) are employed to support interpretability and decision transparency.

Findings: The stacking model improves accuracy by 3.66%, precision by 6.33%, recall by 3.57%, and Precision–Recall Area Under the Curve (PR-AUC) by 6.90%, while reducing Hamming loss by 12.50%. OSM-BO yields further improvements: 7.32% in accuracy, 7.59% in precision, 11.90% in recall, and 9.20% in PR-AUC, with a 31.25% reduction in Hamming loss. EAL-OSM achieves the best results, with gains of 9.76% in accuracy, 11.39% in precision, 13.10% in recall, and 11.49% in PR-AUC, and a 37.50% reduction in Hamming loss.

Conclusions: The proposed component-based framework shows promising gains in classification performance, statistical robustness, and explainability, offering a clinically practical solution for heart disease prediction.
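
The entropy-based sampling step named in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `shannon_entropy` and `select_most_uncertain`, the probability values, and the query size `k` are all hypothetical, and the abstract does not specify how EAL-OSM computes or thresholds entropy. The sketch assumes only the standard idea behind entropy-based active learning: samples whose predicted class probabilities have the highest Shannon entropy are the ones the model is least certain about, so they are sent for annotation first.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of one predicted class-probability vector."""
    # Terms with p == 0 contribute nothing and are skipped to avoid log(0).
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_most_uncertain(pool_probs, k):
    """Return indices of the k pool samples with the highest predictive entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: shannon_entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical predicted probabilities [P(no disease), P(disease)]
# for four unlabeled patient records (illustrative values only).
pool_probs = [
    [0.95, 0.05],
    [0.52, 0.48],
    [0.70, 0.30],
    [0.10, 0.90],
]

query_indices = select_most_uncertain(pool_probs, k=2)
print(query_indices)  # [1, 2]
```

In this toy pool, records 1 and 2 are selected because their predictions lie closest to an even split. In a full active learning loop, the selected records would be labeled, added to the training set, and the model retrained before the next query round.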