BMC MEDICAL INFORMATICS AND DECISION MAKING, vol.26, no.1, 2026 (SCI-Expanded, Scopus)
Background: Cytomegalovirus (CMV) End-Organ Disease (EOD) remains a significant complication in immunocompromised individuals, particularly transplant recipients and patients undergoing chemotherapy. Accurate prediction of CMV EOD is essential for timely intervention but remains challenging using traditional methods.

Objective: This study aimed to evaluate the diagnostic performance of machine learning (ML) algorithms in predicting CMV EOD using clinical and laboratory data from a diverse cohort of high-risk patients.

Methods: A retrospective analysis was conducted on 227 adult patients with suspected CMV disease from January 2014 to December 2024. CMV EOD was confirmed by histopathologic examination or polymerase chain reaction (PCR) on tissue specimens. Clinical and laboratory data were randomly partitioned into a training set (75%) and a test set (25%) using stratified sampling to preserve the EOD prevalence. Four machine learning models were developed: artificial neural networks (ANN), XGBoost, support vector machines (SVM), and a majority-voting ensemble combining these three. Model performance was evaluated on the test set by calculating area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, precision, and diagnostic odds ratio (DOR).

Results: Among the 227 patients, 21 (9.3%) were solid organ transplant recipients, 100 (44.0%) underwent hematopoietic stem-cell transplantation, and 56 (24.7%) had a history of chemotherapy. Overall, 53 patients (23.3%) were diagnosed with CMV EOD. Independent predictors included low body mass index, elevated CMV PCR viral load, the presence of graft-versus-host disease, and thrombocytopenia. The ANN model achieved the highest sensitivity (67%) and AUROC (0.75), whereas XGBoost and SVM demonstrated superior precision and a diagnostic odds ratio of 22.0. The ensemble majority-voting classifier yielded the best overall performance, with the highest accuracy (90%) and precision of 100%, indicating robust identification of true positive cases.

Conclusion: ML models, particularly ensemble-based approaches, demonstrate high diagnostic accuracy in identifying CMV EOD and may offer valuable support for early risk stratification and clinical decision-making. Integration of these tools into routine practice could improve outcomes through timely, individualized care strategies in vulnerable transplant populations.
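
The abstract does not specify how the majority vote or the test-set metrics were computed, so the following is only a minimal Python sketch of one common reading: hard (label-level) majority voting over three binary classifiers, followed by sensitivity, specificity, precision, accuracy, and DOR from the resulting confusion matrix. The function names `majority_vote` and `diagnostic_metrics` are hypothetical, and the labels and predictions are invented toy values, not data from the study.

```python
def majority_vote(*model_predictions):
    """Hard majority vote over per-model 0/1 predictions (one list per model)."""
    n_models = len(model_predictions)
    # A patient is called positive when more than half of the models say 1.
    return [1 if 2 * sum(votes) > n_models else 0
            for votes in zip(*model_predictions)]


def _ratio(numerator, denominator):
    """Return numerator/denominator, or None when the ratio is undefined."""
    return numerator / denominator if denominator else None


def diagnostic_metrics(y_true, y_pred):
    """Compute basic diagnostic metrics from binary labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "precision": _ratio(tp, tp + fp),
        "accuracy": _ratio(tp + tn, len(pairs)),
        # DOR = (TP * TN) / (FP * FN); undefined when FP or FN is zero.
        "dor": _ratio(tp * tn, fp * fn),
    }


# Toy example (assumed values for illustration only, not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
ann    = [1, 1, 0, 1, 0, 0, 0, 0]
xgb    = [1, 0, 0, 0, 0, 0, 0, 0]
svm    = [1, 1, 0, 0, 0, 1, 0, 0]

ensemble = majority_vote(ann, xgb, svm)
ensemble_metrics = diagnostic_metrics(y_true, ensemble)
ann_metrics = diagnostic_metrics(y_true, ann)
```

In this toy case the vote removes the isolated false positives of the individual models, so the ensemble's precision is 1.0. With no false positives the unadjusted DOR has a zero denominator, which is why the sketch returns `None` there rather than a number; a continuity correction would be one way to obtain a finite value, but whether the authors applied one is not stated.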