Performance of generalized linear model net (GLMN), logistic regression (LR), classification and regression tree (CART), random forest (RF), adaboost (AB), logitboost (LB), support vector machine (SVM), and neural network (NN) obtained with complete case (CC) analysis. The values represent sensitivity, positive predictive value (PPV), negative predictive value (NPV), specificity and accuracy averaged over the values obtained on each resample.

Comparison of Machine Learning Techniques for Prediction of Hospitalization in Heart Failure Patients

Abstract

The present study compares the performance of eight Machine Learning Techniques (MLTs) in predicting hospitalization among patients with heart failure, using data from the Gestione Integrata dello Scompenso Cardiaco (GISC) study. The GISC project is an ongoing study in the region of Puglia, Southern Italy. Patients with a diagnosis of heart failure are enrolled in a long-term assistance program that includes an online platform for data sharing between general practitioners and cardiologists working in hospitals and community health districts. Logistic regression, generalized linear model net (GLMN), classification and regression tree, random forest, adaboost, logitboost, support vector machine, and neural networks were applied to 380 patients enrolled in the GISC study to evaluate the feasibility of predicting hospitalization from each patient's demographic characteristics, medical history, and clinical characteristics. The MLTs were compared both with and without missing data imputation. Overall, models trained without missing data imputation showed higher predictive performance. The GLMN performed better than the other MLTs in predicting hospitalization, with an average accuracy, positive predictive value, and negative predictive value of 81.2%, 87.5%, and 75%, respectively. The present findings suggest that MLTs may represent a promising opportunity to predict hospital admission of heart failure patients by exploiting the health care information generated when such patients come into contact with the health care system.
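The performance measures reported above (sensitivity, specificity, PPV, NPV, and accuracy, averaged over resamples) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the toy labels are hypothetical, and the sketch assumes binary outcomes coded 1 (hospitalized) and 0 (not hospitalized), with each resample supplied as a pair of true and predicted label lists.

```python
from statistics import mean


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded 1 = event, 0 = no event."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Compute the five measures for a single resample."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)

    def ratio(num, den):
        # An undefined ratio (zero denominator) is reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


def average_over_resamples(resamples):
    """Average each measure over a list of (y_true, y_pred) resamples."""
    per_resample = [classification_metrics(t, p) for t, p in resamples]
    return {key: mean(m[key] for m in per_resample) for key in per_resample[0]}


# Made-up labels for illustration only; these are NOT data from the GISC study.
toy_resamples = [
    ([1, 1, 0, 0], [1, 0, 0, 0]),
    ([1, 1, 0, 0], [1, 1, 1, 0]),
]
averaged = average_over_resamples(toy_resamples)
```

Averaging the per-resample values, rather than pooling all predictions into one confusion matrix, matches the description of values "averaged over the values obtained on each resample"; the two approaches generally give different numbers when resamples differ in size or class balance.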

Publication
Journal of Clinical Medicine, (8)