Introduction: Rehospitalization after kidney transplant is costly to patients and healthcare systems and is associated with poor outcomes. Few prediction model studies have examined whether inclusion of clinical notes data from the electronic medical record (EMR) enhances prediction of rehospitalization.
Methods: In a retrospective, observational study of first-time, adult kidney transplant recipients at a large, urban hospital in the Southeastern United States (2005-2015), we examined 30-day rehospitalization (30DR) using structured EMR data and unstructured data (i.e., clinical notes). We applied natural language processing (NLP) methods to eight types of clinical notes and included the extracted terms in predictive models using unsupervised machine-learning approaches. Model accuracy was determined and compared using the areas under the receiver operating characteristic (ROC) and precision-recall (PRC) curves, and model performance was tested with 5-fold cross-validation.
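As a rough illustration of the evaluation described in the Methods, the sketch below computes the area under the ROC curve and a step-wise area under the precision-recall curve (average precision) on each held-out fold of a 5-fold split. It is not the study's code: the function names (`auroc`, `average_precision`, `kfold_indices`, `cross_validate`), the `fit` callback, and the random unstratified split are all assumptions made for illustration, and the abstract does not specify the model, the features, or how the confidence intervals were derived.

```python
import random


def auroc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    # Fraction of positive/negative pairs ranked correctly; ties count one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve.

    Tied scores are ordered arbitrarily (by input order), an assumption
    of this sketch.
    """
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision needs at least one positive")
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / n_pos


def kfold_indices(n, k=5, seed=0):
    """Shuffle row indices and deal them into k disjoint test folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(X, y, fit, k=5, seed=0):
    """Return one (AUROC, average precision) pair per held-out fold.

    `fit(train_X, train_y)` is a hypothetical callback that returns a
    function mapping one row of X to a risk score.
    """
    results = []
    for test_idx in kfold_indices(len(y), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        truth = [y[i] for i in test_idx]
        scores = [predict(X[i]) for i in test_idx]
        results.append((auroc(truth, scores), average_precision(truth, scores)))
    return results
```

With an outcome prevalence near 30%, as reported here, the precision-recall summary is a useful complement to AUROC because its chance-level baseline equals the prevalence rather than 0.5.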
Results: Among 2,060 kidney transplant recipients, 30.7% were readmitted within 30 days. Predictive models using clinical notes did not meaningfully improve performance over previous models using structured data alone (area under the ROC curve [AUROC] 0.6821 [95% CI 0.6644, 0.6998]). Predictive models built solely on clinical notes performed worse than models using both clinical notes and structured data. The top-performing models did not draw on identical data sources, but both included structured data and progress notes (AUROC 0.6902 [95% CI 0.6699, 0.7105]).
Conclusions: Including new features from clinical notes in risk prediction models did not substantially increase predictive accuracy for 30DR for kidney transplant recipients. Future research should consider pooling data from multiple institutions to increase sample size and avoid overfitting models.
Kidney International Reports / 2023