Early identification of patients at risk of advanced diabetic kidney disease (DKD) remains critical for timely intervention and prevention of progression. An analysis published in Frontiers in Endocrinology developed and validated an interpretable machine learning (ML)-based prediction model to identify individuals at higher risk of advanced DKD using routinely available clinical variables.
Variable selection was performed using the least absolute shrinkage and selection operator (LASSO) and recursive feature elimination (RFE), followed by model development across eight ML algorithms. Model performance was assessed using multiple metrics, including area under the curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), F1 score, and Brier score, along with calibration and decision curve analyses. Key predictors included serum creatinine, age, hemoglobin, serum urea, alkaline phosphatase (ALP), uric acid (UA), platelet count, serum osmolality, serum bicarbonate, and monocyte count.
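To make the evaluation metrics listed above concrete, here is a minimal sketch of how they can be computed from true outcomes and predicted probabilities. It is illustrative only: this summary does not describe the study's software, and the function name `classification_metrics` and the 0.5 cutoff are assumptions.

```python
def classification_metrics(y_true, y_prob, threshold=0.5):
    """Threshold-based metrics plus the Brier score for a binary outcome.

    y_true: iterable of 0/1 labels (1 = advanced DKD in this illustration).
    y_prob: predicted probabilities of the positive class.
    threshold: assumed cutoff for calling a prediction positive.
    """
    y_true = list(y_true)
    y_prob = list(y_prob)
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted_positive = p >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 0:
            tn += 1
        else:
            fn += 1

    def ratio(numerator, denominator):
        # Return NaN instead of raising when a denominator is empty.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(y_true)),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "ppv": ratio(tp, tp + fp),           # positive predictive value
        "npv": ratio(tn, tn + fn),           # negative predictive value
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        # Brier score: mean squared error of the predicted probabilities.
        "brier": ratio(sum((p - y) ** 2 for y, p in zip(y_true, y_prob)),
                       len(y_true)),
    }
```

Unlike the other metrics, the Brier score does not depend on the cutoff: it scores the probabilities themselves, and lower values indicate better-calibrated, more accurate predictions.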
Among the evaluated models, logistic regression (LR) demonstrated strong predictive performance, with AUC values of 0.948 (95% confidence interval [CI], 0.920–0.975) in internal validation and 0.898 (95% CI, 0.883–0.913) in external validation. Calibration and decision curve analyses showed good agreement between predicted and observed risks.
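For readers unfamiliar with these quantities, the sketch below shows one common way to compute an AUC and the net benefit that a decision curve plots at a given risk threshold. It is a simplified illustration under stated assumptions, not the study's implementation: the function names `auc` and `net_benefit` are hypothetical, and the pairwise AUC calculation is practical only for small samples.

```python
def auc(y_true, y_score):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(y_true, y_prob, threshold):
    """Net benefit at one risk threshold (0 < threshold < 1):
    TP/n - FP/n * threshold / (1 - threshold).
    A decision curve plots this value across a range of thresholds."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)
```

Net benefit weighs true positives against false positives, with the penalty for a false positive set by the chosen risk threshold. A model is generally considered useful at a threshold when its net benefit exceeds that of treating everyone or treating no one.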
These findings indicate that an interpretable LR-based model can support early risk identification in DKD using accessible clinical parameters, although further validation in broader populations may help clarify its generalizability.