March 1, 2016 | Original Article
Development and Validation of an Algorithm to Identify Nonalcoholic Fatty Liver Disease in the Electronic Medical Record
Authors:
Kathleen E. Corey, Uri Kartoun, Hui Zheng, Stanley Y. Shaw
Published in: Digestive Diseases and Sciences | Issue 3/2016
Abstract
Background and Aims
Nonalcoholic fatty liver disease (NAFLD) is the most common cause of chronic liver disease worldwide. Risk factors for NAFLD progression and liver-related outcomes remain incompletely understood because computational methods for identifying NAFLD are lacking. The present study sought to design a classification algorithm for NAFLD within the electronic medical record (EMR) to support the development of large-scale longitudinal cohorts.
Methods
We implemented feature selection using logistic regression with adaptive LASSO. A training set of 620 patients was randomly selected from the Research Patient Data Registry at Partners Healthcare. To establish the true NAFLD diagnosis, we performed chart reviews and accepted either documentation of a biopsy or a clinical diagnosis of NAFLD. Model variables included laboratory measurements, diagnosis codes, and concepts extracted from medical notes. Variables with P < 0.05 were included in the multivariable analysis.
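The abstract does not say which software or tuning settings were used, so the following is only a minimal sketch of adaptive LASSO for logistic regression, written from scratch in plain Python. An unpenalized fit supplies initial coefficients, and a second fit penalizes each coefficient in inverse proportion to its initial size. The function names, the values of `lam`, `gamma`, and `eps`, and the toy data are all assumptions made for illustration; the P < 0.05 screening step is not shown.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(value, threshold):
    # Proximal operator of the L1 penalty: shrink toward zero, clip at zero.
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_logistic(X, y, penalty_weights, lam, steps=1000, lr=0.5):
    """Proximal gradient descent for weighted-L1 logistic regression.

    penalty_weights[j] scales the penalty on coefficient j. The intercept
    is never penalized. lam=0 gives an unpenalized fit.
    """
    n, p = len(X), len(X[0])
    intercept, coefs = 0.0, [0.0] * p
    for _ in range(steps):
        grad0, grad = 0.0, [0.0] * p
        for row, label in zip(X, y):
            z = intercept + sum(c * v for c, v in zip(coefs, row))
            err = sigmoid(z) - label
            grad0 += err / n
            for j in range(p):
                grad[j] += err * row[j] / n
        intercept -= lr * grad0
        coefs = [soft_threshold(c - lr * g, lr * lam * w)
                 for c, g, w in zip(coefs, grad, penalty_weights)]
    return intercept, coefs

def adaptive_lasso(X, y, lam=0.05, gamma=1.0, eps=1e-6):
    # Stage 1: unpenalized fit gives initial coefficient estimates.
    _, initial = fit_logistic(X, y, [0.0] * len(X[0]), lam=0.0)
    # Stage 2: small initial coefficients receive a large penalty.
    weights = [1.0 / (abs(b) + eps) ** gamma for b in initial]
    return fit_logistic(X, y, weights, lam)

# Toy data (hypothetical): column 0 drives the label, with two labels
# flipped so the classes are not perfectly separable. Column 1 is
# noise that is exactly balanced within every label.
X, y = [], []
for i in range(40):
    signal = (i - 19.5) / 10.0
    label = 1 if signal > 0 else 0
    if i in (17, 22):
        label = 1 - label
    for noise in (1.0, -1.0):
        X.append([signal, noise])
        y.append(label)

intercept, coefs = adaptive_lasso(X, y)
print(coefs)
```

On this toy data the noise column is dropped (its coefficient is shrunk to exactly zero) while the informative column is retained, which is the selection behavior the method is meant to provide.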
Results
The NAFLD classification algorithm included the number of natural language mentions of NAFLD in the EMR, the lifetime number of ICD-9 codes for NAFLD, and triglyceride level. This classification algorithm was superior to an algorithm using ICD-9 data alone, with an AUC of 0.85 versus 0.75 (P < 0.0001), and led to the creation of a new independent cohort of 8458 individuals with a high probability of NAFLD.
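To make the AUC comparison concrete, the sketch below computes the AUC as the probability that a randomly chosen case scores higher than a randomly chosen non-case, with ties counted as one half. The scores and labels are invented toy values, not study data, and the helper name `auc` is an assumption. The significance test behind the reported P value is not reproduced here.

```python
def auc(scores, labels):
    """Rank-based AUC: share of (case, non-case) pairs ordered correctly."""
    cases = [s for s, l in zip(scores, labels) if l == 1]
    controls = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5  # ties count as half a correct ordering
    return wins / (len(cases) * len(controls))

# Hypothetical toy values: 1 = chart-confirmed NAFLD, 0 = no NAFLD.
labels = [1, 1, 1, 0, 0, 0]
icd9_counts = [2, 0, 1, 0, 1, 0]                    # ICD-9 code counts only
combined_scores = [0.9, 0.6, 0.7, 0.2, 0.4, 0.1]    # multi-variable model output

auc_icd9 = auc(icd9_counts, labels)
auc_combined = auc(combined_scores, labels)
print(round(auc_icd9, 3), round(auc_combined, 3))
```

Once such a score is validated, a cohort can be assembled by keeping every patient whose predicted probability exceeds a chosen cutoff; the cutoff used in the study is not stated in the abstract.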
Conclusions
The NAFLD classification algorithm is superior to ICD-9 billing data alone. This approach is simple to develop and deploy, and it can be applied across different institutions to create EMR-based cohorts of individuals with NAFLD.