Introduction
Developing and evaluating clinical machine learning models
Artificial intelligence (AI) and machine learning (ML)
Model training
Supervised and unsupervised learning
Machine learning algorithms
Assessing model performance
Interpretability and explainability
Data-driven advances in diabetes and cardiovascular disease
Targeted screening and risk stratification of prediabetes and diabetes
Computable phenotypes of patients with diabetes
Predicting CVD among patients with diabetes (from diagnosis to risk prediction)
Digital health for diabetes care optimization and personalization through predictive algorithms
AI-driven innovation in clinical trials and evidence generation
Detecting heterogeneous treatment effects
Towards smarter clinical trials
Causal inference from observational data
Key methodological considerations when interpreting ML models
Finding the best algorithm
Geographic and temporal drift in model performance
Promoting explainable AI
Statistical, ethical and regulatory concerns: promoting equitable and safe AI use
Ensuring good research practices
Mitigating bias through AI
Bias type | Definition: “A bias arising from…” | Example |
---|---|---|
Confirmation bias | A tendency to interpret data in a way that confirms our prior beliefs | A machine learning model confirms existing assumptions about certain broad phenotypic groups benefiting from a given therapy, potentially leading to unequal treatment and misdiagnosis |
Sampling bias | Non-random sampling that limits the generalizability of an algorithm | Enrolling only patients who visit a particular clinic or location may yield a sample that does not represent the broader diabetes population |
Algorithmic bias | The design and implementation of an algorithm that systematically discriminates against a given group | A blood pressure monitoring system that may provide consistently inaccurate readings for a given demographic group |
Aggregation bias | Drawing misleading conclusions about individuals from group data | Concluding all patients with type 2 diabetes and hypertension benefit from a given medication without considering individual variations |
Longitudinal data fallacy | Inadequate analysis of temporal data, such as treating a single time point as representative of a long-term trend | Assessing quality of diabetes control and performing long-term risk prognostication using a single laboratory reading rather than long-term patterns |
Implicit bias | Unintentional embedding of underlying biases and prejudices in algorithms | A model trained using records from a specific racial or ethnic group may make inaccurate predictions and disproportionately misclassify individuals from other groups as having a higher or lower risk of diabetic complications, contributing to healthcare disparities |
User interaction bias | Both the user interface and the user's behavior | A digital health app for diabetes management collects only voluntarily entered data, thus not capturing all relevant patient information |
Presentation bias | How information is displayed to users | A patient may miss important information on an app due to the information's placement at the bottom of the screen |
Emergent bias | Longitudinal changes in population, societal habits, norms, and practices over time | An outdated diabetes therapy might persist due to long-standing cultural beliefs |
Evaluation bias | The process of model evaluation | The effectiveness of a novel antihyperglycemic therapy is evaluated against a benchmark that favors a particular demographic |
Population bias | Differences in user characteristics between the training and the intended population | A diabetes management application initially tested among tech-savvy young adults may not adequately address the needs of older adults |