Ensemble-learning approach improves fracture prediction using genomic and phenotypic data
- Open Access
- 07.03.2025
- Original Article
Abstract
Introduction
Materials and methods
Data source and study participants
Ascertainment of major osteoporotic fracture and covariate assessment
Genotyping data
Machine learning models
Data division
Base learners
Tenfold cross-validation
Super Learner
Data analysis
Model evaluation
Genomic data contribution
Subgroup analysis
Results
Participant characteristics
Variable* | Participants with MOF (N = 451) | Participants without MOF (N = 4679) | \(P\)-value** |
|---|---|---|---|
Age (years), mean (SD) | 75.9 (6.19) | 73.6 (5.84) | < 0.001 |
Femoral neck BMD (g/cm²), mean (SD) | 0.71 (0.12) | 0.79 (0.13) | < 0.001 |
Total hip BMD (g/cm²), mean (SD) | 0.87 (0.14) | 0.96 (0.14) | < 0.001 |
Total spine BMD, mean (SD) | 0.99 (0.18) | 1.08 (0.19) | < 0.001 |
Height (cm), mean (SD) | 174 (6.65) | 174 (6.78) | 0.11 |
Weight (kg), mean (SD) | 80.6 (13.1) | 83.4 (13.3) | < 0.001 |
Race (White), n (%) | 420 (93.1%) | 4196 (89.7%) | 0.025 |
Smoking (current), n (%) | 18 (4.0%) | 153 (3.3%) | 0.58 |
Alcohol (yes), n (%) | 230 (51.0%) | 2413 (51.6%) | 0.83 |
Walking speed (m/s), mean (SD) | 1.05 (0.25) | 1.08 (0.28) | 0.017 |
Mobility limitations, mean (SD) | 0.26 (0.58) | 0.19 (0.5) | 0.007 |
Impairment of instrumental activities of daily living (score), mean (SD) | 0.49 (0.99) | 0.36 (0.85) | 0.005 |
Grip strength, mean (SD) | 35.5 (7.9) | 38.7 (8.1) | < 0.001 |
Speed of sound, mean (SD) | 1540 (31.8) | 1560 (35.8) | < 0.001 |
Serum calcium (mg/dL), mean (SD) | 9.28 (0.44) | 9.32 (0.39) | 0.064 |
Model evaluation
Genomic data contribution
Subgroup analysis
Model weights
Base learner model | Optimal weight, all participants (n = 1026) | Optimal weight, White participants (n = 923) | Optimal weight, minority participants (n = 103) |
|---|---|---|---|
Bagging classifier | 0.041 | 0.025 | 0.069 |
Deep neural network | 0.118 | 0.126 | 0.239 |
Gradient boosting | 0.315 | 0.362 | 0.248 |
Gaussian Naïve Bayes | 0.104 | 0.096 | 0.085 |
K-nearest neighbor classifier | 0.098 | 0.105 | 0.128 |
Random forest | 0.235 | 0.228 | 0.182 |
Support vector classification | 0.089 | 0.058 | 0.049 |
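
The weights in each column of the table sum to 1, which is consistent with a Super Learner that forms a convex combination of the base learners' predictions. The following is a minimal Python sketch under that assumption. The function name `super_learner_probability` and the example probabilities are hypothetical, introduced only for illustration; the study's actual meta-learner and software are not stated in this section.

```python
# Optimal weights for all participants, copied from the table above.
WEIGHTS_ALL = {
    "Bagging classifier": 0.041,
    "Deep neural network": 0.118,
    "Gradient boosting": 0.315,
    "Gaussian Naive Bayes": 0.104,
    "K-nearest neighbor classifier": 0.098,
    "Random forest": 0.235,
    "Support vector classification": 0.089,
}


def super_learner_probability(base_probs, weights):
    """Combine base-learner probabilities as a weighted average.

    Assumption: the ensemble prediction is a convex combination of the
    base learners' predicted probabilities. Weights are renormalized so
    that rounding in the published table does not bias the result.
    """
    if set(base_probs) != set(weights):
        raise ValueError("base_probs and weights must name the same learners")
    total = sum(weights.values())
    return sum(weights[name] * base_probs[name] for name in weights) / total


# Hypothetical predicted fracture probabilities for one participant.
# These numbers are illustrative only and do not come from the study.
example_probs = {
    "Bagging classifier": 0.10,
    "Deep neural network": 0.15,
    "Gradient boosting": 0.12,
    "Gaussian Naive Bayes": 0.30,
    "K-nearest neighbor classifier": 0.05,
    "Random forest": 0.11,
    "Support vector classification": 0.08,
}

ensemble_prob = super_learner_probability(example_probs, WEIGHTS_ALL)
print(f"Ensemble predicted probability: {ensemble_prob:.3f}")
```

Because gradient boosting and random forest carry the largest weights in every column, the combined prediction under this scheme would track those two learners most closely.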