Background
Methods
Study population and design
Outcome ascertainment
Candidate predictors
Statistical analysis
Preprocessing
Variable selection and full model development and comparison
Development of simpler models
Results
Characteristic | All (2007–2017) | Discovery set (2007–2016) | Temporal validation set (2017) | P value¹ |
---|---|---|---|---|
Total, n | 30,474 | 27,240 | 3234 | |
Age at childbirth, y, n (%) | | | | < 0.001 |
15–24 | 1624 (5.3) | 1497 (5.5) | 127 (3.9) | |
25–29 | 6057 (19.9) | 5522 (20.3) | 535 (16.5) | |
30–34 | 11,295 (37.1) | 10,018 (36.8) | 1277 (39.5) | |
≥ 35 | 11,498 (37.7) | 10,203 (37.5) | 1295 (40.0) | |
Race/Ethnicity, n (%) | | | | 0.028 |
White | 6866 (22.5) | 6174 (22.7) | 692 (21.4) | |
Hispanic | 8506 (27.9) | 7655 (28.1) | 851 (26.3) | |
African American | 1319 (4.3) | 1174 (4.3) | 145 (4.5) | |
Asian/Pacific Islander | 12,377 (40.6) | 10,990 (40.3) | 1387 (42.9) | |
Other | 1406 (4.6) | 1247 (4.6) | 159 (4.9) | |
Pre-pregnancy body mass index, kg/m², n (%) | | | | < 0.001 |
Underweight | 399 (1.3) | 344 (1.3) | 55 (1.7) | |
Normal | 6850 (22.5) | 6147 (22.6) | 703 (21.7) | |
Overweight | 10,095 (33.1) | 9106 (33.4) | 989 (30.6) | |
Obese | 13,130 (43.1) | 11,643 (42.7) | 1487 (46.0) | |
Median household income, annual, n (%) | | | | < 0.001 |
< $25,000 | 813 (2.7) | 562 (2.1) | 251 (7.8) | |
$25,000–39,999 | 2816 (9.2) | 2495 (9.2) | 321 (9.9) | |
$40,000–59,999 | 7169 (23.5) | 6463 (23.7) | 706 (21.8) | |
$60,000–79,999 | 7796 (25.6) | 7010 (25.7) | 786 (24.3) | |
≥ $80,000 | 11,880 (39.0) | 10,710 (39.3) | 1170 (36.2) | |
Nulliparity, n (%) | 12,419 (40.8) | 11,117 (40.8) | 1302 (40.3) | 0.559 |
Gestational age at delivery, mean (SD), weeks | 38.3 (1.9) | 38.3 (1.9) | 38.2 (1.9) | 0.05 |
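
The P value reported for the age groups can be sanity-checked from the counts in the table above. The test behind the reported P values is not stated in this section, so the sketch below assumes a Pearson chi-square test of independence between age group and dataset (discovery vs. temporal validation). The function names `pearson_chi_square` and `chi2_sf_3dof` are illustrative, not taken from the study.

```python
import math

# Age-group counts copied from the table above (15–24, 25–29, 30–34, ≥ 35).
discovery = [1497, 5522, 10018, 10203]
validation = [127, 535, 1277, 1295]


def pearson_chi_square(rows):
    """Return the Pearson chi-square statistic and degrees of freedom
    for a contingency table given as a list of rows."""
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, dof


def chi2_sf_3dof(x):
    """Upper-tail probability of a chi-square variable with exactly
    3 degrees of freedom (closed form, valid only for dof = 3)."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


stat, dof = pearson_chi_square([discovery, validation])
p_value = chi2_sf_3dof(stat)  # dof is 3 for this 2 x 4 table
print(f"chi-square = {stat:.1f}, dof = {dof}, p = {p_value:.2g}")
```

Under this assumed test the P value falls well below 0.001, which agrees with the "< 0.001" reported for age at childbirth.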
Machine learning prediction methods comparison at different timings
Predictor levelsᵃ | Dataset | CART, AUC (95% CI) | LASSO regression, AUC (95% CI) | Simple super learnerᵇ, AUC (95% CI) | Complex super learnerᶜ, AUC (95% CI) |
---|---|---|---|---|---|
1 | Discovery set | 0.613 (0.603–0.622) | 0.670 (0.663–0.676) | 0.673 (0.667–0.679) | 0.683 (0.676–0.689) |
1 | Validation set | 0.592 (0.567–0.616) | 0.634 (0.615–0.653) | 0.635 (0.615–0.654) | 0.634 (0.615–0.653) |
1, 2 | Discovery set | 0.618 (0.609–0.628) | 0.685 (0.678–0.691) | 0.688 (0.682–0.695) | 0.761 (0.756–0.767) |
1, 2 | Validation set | 0.588 (0.563–0.613) | 0.647 (0.628–0.666) | 0.645 (0.626–0.664) | 0.648 (0.630–0.667) |
1, 2, 3 | Discovery set | 0.740 (0.732–0.748) | 0.785 (0.780–0.791) | 0.790 (0.785–0.796) | 0.869 (0.865–0.873) |
1, 2, 3 | Validation set | 0.703 (0.682–0.724) | 0.750 (0.733–0.767) | 0.749 (0.733–0.766) | 0.754 (0.739–0.772) |
1, 2, 3, 4 | Discovery set | 0.785 (0.777–0.792) | 0.849 (0.845–0.854) | 0.852 (0.848–0.857) | 0.934 (0.931–0.936) |
1, 2, 3, 4 | Validation set | 0.745 (0.722–0.767) | 0.809 (0.794–0.823) | 0.808 (0.794–0.823) | 0.815 (0.800–0.829) |
Most influential features or predictors
Development and calibration of simpler models
Model | Cross-validated AUC (95% CI), discovery set | Cross-validated AUC (95% CI), validation set | Integrated calibration index | Calibrated AUC (95% CI) |
---|---|---|---|---|
Level 1ᵃ | 0.632 (0.623–0.640) | 0.609 (0.587–0.632) | 0.073 | 0.609 (0.587–0.632) |
Levels 1–2ᵇ | 0.648 (0.640–0.656) | 0.621 (0.599–0.643) | 0.075 | 0.621 (0.599–0.643) |
Levels 1–3ᶜ | 0.770 (0.764–0.775) | 0.746 (0.730–0.763) | 0.072 | 0.752 (0.734–0.770) |
Levels 1–4ᵈ | 0.825 (0.820–0.830) | 0.798 (0.783–0.813) | 0.038 | 0.802 (0.786–0.818) |
Model | Equation |
---|---|
Level 1 | −0.856 − 0.005 * history of GDM + 0.741 * BMI obese + 0.800 * prediabetes before pregnancy |
Levels 1–2 | −1.001 + 0.572 * history of GDM + 0.579 * pre-pregnancy obesity + 0.774 * prediabetes before pregnancy + 0.733 * screening valueᵃ − 0.323 * history of GDM * pre-pregnancy obesity − 0.577 * history of GDM * screening valueᵃ + 0.480 * pre-pregnancy obesity * screening valueᵃ |
Levels 1–3 | −4.468 + 0.074 * oral glucose tolerance testᵇ − 0.063 * week of gestational ageᶜ − 1.435 * diagnosis by C–C criteriaᵈ |
Levels 1–4 | −2.645 − 0.810 * meeting glycemic control goalᵉ + 0.167 * number of SMBG tests taken − 0.076 * week of gestational ageᶜ + 0.044 * oral glucose tolerance testᵇ − 0.234 * meeting glycemic control goalᵉ * number of SMBG tests taken |
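
To illustrate how such an equation could be applied, the sketch below evaluates the Levels 1–2 expression above for a single patient. It rests on two assumptions that this section does not confirm: that the expression is the linear predictor (log-odds) of a logistic model, so that a probability is obtained with the inverse-logit function, and that every predictor, including the screening value, is coded as a 0/1 indicator (its exact definition sits in a footnote not reproduced here). The function names are illustrative.

```python
import math


def linear_predictor(history_gdm, obesity, prediabetes, screening):
    """Levels 1–2 expression with coefficients copied from the table above.

    All four inputs are assumed to be 0/1 indicators (an assumption,
    not stated in this section)."""
    return (
        -1.001
        + 0.572 * history_gdm
        + 0.579 * obesity
        + 0.774 * prediabetes
        + 0.733 * screening
        - 0.323 * history_gdm * obesity      # interaction terms
        - 0.577 * history_gdm * screening
        + 0.480 * obesity * screening
    )


def predicted_probability(history_gdm, obesity, prediabetes, screening):
    """Inverse-logit of the linear predictor (assumes a logistic model)."""
    lp = linear_predictor(history_gdm, obesity, prediabetes, screening)
    return 1.0 / (1.0 + math.exp(-lp))


# Hypothetical patient: pre-pregnancy obesity and prediabetes, no other factor.
lp = linear_predictor(history_gdm=0, obesity=1, prediabetes=1, screening=0)
prob = predicted_probability(history_gdm=0, obesity=1, prediabetes=1, screening=0)
print(f"linear predictor = {lp:.3f}, predicted probability = {prob:.3f}")
```

With every indicator set to 0 the linear predictor reduces to the intercept, −1.001. The interaction terms only contribute when both of their indicators equal 1.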