Introduction
Materials and methods
Patient selection and dataset
Subject characteristics | Overall (n = 880) | Training set (614/880, 70%) | Validation set (133/880, 15%) | Test set (133/880, 15%) | External test set (n = 96) |
---|---|---|---|---|---|
Age (years ± SD) | 33.1 ± 19.4 | 34.0 ± 19.9 | 31.8 ± 17.3 | 30.3 ± 18.5 | 31.7 ± 22.1 |
Sex (females) | 395 (44.9%) | 275 (44.8%) | 60 (45.1%) | 60 (45.1%) | 40 (41.7%) |
Malignant subtypes | 213 (24.2%) | 149 (24.3%) | 32 (24.1%) | 32 (24.1%) | 31 (32.3%) |
Chondrosarcoma | 87 (9.8%) | 61 (9.9%) | 13 (9.8%) | 13 (9.8%) | 11 (11.5%) |
Osteosarcoma | 34 (3.8%) | 24 (3.9%) | 5 (3.8%) | 5 (3.8%) | 7 (7.3%) |
Ewing’s sarcoma | 32 (3.6%) | 22 (3.6%) | 5 (3.8%) | 5 (3.8%) | 5 (5.2%) |
Plasma cell myeloma | 28 (3.2%) | 20 (3.3%) | 4 (3.0%) | 4 (3.0%) | 4 (4.2%) |
NHL B cell | 26 (2.9%) | 18 (2.9%) | 4 (3.0%) | 4 (3.0%) | 4 (4.2%) |
Chordoma | 6 (0.6%) | 4 (0.6%) | 1 (0.7%) | 1 (0.7%) | 0 (0%) |
Benign subtypes | 667 (75.8%) | 465 (75.7%) | 101 (75.9%) | 101 (75.9%) | 65 (67.7%) |
Osteochondroma | 228 (25.9%) | 160 (26.1%) | 34 (25.6%) | 34 (25.6%) | 16 (16.7%) |
Enchondroma | 153 (17.4%) | 107 (17.4%) | 23 (17.3%) | 23 (17.3%) | 12 (12.5%) |
Chondroblastoma | 19 (2.2%) | 13 (2.1%) | 3 (2.3%) | 3 (2.3%) | 2 (2.1%) |
Osteoid osteoma | 19 (2.2%) | 13 (2.1%) | 3 (2.3%) | 3 (2.3%) | 1 (1.0%) |
Giant cell tumor of bone | 44 (4.7%) | 30 (4.6%) | 7 (5.0%) | 7 (5.0%) | 6 (6.2%) |
Non-ossifying fibroma | 34 (3.9%) | 24 (3.9%) | 5 (3.8%) | 5 (3.8%) | 7 (7.3%) |
Haemangioma | 12 (1.4%) | 8 (1.3%) | 2 (1.5%) | 2 (1.5%) | 3 (3.1%) |
Aneurysmal bone cyst | 82 (9.3%) | 58 (9.4%) | 12 (9.0%) | 12 (9.0%) | 8 (8.3%) |
Simple bone cyst | 24 (2.7%) | 16 (2.6%) | 4 (3.0%) | 4 (3.0%) | 5 (5.2%) |
Fibrous dysplasia | 52 (5.9%) | 36 (5.9%) | 8 (6.0%) | 8 (6.0%) | 5 (5.2%) |
Location | | | | | |
Torso/head | 118 (13.4%) | 79 (12.9%) | 16 (12.0%) | 23 (17.3%) | 16 (16.7%) |
Upper extremity | 234 (26.6%) | 166 (27.0%) | 28 (21.1%) | 40 (30.0%) | 29 (30.2%) |
Lower extremity | 528 (60.0%) | 369 (60.1%) | 89 (66.9%) | 70 (52.6%) | 51 (53.1%) |
Radiomic feature extraction and machine learning model development
Model evaluation and statistical analysis
Results
Radiomic feature evaluation and demographic information
Feature | AUC | Accuracy | Sensitivity | Specificity |
---|---|---|---|---|
age | 0.49 ± 0.01 | 0.58 ± 0.01 | 0.33 ± 0.09 | 0.67 ± 0.05 |
wavelet-LLH_firstorder_TotalEnergy | 0.64 ± 0.01 | 0.60 ± 0.05 | 0.73 ± 0.03 | 0.55 ± 0.08 |
wavelet-HHH_firstorder_TotalEnergy | 0.65 ± 0.02 | 0.61 ± 0.03 | 0.71 ± 0.04 | 0.57 ± 0.05 |
wavelet-LHH_firstorder_TotalEnergy | 0.63 ± 0.01 | 0.62 ± 0.01 | 0.71 ± 0.04 | 0.59 ± 0.03 |
wavelet-LLH_firstorder_Energy | 0.61 ± 0.01 | 0.58 ± 0.02 | 0.56 ± 0.05 | 0.59 ± 0.05 |
wavelet-HLH_firstorder_TotalEnergy | 0.65 ± 0.01 | 0.55 ± 0.03 | 0.76 ± 0.03 | 0.48 ± 0.05 |
wavelet-HHH_firstorder_Energy | 0.60 ± 0.01 | 0.59 ± 0.03 | 0.56 ± 0.10 | 0.59 ± 0.07 |
original_firstorder_TotalEnergy | 0.59 ± 0.02 | 0.52 ± 0.02 | 0.78 ± 0.03 | 0.42 ± 0.03 |
wavelet-LHL_firstorder_TotalEnergy | 0.60 ± 0.01 | 0.55 ± 0.01 | 0.67 ± 0.03 | 0.51 ± 0.03 |
wavelet-HLL_firstorder_TotalEnergy | 0.57 ± 0.02 | 0.52 ± 0.02 | 0.72 ± 0.03 | 0.45 ± 0.04 |
Machine learning model evaluation of combined radiomics and demographic information
Model architecture | Score | Demographic features | Radiomic features | Combined: radiomic + demographic features |
---|---|---|---|---|
RFC (200 estimators) | AUC | 0.75 | 0.73 | 0.76 |
RFC (200 estimators) | Accuracy | 0.76 (101/133; 95% CI: 0.68, 0.83) | 0.59 (78/133; 95% CI: 0.50, 0.67) | 0.60 (80/133; 95% CI: 0.51, 0.69) |
RFC (200 estimators) | Sensitivity | 0.41 (13/32; 95% CI: 0.24, 0.59) | 0.84 (27/32; 95% CI: 0.67, 0.95) | 0.81 (26/32; 95% CI: 0.64, 0.93) |
RFC (200 estimators) | Specificity | 0.87 (88/101; 95% CI: 0.79, 0.93) | 0.50 (51/101; 95% CI: 0.40, 0.61) | 0.53 (54/101; 95% CI: 0.43, 0.63) |
GNB | AUC | 0.72 | 0.68 | 0.68 |
GNB | Accuracy | 0.44 (59/133; 95% CI: 0.36, 0.53) | 0.76 (101/133; 95% CI: 0.68, 0.83) | 0.76 (101/133; 95% CI: 0.68, 0.83) |
GNB | Sensitivity | 0.92 (29/32; 95% CI: 0.75, 0.98) | 0.44 (14/32; 95% CI: 0.26, 0.62) | 0.44 (14/32; 95% CI: 0.26, 0.62) |
GNB | Specificity | 0.29 (29/101; 95% CI: 0.20, 0.39) | 0.86 (87/101; 95% CI: 0.78, 0.92) | 0.86 (87/101; 95% CI: 0.78, 0.92) |
ANN (200, 100, 100) | AUC | 0.59 | 0.71 | 0.79 |
ANN (200, 100, 100) | Accuracy | 0.67 (89/133; 95% CI: 0.58, 0.75) | 0.75 (100/133; 95% CI: 0.67, 0.82) | 0.80 (107/133; 95% CI: 0.73, 0.87) |
ANN (200, 100, 100) | Sensitivity | 0.38 (12/32; 95% CI: 0.21, 0.56) | 0.66 (21/32; 95% CI: 0.47, 0.81) | 0.75 (24/32; 95% CI: 0.57, 0.89) |
ANN (200, 100, 100) | Specificity | 0.76 (77/101; 95% CI: 0.67, 0.84) | 0.78 (79/101; 95% CI: 0.69, 0.86) | 0.82 (83/101; 95% CI: 0.73, 0.89) |
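
The table above reports each proportion as a point estimate with a 95% confidence interval derived from raw counts (for example, 24/32 for sensitivity). As a minimal sketch of how such intervals can be obtained, the following computes an exact binomial (Clopper-Pearson) interval using only the Python standard library. The choice of the Clopper-Pearson method is an assumption, since this section does not state which interval was used, and the function names `binom_cdf`, `_root`, and `exact_ci` are hypothetical helpers introduced for illustration.

```python
from math import comb


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root(g, iters=60):
    """Bisection on [0, 1] for a function g that rises from negative to positive."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes in n trials (assumed method)."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2 (0 when k == 0).
    lower = 0.0 if k == 0 else _root(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2
    )
    # Upper bound: the p at which P(X <= k) equals alpha / 2 (1 when k == n).
    upper = 1.0 if k == n else _root(
        lambda p: alpha / 2 - binom_cdf(k, n, p)
    )
    return lower, upper


# Counts from the ANN "combined" column of the table above.
rows = [
    ("Sensitivity", 24, 32),    # correctly classified malignant / all malignant
    ("Specificity", 83, 101),   # correctly classified benign / all benign
    ("Accuracy", 24 + 83, 32 + 101),
]
for name, k, n in rows:
    lo, hi = exact_ci(k, n)
    print(f"{name}: {k / n:.2f} ({k}/{n}; 95% CI: {lo:.2f}, {hi:.2f})")
```

With only 32 malignant cases in the test set, the sensitivity intervals are necessarily wide, which is worth keeping in mind when comparing models whose sensitivities differ by a few cases.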
Machine learning model evaluation on the external test set and comparison with radiologists
Examples of correct and incorrect classifications by the best performing model
Discussion
Declarations
Ethics approval
Informed consent
Conflict of interest
Guarantor
Statistics and biometry
Methodology
- retrospective
- diagnostic or prognostic study
- multicentre study