Background
Methods
Search strategy
Machine learning terms were taken from textbooks on machine learning and represent the most commonly used models [10, 11]. Note that the search string contained terms to exclude studies of noncutaneous cancers.
(Association rule OR Automat* detection OR Classification OR Classifier OR Computer-aided OR Computer-assisted OR Computer vision OR Cluster OR Bayes* OR Deep learning OR Decision tree OR Ensemble OR (Feature AND (extraction OR selection)) OR Genetic algorithm OR Inductive logic OR KNN OR K-means OR Machine learning OR Neural network OR Pattern recognition OR Regression OR Random forest OR Support vector) AND (Basal cell carcinoma OR Squamous cell carcinoma) AND (Skin OR Cutaneous OR Dermatolog*) AND (Dermatoscop* OR Dermoscop* OR Image OR Photograph* OR Picture)
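The structure of the search string can be made explicit by assembling it from its four concept groups. The following is a minimal sketch, not the procedure used in the review: the terms are copied from the string above, while the group names (`ML_TERMS`, `CANCER_TERMS`, `SITE_TERMS`, `IMAGE_TERMS`) and the helpers `or_group` and `build_query` are hypothetical and introduced only for illustration.

```python
# Term groups copied from the search string; variable and function names
# are hypothetical labels, not part of the original search strategy.
ML_TERMS = [
    "Association rule", "Automat* detection", "Classification", "Classifier",
    "Computer-aided", "Computer-assisted", "Computer vision", "Cluster",
    "Bayes*", "Deep learning", "Decision tree", "Ensemble",
    "(Feature AND (extraction OR selection))", "Genetic algorithm",
    "Inductive logic", "KNN", "K-means", "Machine learning", "Neural network",
    "Pattern recognition", "Regression", "Random forest", "Support vector",
]
CANCER_TERMS = ["Basal cell carcinoma", "Squamous cell carcinoma"]
SITE_TERMS = ["Skin", "Cutaneous", "Dermatolog*"]
IMAGE_TERMS = ["Dermatoscop*", "Dermoscop*", "Image", "Photograph*", "Picture"]


def or_group(terms):
    """Join the terms of one concept with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(*groups):
    """Require every concept group to match by joining the groups with AND."""
    return " AND ".join(or_group(group) for group in groups)


query = build_query(ML_TERMS, CANCER_TERMS, SITE_TERMS, IMAGE_TERMS)
print(query)
```

Keeping each concept in its own list makes it straightforward to see that a record must match at least one term from every group: a machine learning method, a target cancer, a cutaneous site, and an imaging modality.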
Study selection
Quality assessment
Level of evidence | Definition |
---|---|
1 | Independent, blinded comparison of the classifier with a biopsy-proven standard among a large number of consecutive lesions suspected of being the target condition |
2 | Independent, blinded comparison of the classifier with a biopsy-proven standard among a small number of consecutive lesions suspected of being the target condition |
3 | Independent, blinded comparison of the classifier with a biopsy-proven standard among non-consecutive lesions suspected of being the target condition |
4 | Non-independent comparison of the classifier with a biopsy-proven standard among obvious examples of the target condition plus benign lesions |
5 | Non-independent comparison of the classifier with a standard of uncertain validity |
Results
Source | Target NMSC | Digital Image Modality | Database | Algorithm | Outcome | Quality Ratinga |
---|---|---|---|---|---|---|
Abbas, 2016 [14] | BCC; CSCC | Dermoscopic | 30 BCCs, 30 CSCCs, 300 various other lesions (30% of dataset used for training and 70% for testing)b | ANN | AUROC: 0.92 (BCC), 0.95 (CSCC); Sensitivity: 97% (BCC), 97% (CSCC); Specificity: 68% (BCC), 80% (CSCC) | 5 |
Ballerini, 2012 [43] | BCC; CSCC | Non-dermoscopic | 239 BCCs, 88 CSCCs, 633 benign lesions (3-fold cross-validation) | k-NNc | Accuracy: 89.7%d; Sensitivity: 89.9%d; Specificity: 89.6%d | 3 |
Chang, 2013 [15] | BCC; CSCC | Non-dermoscopic | 110 BCCs, 20 CSCCs, 639 various other lesions (leave-one-out cross-validation)b | MSVM | Sensitivity: 90% (BCC), 80% (CSCC) | 2 |
Cheng, 2011 [34] | BCC | Dermoscopic | 59 BCCs, 152 benign lesions (leave-one-out cross-validation) | ANN | AUROC: 0.967 | 4 |
Cheng, 2012 [36] | BCC | Dermoscopic | 263 BCCs, 226 benign lesions (10-fold cross-validation) | ANNc | AUROC: 0.846 | 4 |
Cheng, 2013 [37] | BCC | Dermoscopic | 35 BCCs, 79 benign lesions (leave-one-out cross-validation) | ANN | AUROC: 0.902 | 4 |
Cheng, 2013 [40] | BCC | Dermoscopic | 350 BCCs, 350 benign lesions (10-fold cross-validation) | ANNc | AUROC: 0.981 | 2 |
Choudhury, 2015 [24] | BCC; CSCC | Dermoscopic; Non-dermoscopic | 359 BCCs, CSCCs, MMs, and AKs (40 from each class randomly chosen for training; remainder used for testing)b | MSVMc | Accuracy: 94.6% (BCC), 92.9% (CSCC) | 5 |
Chuang, 2011 [33] | BCC | Non-dermoscopic | 84 BCCs, 235 benign lesions (3-fold cross-validation) | ANN | Accuracy: 95.0%; Sensitivity: 94.4%; Specificity: 95.2% | 3 |
Dorj, 2018 [25] | BCC; CSCC | Non-dermoscopic | Training: 728 BCCs, 777 CSCCs, 768 MMs, and 712 AKs; Testing: 193 BCCs, 200 CSCCs, 190 MMs, and 185 AKs | ANN | Accuracy: 91.8% (BCC), 95.1% (CSCC); Sensitivity: 97.7% (BCC), 96.9% (CSCC); Specificity: 86.7% (BCC), 94.1% (CSCC) | 5 |
Esteva, 2017 [16] | BCC; CSCC | Dermoscopic; Non-dermoscopic | Training: 127463 various lesions (9-fold cross-validation); Testing: 450 BCCs and CSCCs, 257 SKs | ANN | AUROC: 0.96 | 3 |
Ferris, 2015 [17] | BCC; CSCC | Dermoscopic | 11 BCCs, 3 CSCCs, 39 MMs, 120 benign lesions (half used for training and half for testing) | Decision forest classifier | Sensitivity: 78.6% | 2 |
Fujisawa, 2018 [18] | BCC; CSCC | Non-dermoscopic | Training: 974 BCCs, 840 CSCCs, 3053 various other lesions; Testing: 249 BCCs, 189 CSCCs, 704 various other lesionsb | ANN | Sensitivity: 80.3% (BCC), 82.5% (CSCC) | 3 |
Guvenc, 2013 [47] | BCC | Dermoscopic | 68 BCCs, 131 benign lesions (no cross-validation) | Logistic regression | Accuracy: 96.5%; AUROC: 0.988 | 4 |
Han, 2018 [23] | BCC; CSCC | Non-dermoscopic | Training: 19398 various lesions; Testing: 499 BCCs, 211 CSCCs, 2018 various other lesionsb,e | ANN | AUROC: 0.96 (BCC), 0.91 (CSCC); Sensitivity: 88.8% (BCC), 90.2% (CSCC); Specificity: 91.7% (BCC), 80.0% (CSCC) | 3 |
Immagulate, 2015 [31] | BCC; CSCC | Non-dermoscopic | 100 BCCs, 100 CSCCs, 100 AKs, 100 SKs, 100 nevi (10-fold cross-validation) | MSVMc | Accuracy: 93% | 5 |
Kefel, 2012 [49] | BCC | Dermoscopic | 49 BCCs, 153 benign lesions (leave-one-out cross-validation) | ANN | AUROC: 0.925 | 4 |
Kefel, 2016 [38] | BCC | Dermoscopic | Training: 100 BCCs, 254 benign lesions; Testing: 304 BCCs, 720 benign lesions | Logistic regression | AUROC: 0.878 | 2 |
Kharazmi, 2011 [48] | BCC | Dermoscopic | 299 BCCs, 360 benign lesions (no cross-validation) | Random forest classifier | AUROC: 0.903 | 4 |
Kharazmi, 2016 [50] | BCC | Dermoscopic | 299 BCCs, 360 benign lesions (no cross-validation) | Random forest classifier | AUROC: 0.965 | 4 |
Kharazmi, 2017 [51] | BCC | Dermoscopic | Training: 149 BCCs, 300 benign lesions; Testing: 150 BCCs, 300 benign lesions | ANN | AUROC: 0.911; Sensitivity: 85.3%; Specificity: 94.0% | 3 |
Kharazmi, 2018 [52] | BCC | Dermoscopic | 295 BCCs, 369 benign lesions (10-fold cross-validation) | Random forest classifier | AUROC: 0.832; Sensitivity: 74.9%; Specificity: 77.8% | 3 |
Lee, 2018 [29] | BCC | Non-dermoscopic | Training: 463 BCCs, 1914 various lesions; Testing: 51 BCCs, 950 various lesionsb | ANN | Sensitivity: 91% | 3 |
Maurya, 2014 [19] | BCC; CSCC | Dermoscopic; Non-dermoscopic | 84 BCCs, 101 CSCCs, 77 MMs, 101 AKs (75 from each class used for training; remainder used for testing)b | MSVM | Accuracy: 83.3% (BCC), 84.1% (CSCC) | 5 |
Mishra, 2017 [39] | BCC | Dermoscopic | 305 BCCs, 718 benign lesions (leave-one-out cross-validation) | Logistic regression | Accuracy: 72%f | 3 |
Møllersen, 2015 [30] | BCC; CSCC | Dermoscopic | Training: 37 MMs, 169 various lesionsg; Testing: 71 BCCs, 7 CSCCs, 799 various lesionsb | Hybrid model of linear and quadratic classifiersc | Sensitivity: 100%; Specificity: 12% | 2 |
Shakya, 2012 [41] | CSCC | Dermoscopic | 53 CSCCs, 53 SKs (no cross-validation) | Logistic regression | AUROC: 0.991 | 4 |
Shimizu, 2014 [20] | BCC | Dermoscopic | 69 BCCs, 105 MMs, 790 benign lesions (10-fold cross-validation)b | Layered model of linear classifiersc | Sensitivity: 82.6% | 3 |
Shoieb, 2016 [26] | BCC | Non-dermoscopic | Training: 84 NMSC, 119 MMs; Testing: 64 BCCs, 72 MMs, 74 eczema, 31 impetigo | MSVM | Accuracy: 96.2%; Specificity: 96.0%; Sensitivity: 88.9% | 5 |
Stoecker, 2009 [35] | BCC | Dermoscopic | 42 BCCs, 168 various lesions (leave-one-out cross-validation)b | ANN | AUROC: 0.951 | 2 |
Sumithra, 2015 [21] | CSCC | Non-dermoscopic | 31 CSCCs, 31 MMs, 33 SKs, 26 bullae, 20 shingles (70% used for training; remainder used for testing)b | Hybrid model of MSVM and k-NN classifiersc | F-measure: 0.581 | 5 |
Upadhyay, 2018 [27] | BCC; CSCC | Non-dermoscopic | 239 BCCs, 88 CSCCs, 973 various lesions (24 from each class used for training; remainder used for testing)b | ANN | Accuracy: 96.6% (BCC), 81.2% (CSCC); Sensitivity: 96.8% (BCC), 80.5% (CSCC) | 3 |
Wahab, 2003 [32] | BCC | Non-dermoscopic | 54 BCCs, 54 DLE, 54 AV (34 from each class used for training; remainder used for testing) | ANN | Sensitivity: 90% | 5 |
Wahba, 2017 [22] | BCC | Dermoscopic | 29 BCCs, 27 nevi (46 total used for training and 10 for testing) | MSVM | Accuracy: 100%; Sensitivity: 100%; Specificity: 100% | 5 |
Wahba, 2018 [42] | BCC | Dermoscopic | 300 BCCs, 300 MMs, 300 nevi, 300 SKs (fivefold cross-validation)b | MSVM | AUROC: Sensitivity: 100%; Specificity: 100% | 3 |
Yap, 2018 [28] | BCC | Dermoscopic; Non-dermoscopic | 647 BCCs, 2270 various lesions (fivefold cross-validation)b | ANN | Accuracy: 91.8%; Sensitivity: 90.6%; Specificity: 92.3% | 3 |
Zhang, 2017 [44] | BCC | Dermoscopic | 132 BCCs, 132 nevi, 132 SKs, 132 psoriasis (80% used for training; remainder used for testing) | ANN | Accuracy: 92.4%d; Sensitivity: 85%d; Specificity: 94.8%d | 3 |
Zhang, 2018 [45] | BCC | Dermoscopic | 132 BCCs, 132 nevi, 132 SKs, 132 psoriasis (10-fold cross-validation) | ANN | Accuracy: 94.3%d; Sensitivity: 88.2%d; Specificity: 96.1%d | 3 |
Zhou, 2017 [46] | BCC | Dermoscopic | Training: 154 BCCs, 10,262 benign lesions; Testing: 50 BCCs, 1100 benign lesions | ANN | Accuracy: 96.8%d; Sensitivity: 38%; Specificity: 99.5%d | 3 |
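The outcome measures in the table are related through the confusion matrix, and accuracy can look high even when sensitivity is low if benign lesions dominate the test set. The sketch below illustrates this with the standard definitions. The counts are hypothetical: they were chosen only to be consistent with the test-set sizes (50 BCCs, 1100 benign lesions) and the rounded percentages in the Zhou, 2017 row, and are not reported in the source. The function name `diagnostic_metrics` is likewise an assumption.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # malignant lesions correctly flagged
    specificity = tn / (tn + fp)                 # benign lesions correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # all correct calls over all lesions
    return sensitivity, specificity, accuracy


# Hypothetical counts (not reported in the source): 50 malignant and
# 1100 benign test lesions, chosen to match the rounded table values.
sens, spec, acc = diagnostic_metrics(tp=19, fn=31, tn=1094, fp=6)
print(f"Sensitivity: {sens:.1%}; Specificity: {spec:.1%}; Accuracy: {acc:.1%}")
# prints "Sensitivity: 38.0%; Specificity: 99.5%; Accuracy: 96.8%"
```

Under these assumed counts, 31 of 50 malignant lesions are missed, yet accuracy stays near 97% because the 1100 benign lesions outweigh them. This is why sensitivity and specificity are more informative than accuracy alone when classes are imbalanced.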
Skin lesion databases
Methods of feature extraction
Methods of classification
Source | Risk of Bias: Patient Selection | Risk of Bias: Index Test | Risk of Bias: Reference Standard | Risk of Bias: Flow and Timing | Applicability Concerns: Patient Selection | Applicability Concerns: Index Test | Applicability Concerns: Reference Standard |
---|---|---|---|---|---|---|---|
Abbas, 2016 [14] | High | Low | High | Low | Low | High | High |
Ballerini, 2012 [43] | High | Low | Low | Low | Low | Low | Low |
Chang, 2013 [15] | Low | Low | Low | Low | Low | Low | Low |
Cheng, 2011 [34] | Low | Low | Low | Low | High | High | Low |
Cheng, 2012 [36] | Low | Low | Low | Low | High | High | Low |
Cheng, 2013 [37] | Low | Low | Low | Low | High | High | Low |
Cheng, 2013 [40] | Unclear | Low | Low | Low | Low | Low | Low |
Choudhury, 2015 [24] | High | Low | High | Low | High | High | High |
Chuang, 2011 [33] | High | Low | Low | Low | Low | Low | Low |
Dorj, 2018 [25] | High | Low | High | Low | Low | High | High |
Esteva, 2017 [16] | High | Low | Low | Low | Low | Low | Low |
Ferris, 2015 [17] | Low | Low | Low | Low | Low | High | Low |
Fujisawa, 2018 [18] | High | Low | Low | Low | Low | High | Low |
Guvenc, 2013 [47] | High | High | Low | Low | High | High | Low |
Han, 2018 [23] | High | Low | Low | Low | Low | High | Low |
Immagulate, 2015 [31] | High | Low | High | Low | Low | High | High |
Kefel, 2012 [49] | High | Low | Low | Low | High | High | Low |
Kefel, 2016 [38] | Low | Low | Low | Low | Low | Low | Low |
Kharazmi, 2011 [48] | High | High | Low | Low | High | High | Low |
Kharazmi, 2016 [50] | High | High | Low | Low | High | High | Low |
Kharazmi, 2017 [51] | High | Low | Low | Low | Low | Low | Low |
Kharazmi, 2018 [52] | High | Low | Low | Low | Low | Low | Low |
Lee, 2018 [29] | High | Low | Low | Low | Low | High | Low |
Maurya, 2014 [19] | High | Low | High | Low | High | High | High |
Mishra, 2017 [39] | Unclear | Low | Low | Low | Low | Low | Low |
Møllersen, 2015 [30] | Low | Low | Low | Low | Low | Low | Low |
Shakya, 2012 [41] | High | High | Low | Low | High | High | Low |
Shimizu, 2014 [20] | High | Low | Low | Low | Low | High | Low |
Shoieb, 2016 [26] | High | Low | High | Low | Low | High | High |
Stoecker, 2009 [35] | Low | Low | Low | Low | Low | High | Low |
Sumithra, 2015 [21] | High | Low | High | Low | High | High | High |
Upadhyay, 2018 [27] | High | Low | Low | Low | Low | Low | Low |
Wahab, 2003 [32] | High | Low | High | Low | Low | High | High |
Wahba, 2017 [22] | High | Low | High | Low | Low | High | High |
Wahba, 2018 [42] | High | Low | Low | Low | Low | Low | Low |
Yap, 2018 [28] | High | Low | Low | Low | Low | High | Low |
Zhang, 2017 [44] | High | Low | Low | Low | Low | Low | Low |
Zhang, 2018 [45] | High | Low | Low | Low | Low | Low | Low |
Zhou, 2017 [46] | High | Low | Low | Low | Low | Low | Low |