Background
Methods
Inclusion criteria and exclusion criteria
Information sources and search
Database | Keywords | Results | Date |
---|---|---|---|
Medline via PubMed | ((artificial intelligence [MeSH]) OR “artificial intelligence” OR (machine learning [MeSH]) OR “machine learning” OR (deep learning [MeSH]) OR “deep learning” OR “neural network” OR “computer vision”) AND (“oral cancer” OR “oral squamous cell carcinoma” OR “oral potentially malignant disorder” OR “oral precancerous” OR (mouth neoplasms [MeSH])) | 316 | 14 June 2023 |
Google Scholar | allintitle:(“artificial intelligence” OR “machine learning” OR “deep learning” OR “neural network” OR “computer vision”) AND (“oral cancer” OR “oral squamous cell carcinoma” OR “oral potentially malignant disorders” OR “oral precancerous”) | 112 | 14 June 2023 |
Scopus | (“artificial intelligence” OR “machine learning” OR “deep learning” OR “neural network” OR “computer vision”) AND (“oral cancer” OR “oral squamous cell carcinoma” OR “oral potentially malignant disorder” OR “oral precancerous”) | 477 | 14 June 2023 |
Study selection
Data collection and extraction
Risk of bias and applicability
Statistical analysis
Results
Study selection and study characteristics
Characteristics of relevant studies
No | Author, Year (Ref) | Country | Data Modality (type of data) | Dataset Size (Train/ Valid/Test) | Labeling Procedure | Augmentations | Deep learning algorithms | Hyperparameters | Hardware | Performance measures | Outcome |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | Aubreville M. et al., 2017 [19] | Germany | Confocal Laser Endomicroscopy images (OSCC) | 7894 images | N/A | arbitrarily, randomly rotated copies | LeNet-5 with Transfer learning | 3000 epochs Learning rate = 0.01 Optimizer: Adam | N/A | Accuracy Sensitivity Specificity AUC | 88.3% 0.87 0.9 0.96 |
2 | Ariji Y. et al., 2018 [20] | Japan | CT image of cervical lymph node (OSCC) | 441 images | Annotated by a radiologist | altering the brightness, contrast, rotation, and sharpness | AlexNet | 150 epochs | Nvidia GeForce GTX GPU workstation (Nvidia Corp., Santa Clara, CA, USA) with 11 GB of memory | Accuracy Sensitivity Specificity PPV NPV AUC | 78.2% 0.75 0.81 79.9% 77.1% 0.80 |
3 | Xu S. et al., 2019 [21] | China | CT images (Oral cancer) | 7000 images | Annotated by oral oncologist and a radiologist. | translational rotation and mirroring | LeNet-5 | Learning rate = 0.1–0.01 | N/A | Accuracy Sensitivity Specificity AUC | 75.4% 0.82 0.74 79.6% |
4 | Ariji Y. et al., 2019 [22] | Japan | CT images (OSCC) | 703 images (80% training and 20% test dataset) | Annotated by a radiologist | N/A | AlexNet | 300 epochs | GeForce GTX 1080 Ti, NVIDIA with 11 GB of GPU, 128 GB of memory, and the open-source operating system Ubuntu OS v. 16.04.2 | Accuracy Sensitivity Specificity PPV NPV | 84.0% 0.67 0.9 69.2% 89.0% |
5 | Panigrahi S., Swarnkar T., 2019 [23] | India | Histopathological images (Malignant, benign) | 386 images | N/A | rotating, inverting, and flipping | CNN | 100 epochs | Ubuntu 16.04, accelerated by a graphics processing unit (NVIDIA GeForce GTX 1080 Ti with 4X 32 GB of memory) | Accuracy | 96.8% |
6 | Jeyaraj P.R. et al., 2019 [24] | India | Hyperspectral images (Oral cancer) | 2400 images | N/A | N/A | ResNet | Momentum rate = 0.1 Learning rate = 0.5 Dropout rate = 0.25 Batch size = 75 | Intel Xeon processors, 5.2 GHz and a GPU - NVIDIA series | Accuracy Sensitivity Specificity | 94.8% 0.98 0.97 |
7 | Kiruthika S., Rahmath Nisha S., 2020 [25] | India | Histopathological images (OSCC) | 1224 images | N/A | N/A | CNN | N/A | N/A | Sensitivity Specificity Precision Recall | 0.99 0.94 94.6% 99.5% |
8 | Ramalingam A. et al., 2020 [26] | India | Histopathological images (OSCC) | 350 images (275 training and 75 test images) | N/A | N/A | - Inception-v3 - ResNet50 | N/A | N/A | Accuracy | 92.1% |
9 | Chinnaiyan R. et al., 2020 [27] | India | Histopathological images (OSCC) | 696 images | N/A | N/A | CNN with Transfer learning | 5 or more epochs | N/A | Precision Recall F1-score | 92.0% 89.0% 91.0% |
10 | Heidari A. E. et al., 2020 [28] | USA | Optical coherence tomography (OSCC) | 54 images (33 training, 21 validation, and test images) | N/A | N/A | AlexNet | 120 iterations | GPU (Nvidia GTX 1080) | Sensitivity Specificity | 1.0 0.7 |
11 | Das N. et al., 2020 [29] | India | Histopathological images (OSCC) | 156 images | N/A | rotating, shearing, translation, zooming and flipping | - AlexNet - VGG-16 - VGG-19 - ResNet-50 - CNN | 50 epochs Learning rate = 0.0001 Optimizer: Adam | GPU-based system under Linux with an Intel® Core i7® 8750H processor, 16 GB of memory and GTX® 1060 graphics | Accuracy | 96.6% |
12 | Fu Q. et al., 2020 [30] | China | Clinical oral images (OSCC) | 6176 images (5775 training, and 401 validation images) | N/A | scaling, rotation, horizontal flipping and adjustment of the saturation and exposure | Deep learning algorithm | N/A | N/A | Sensitivity Specificity AUC | 0.95 0.89 0.98 |
13 | Musulin J. et al., 2021 [31] | Croatia | Histopathological images (OSCC) | 322 images | N/A | horizontal flip, vertical flip, rotation | - InceptionV3 - InceptionResNetv2 - DenseNet201 - NASNet - EfficientNetB3 | Learning rate = 0.001–0.0001 | N/A | AUC | 0.95 |
14 | Alosaimi W. et al., 2021 [32] | Saudi Arabia | Histopathological images (OSCC) | 1224 images | N/A | scaling, cropping, flipping, padding, rotation, translation, affine transformation, brightness, contrast and saturation | - LeNet-5 - AlexNet - VGG - Inception - ResNet50 | 10,000 iterations Learning rate = 0.001 Batch size = 64 | N/A | Precision Recall F1-score Accuracy | 98.0% 99.0% 98.0% 98.0% |
15 | Tomita H. et al., 2021 [33] | Japan | CT images (OSCC) | 320 images (224 training, 32 validation, and 64 test images) | N/A | horizontal flip, vertical flip, width shift, and height shift | - Deep learning | N/A | N/A | Accuracy Sensitivity Specificity | 90.9% 0.73 1.0 |
16 | Carmalan S. et al., 2021 [34] | USA | Clinical oral images (OPMDs) | 54 images (85:15 for training and validation) | Annotated by clinical team members | horizontal flip, vertical flip | Transfer learning on Inception-ResNet-V2 | 20 epochs Learning rate = 0.0003 Batch size = 64 | N/A | Precision Recall F1-score Accuracy | 99.3% 100.0% 97.9% 90.9% |
17 | Musulin J. et al., 2021 [35] | Croatia | Histopathological images (OSCC) | 322 histology images | N/A | rotation, horizontal flip and vertical flip | - ResNet50 - ResNet101 - Xception - MobileNetv2 | Learning rate = 0.001–0.000001 Optimizer: Bayesian | two Intel Xeon Gold CPUs (24 C/48 T, at 2.4 GHz), 768 GB of ECC DDR4 RAM, and five Nvidia Quadro RTX 6000 GPUs, with 24 GB of RAM, 4608 CUDA and 576 Tensor cores. | AUCmacro AUCmicro | 0.96 0.03 |
18 | Warin K. et al., 2021 [36] | Thailand | Clinical oral images (OSCC) | 700 images (70:10:20 for training, validation, and test) | Annotated by three oral and maxillofacial surgeons | scaling, rotation, horizontal flipping, and adjustment of the saturation and exposure | - DenseNet121 | N/A | Two Titan Xp GPUs (12 GB), Nvidia driver: 450.102 and CUDA: 11.0 | Precision Recall F1-score Sensitivity Specificity AUC | 100.0% 99.0% 99.0% 0.99 1.0 0.99 |
19 | Kavyashree C. et al., 2022 [37] | India | Histopathological images (OSCC) | 526 images (70:15:15 for training, validation, and testing) | N/A | N/A | - CNN - DenseNet201 - DenseNet121 - DenseNet169 | 50 epochs Learning rate = 0.0001 Loss function: Binary Crossentropy | N/A | Precision Recall F1-score Accuracy TPR FPR | 98.9% 98.9% 93.2% 85.0% 0.93 0.14 |
20 | Arujuaid A. et al., 2022 [38] | USA | Histopathological images (OSCC) | 448 images | Annotated by oral pathologists | N/A | - GoogLeNet - InceptionV3 - Transfer learning | N/A | N/A | Precision Recall F1-score Accuracy | 90.0% 95.5% 92.8% 92.5% |
21 | Krishna S. et al., 2022 [39] | India | Histopathological images (OSCC) | 1224 images | N/A | N/A | - CNN - VGG16 - ResNet50 - Ensemble learning (VGG16 + ResNet50) | N/A | N/A | Accuracy | 62.50% |
22 | Sharma D. et al., 2022 [40] | India | Clinical oral images (OSCC) | 329 images (70:10:20 for training, validation, and test) | N/A | flipping, zooming, and rotation | - VGG19 - VGG16 - MobileNet - InceptionV3 - ResNet50 | 50 epochs Batch size = 16 Learning rate = 0.001 | Tesla 1xK80 graphics card | Precision Recall F1-score Accuracy | 60.0% 43.0% 50.0% 76.0% |
23 | Shetty SK. et al., 2022 [41] | India | Histopathological images (OSCC) | 1224 images (70:30 for training, and test) | N/A | N/A | - VGG16 - Inception V3 - ResNet50 - duck pack optimization with deep transfer learning | N/A | Intel Core i5 processor and 8 GB of RAM | Precision Recall F1-score Accuracy | 95.5% 97.5% 96.4% 97.3% |
24 | Jubair F. et al., 2022 [42] | Jordan | Clinical oral images (OSCC, OPMDs) | 716 images (79:7:14 for training, validation, and test) | N/A | N/A | - EfficientNet-B0 - VGG19 - ResNet101 | Batch size = 32 Learning rate = 0.0001 Optimizer: Adam | N/A | Accuracy Sensitivity Specificity AUC | 85.0% 0.87 0.85 0.93 |
25 | Warin K. et al., 2022 [43] | Thailand | Clinical oral images (OPMDs) | 600 images (70:10:20 for training, validation, and test) | Annotated by three oral and maxillofacial surgeons | N/A | - DenseNet-121 - ResNet-50 | 100 epochs Batch size = 32 Learning rate = 0.00001 | Tesla P100, Nvidia driver: 460.32 and CUDA: 11.2 (Nvidia Corporation, CA, USA) | Precision Recall F1-score Sensitivity Specificity AUC | 92.0% 98.0% 95.0% 0.98 0.92 0.95 |
26 | Xu Z. et al., 2022 [44] | China | Histopathological images (OSCC) | 757 images | N/A | N/A | - EfficientNet b0 - ShuffleNetV2 - ResNeXt_18 | 80 epochs Batch size = 80 Learning rate = 0.0005 Optimizer: Adam | Four NVIDIA Tesla K80 graphics cards | Accuracy AUC | 98.1% 0.99 |
27 | Fati S. M. et al., 2022 [45] | Saudi Arabia | Histopathological images (OSCC) | 5192 images | N/A | multiangle rotation, flipping and shifting | - AlexNet - ResNet-18 | 28 and 33 epochs | N/A | Precision Recall Accuracy Sensitivity Specificity AUC | 99.7% 99.0% 99.1% 0.99 0.99 0.99 |
28 | Warin K. et al., 2022 [46] | Thailand | Clinical oral images (OSCC, OPMDs) | 980 images (70:10:20 for training, validation, and test) | Annotated by three oral and maxillofacial surgeons | N/A | - DenseNet-169 - ResNet-101 - SqueezeNet - Swin-S | 43, 100 epochs Batch size = 16, 32 Learning rate = 0.00001 | Tesla P100, Nvidia driver: 460.32 and CUDA: 11.2 (Nvidia Corporation, CA, USA) | Precision Recall F1-score Sensitivity Specificity AUC | 98.0% 99.0% 98.0% 0.99 0.99 1.0 |
29 | Deif M. A. et al., 2022 [47] | Egypt | Histopathological images (OSCC) | 1224 images (80:20 for training, and test) | N/A | N/A | - VGG16 - AlexNet - ResNet50 - Inception V3 | Batch size = 32 Learning rate = 0.001 | N/A | Precision Accuracy Sensitivity | 96.3% 96.3% 0.99 |
30 | Yuan W. et al., 2022 [48] | China | Optical Coherence Tomography images (OSCC) | 468 images (346 training, and 122 test images) | Annotated by two senior dental specialists with professional diagnoses | N/A | Multi-level deep residual learning | 20 epochs | Nvidia GeForce 2080Ti | Accuracy Sensitivity Specificity PPV NPV AUC | 87.5% 0.91 0.88 85.3% 90.2% 0.92 |
31 | Yang S.Y. et al., 2022 [49] | China | Histopathological images (OSCC) | 2025 images (1925 training, and 100 test images) | Annotated by senior pathologists | N/A | - Deep learning | 80, 100 epochs Batch size = 64 Learning rate = 0.001 Optimizer: Adam Loss function: cross entropy | NVIDIA RTX 2080Ti (Abadi 2016) | Sensitivity Specificity F1-score PPV NPV | 0.98 0.92 95.1% 82.4% 97.8% |
32 | Chang X. et al., 2023 [50] | China | Raman spectroscopy (OSCC) | 16,200 Raman spectra | N/A | N/A | - AlexNet - VGGNet - ResNet50 - MobileNetV2 - Transformer | Batch size = 64 Learning rate = 0.0001 Optimizer: Adam | NVIDIA GeForce GTX 1080 Ti | Precision Recall Accuracy | 92.3% 92.9% 92.8% |
33 | Afify HM. et al., 2023 [51] | Egypt | Histopathological images (OSCC) | 1224 images | N/A | random, reflection, translation, resizing and rotation | - ResNet-101 - GoogleNet - SqueezeNet - ShuffleNet - AlexNet - DenseNet-201 - InceptionResNet-V2 - EfficientNet-b0 - VGG-19 - NasNetMobile with transfer learning methods | 100 epochs Batch size = 15 Learning rate = 0.001 5200 and 5900 iterations | N/A | Precision Recall F1-score Accuracy Sensitivity Specificity | 100.0% 100.0% 100.0% 100.0% 1.0 1.0 |
34 | Agarwal P. et al., 2023 [52] | India | CT images (OSCC) | 1755 images | Annotated by radiologists | horizontal flip, vertical flip, shear and zoom | - BID-Net - VGG16 - VGG19 - ResNet-50 - MobileNetV2 - DenseNet-121 - ResNet-101 | 28 epochs Batch size = 15 Learning rate = 0.01, 0.001, 0.001 and 0.0001 | N/A | Precision Recall F1-score Accuracy AUC | 91.0% 95.2% 92.6% 93.6% 95.9% |
35 | Oya K. et al., 2023 [53] | Japan | Histopathological images (OSCC) | 90,059 images | N/A | horizontal flip, vertical flip, hue, saturation, contrast, brightness, cropping, rotation, zoom, and shift | EfficientNet B0 | N/A | N/A | Precision Recall Accuracy | 97.83% 98.36% 99.65% |
36 | Das M. et al., 2023 [54] | India | Histopathological images (OSCC) | 1224 images (75:25 for training, and test) | N/A | Rotation, shift, zooming and shearing | - 10-layer CNN - VGG16 - VGG19 - AlexNet - ResNet50 - ResNet101 - MobileNet - InceptionNet | 10, 50, 100 epochs Activation Function: ReLU Optimizer: Adam | N/A | Precision Recall F1-score Sensitivity Specificity Accuracy AUC Error rate | 97.0% 98.0% 97.0% 0.98 0.97 97.0% 0.97 0.03 |
37 | Flügge T. et al., 2023 [55] | Germany | Clinical oral images (OSCC) | 1406 images (1124 training, 141 validation, and 141 test images) | N/A | N/A | Swin-Transformer | Learning rate = 0.005 Momentum = 0.9 Weight decay = 0.0001 | 12 GB NVIDIA TITAN V GPU | Accuracy F1-score Sensitivity Specificity PPV NPV | 98.6% 98.6% 0.99 0.99 98.6% 98.6% |
38 | Ananthakrishnan B. et al., 2023 [56] | India | Histopathological images (OSCC) | 1224 images | N/A | random rotation, translation and shear | - ResNet50 - ResNet101 - ResNet152 - ResNet50V2 - ResNet101V2 - ResNet152V2 - Xception - VGG16 - VGG19 - InceptionV3 - InceptionResNetV2 - DenseNet201 - DenseNet121 - DenseNet169 | N/A | NVIDIA Tesla K80 | Sensitivity Specificity Accuracy AUC | 99.3% 100.0% 99.7% 0.99 |
39 | Panigrahi S. et al., 2023 [57] | India | Histopathological images (OSCC) | 4000 images (2800 training, 400 validation, and 800 test images) | Annotated by pathologist | flipping, inverting, scaling, and rotation | - VGG16 - VGG19 - ResNet50 - InceptionV3 - MobileNet | Batch size = 32 Learning rate = 0.005 Momentum = 0.9 Weight decay = 0.0005 Optimizer: Adam | System (Quadro P5200) with a six-core i7 processor, 32 GB of GDDR5 RAM, and NVIDIA-2560 CUDA processing cores, 16 GB GPU (32 GB GDDR5 graphics memory and 2560 CUDA cores) | Precision Recall F1-score Accuracy | 97.0% 96.0% 96.0% 96.6% |
40 | Yang Z. et al., 2023 [58] | China | Histopathological images (OSCC) | 13,799 images (9737 training, and 4062 test images) | N/A | N/A | - LeNet-5 - VGG16 - ResNet18 | 40 epochs Batch size = 32 Learning rate = 0.0001 Momentum = 0.9 Optimizer: Adam | N/A | Precision Sensitivity Specificity Accuracy AUC | 94.5% 99.5% 97.3% 96.8% 0.99 |
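
The classification measures reported in the table above (accuracy, sensitivity, specificity, PPV, NPV, and F1-score) all derive from the four cells of a binary confusion matrix. As a minimal sketch, assuming the standard textbook definitions (individual studies may have used variants such as macro-averaging over classes), the Python function below computes them. The function name `binary_metrics` and the example counts are hypothetical and are not taken from any included study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification measures from confusion-matrix counts.

    tp/fp/tn/fn are true-positive, false-positive, true-negative and
    false-negative counts. Assumes every denominator below is non-zero.
    """
    sensitivity = tp / (tp + fn)            # also called recall or TPR
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)                    # positive predictive value (precision)
    npv = tn / (tn + fn)                    # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts, for illustration only.
metrics = binary_metrics(tp=90, fp=20, tn=80, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because precision equals PPV and recall equals sensitivity under these definitions, rows that list both pairs are reporting the same quantities under two names.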
No | Author, Year (Ref) | Country | Data Modality (type of data) | Dataset Size (Train/ Valid/Test) | Labeling Procedure | Augmentations | Deep learning algorithms | Hyperparameters | Hardware | Performance measures | Outcome |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | Ariji Y. et al., 2020 [59] | Japan | CT images (OSCC) | 365 images (Training: 260 images Validation: 60 images Test: 45 images) | Annotated by a radiologist | N/A | DetectNet | 1000 epochs Learning rate = 0.0001 Optimizer: Adam | Graphics cards (GeForce GTX 1080 Ti, NVIDIA) with 11 GB of GPU memory and the open-source operating system Ubuntu OS v. 16.04.2 | Precision Recall F1-score | 96.4% 73.0% 83.1% |
2 | Warin K. et al., 2021 [36] | Thailand | Clinical oral images (OSCC) | 700 images (70:10:20 for training, validation, and test) | Annotated by three oral and maxillofacial surgeons | scaling, rotation, horizontal flipping, and adjustment of the saturation and exposure | Faster R-CNN | N/A | Two Titan Xp GPUs (12 GB), Nvidia driver: 450.102 and CUDA: 11.0 | Precision Recall F1-score AUC | 76.7% 82.1% 79.3% 0.79 |
3 | Warin K. et al., 2022 [43] | Thailand | Clinical oral images (OPMDs) | 600 images (70:10:20 for training, validation, and test) | Annotated by three oral and maxillofacial surgeons | N/A | - Faster R-CNN - YOLOv4 | 100 epochs Batch size = 32 Learning rate = 0.00001 | Two Titan Xp GPUs (12 GB), Nvidia driver: 450.102 and CUDA: 11.0 | Precision Recall F1-score AUC | 79.7% 81.0% 80.3% 0.74 |
4 | Warin K. et al., 2022 [46] | Thailand | Clinical oral images (OSCC, OPMDs) | 980 images (70:10:20 for training, validation, and test) | Annotated by three oral and maxillofacial surgeons | N/A | - Faster R-CNN - YOLOv5 - RetinaNet - CenterNet2 | 1882 epochs Batch size = 8, 128 Learning rate = 0.001 15,000 and 20,000 iterations | Tesla P100, Nvidia driver: 460.32 and CUDA: 11.2 (Nvidia Corporation, CA, USA) | Precision Recall F1-score AUC | 98.0% 92.0% 89.0% 0.91 |
5 | Xu X. et al., 2023 [60] | China | CT images (OSCC) | 5412 images (60:30:10 for training, validation, and testing) | Annotated by a radiologist | N/A | - Mask R-CNN | 10, 50, 100 epochs | NVIDIA V100 GPU | AP50 | 72.5% |
No | Author, Year (Ref) | Country | Data Modality (type of data) | Dataset Size (Train/ Valid/Test) | Labeling Procedure | Augmentations | Deep learning algorithms | Hyperparameter | Hardware | Performance measures | Outcome |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | Das D.K., et al., 2019 [61] | India | Histopathological images (OSCC) | 252 images (70:30 for training, and test) | N/A | N/A | CNN | 50 epochs Learning rate = 0.01 Batch size 16 | N/A | Dice index Jaccard index Precision Recall | 94.2% 89.47% 97.6% 91.6% |
2 | Fraz M.M. et al., 2020 [62] | UK | Histopathological images (OSCC) | 7780 images (5522 training, 1512 validation, and 756 test images) | Annotated by a pathologist | N/A | - FCN-8 - U-Net - Segnet - DeepLabV3+ - FABnet | 50 epochs 45,000 iterations Learning rate = 0.0001 Batch size = 6 | Nvidia GTX 1080Ti GPUs | Jaccard Index Dice index Accuracy Sensitivity Specificity Precision | 78.4% 87.9% 96.3% 0.87 0.98 89.0% |
3 | Martino F. et al., 2020 [63] | Italy | Histopathological images (Oral cancer) | 288 images (180 training, 100 validation, and 100 test images) | N/A | flipping the images vertically, horizontally, and in both ways | - SegNet - U-Net - U-Net with VGG16 encoder - U-Net with ResNet50 encoder | 60 epochs Learning rate = 0.0001 Loss function: Cross-Entropy function | N/A | mIoU | 0.63 |
4 | Dos S. et al., 2021 [64] | Brazil | Histopathological images (OSCC) | 1050 images (840 training, and 210 test images) | Annotated by a pathologist | horizontal/vertical flip, rotation, elastic transformation, grid distortion and optical distortion | Fully convolutional network | 500 epochs Learning rate = 0.001 Batch size = 16 Optimizer: Adam | Intel Core i7 3.4 GHz × 8 processor, 32 GB memory, 1 TB SSD, equipped with GeForce GTX 1050 Ti graphics card and Ubuntu 20.04 operating system | Accuracy Sensitivity Specificity F1 score Jaccard Index | 97.6% 0.93 0.98 92.0% 85.2% |
5 | Paderno A. et al., 2021 [65] | Italy | Endoscopic videos (OSCC) | 226 frames | Annotated by an expert clinician | rotation, shift, zoom, horizontal and vertical flip | - U-Net - U-Net 3 - ResNet | N/A | N/A | Dice index | 76.0% |
6 | Musulin J. et al., 2021 [35] | Croatia | Histopathological images (OSCC) | 322 histology images | N/A | Rotation, horizontal flip and vertical flip | DeepLabv3+ with Xception_65 | Learning rate = 0.001–0.000001 Optimizer: Bayesian | two Intel Xeon Gold CPUs (24 C/48 T, at 2.4 GHz), 768 GB of ECC DDR4 RAM, and five Nvidia Quadro RTX 6000 GPUs, with 24 GB of RAM, 4608 CUDA and 576 Tensor cores. | mIoU F1 score | 0.88 95.5% |
7 | Pennisi A. et al., 2022 [66] | Belgium | Histopathological images (OSCC) | 389 WSI samples | Annotated by two pathologists | N/A | U-Net | N/A | N/A | Accuracy Dice index mIoU | 82.0% 82.0% 0.72 |
8 | Ariji Y. et al., 2022 [67] | Japan | CT images (OSCC) | 983 images (834 training, 77 validation and 72 test images) | N/A | N/A | U-Net | 200 epochs Learning rate = 0.001 | 11 GB GPU (NVIDIA GeForce RTX 2080 Ti, NVIDIA, Santa Clara, CA, USA) and 32 GB of memory | Precision Recall F1 score AUC | 94.2% 74.2% 83.1% 95.0% |
9 | Liu Y. et al., 2022 [68] | USA | Histopathological images (Oral precancerous lesion) | 39,264 images | Annotated by 112 pathologists | rotation, horizontal and vertical flips | - DeepLabv3+ - Unet++ | 20 epochs | Nvidia Titan GPUs | Accuracy Precision F1 score Sensitivity | 90.9% 90.3% 93.3% 0.97 |
10 | Dos S. et al., 2023 [69] | Brazil | Histopathological images (OSCC) | 200 histology images (100 training, and 100 test images) | N/A | rotation, transpose, and horizontal and vertical axis flipping | - Fully convolutional networks | 400 epochs | Intel Core i7 3.4 GHz × 8 processor, 32 GB memory, 1 TB SSD, equipped with GeForce GTX 1050 Ti graphics card and Ubuntu 20.04 operating system | Accuracy Precision F1 score Sensitivity Specificity IoU | 86.46% 76.63% 77.16% 0.81 0.91 0.63 |
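
The segmentation studies above report overlap measures such as the Dice index and the Jaccard index (IoU). As a minimal sketch, assuming binary masks flattened to sequences of 0/1 pixel labels, the Python function below computes both for one predicted mask against one reference mask. The function name `dice_and_jaccard` and the toy masks are hypothetical; the included studies do not state how they implemented or averaged these measures.

```python
def dice_and_jaccard(pred, truth):
    """Dice index and Jaccard index (IoU) for two flattened binary masks.

    pred and truth are equal-length sequences of 0/1 pixel labels.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_area = sum(1 for p in pred if p)
    truth_area = sum(1 for t in truth if t)
    union = pred_area + truth_area - intersection
    if union == 0:
        # Both masks are empty; treating this as perfect agreement is a
        # convention assumed here, not something the source specifies.
        return 1.0, 1.0
    dice = 2 * intersection / (pred_area + truth_area)
    jaccard = intersection / union
    return dice, jaccard


# Toy masks, for illustration only.
pred = [1, 1, 1, 0, 0, 0]
truth = [0, 1, 1, 1, 0, 0]
dice, jaccard = dice_and_jaccard(pred, truth)
print(f"Dice = {dice:.3f}, Jaccard = {jaccard:.3f}")
```

For a single mask pair the two measures are related by Dice = 2J / (1 + J), so either can be derived from the other; values averaged over many images need not satisfy this relation exactly.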
No | Author, Year (Ref) | Country | Data Modality (Type of data) | Dataset Size | Inclusion Criteria (if any) | Exclusion Criteria (if any) | Hyperparameter | Augmentations | Deep learning algorithms | Hardware | Performance measures | Outcome |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Kim D.W. et al., 2019 [70] | Republic of Korea | Clinicopathological data (OSCC) | 255 patients’ records | N/A | patients with metastatic disease, secondary primary cancer, perioperative mortality, a history of previous radiotherapy and/or chemotherapy, or a history of previous head and neck cancer; patients with a follow-up period shorter than 36 months | N/A | N/A | - DeepSurv - Random survival forest (RSF) - Cox proportional hazard model (CPH) | N/A | c-index | 0.78 |
2 | Adeoye J. et al., 2021 [71] | Hong Kong | Clinicopathological and treatment data (OPMDs) | 1098 patients’ records | minimum follow-up of 18 months | patients with synchronous erythroplakia and proliferative verrucous leukoplakia or those with previous oral cavity cancers | Batch size = 64, 128, 256 Drop out = 0.1–0.3 Nodes per layer = 32, 64, 128, 256 Optimizer: Adam Activation: ReLU | N/A | - DeepSurv - Neural net-extended time-dependent Cox model (Cox-Time) - DeepHit - RSF | N/A | c-index IBS | 0.95 0.04 |
3 | Adeoye J. et al., 2022 [72] | Hong Kong | Clinicopathological and treatment data (Oral cancer) | 313 patients’ records | minimum follow-up period of 12 months | cases with carcinoma-in-situ, oral cancers with non-squamous histology, recurrent oral cavity tumors at first encounter, and patients with inoperable disease | Learning rate = 0.01, 0.001 Batch size = 64 Drop out = 0.4 Nodes per layer = 64 | N/A | - DeepSurv - DeepHit - Cox-Time - RSF | N/A | c-index IBS | 0.85 0.12 |