Background
- We design the non-salient target segmentation model (NTSM) for segmenting oral mucosal diseases. Through the joint action of the difference association (DA) module and the feature hierarchy pyramid attention (FHPA) module, the model segments non-salient lesions more accurately while keeping the number of parameters low, which further encourages the application of deep learning in the medical field.
- We develop a DA module that uses convolutional neural networks to learn local, contextual, and semantic information. The DA module increases the difference between real lesions and the background, improving segmentation performance on regions with non-salient features.
- The proposed FHPA module shares parameters through depthwise separable convolution. This reduces the number of parameters and the computation of high-precision large models, and thereby lowers the cost of model training and inference.
- We conduct a series of comparative experiments on the private oral mucosal diseases (OMD) dataset and the public International Skin Imaging Collaboration (ISIC) dataset to verify the effectiveness and novelty of our model. The experimental results show that our model improves segmentation accuracy while also reducing the number of parameters.
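The parameter saving attributed to depthwise separable convolution in the FHPA contribution can be illustrated with a simple count. The sketch below compares a standard 2D convolution with a depthwise convolution followed by a 1×1 pointwise convolution. The channel and kernel sizes are illustrative assumptions, not the FHPA configuration, and the function names are hypothetical.

```python
def conv_params(c_in: int, c_out: int, k: int, bias: bool = True) -> int:
    """Parameter count of a standard k x k 2D convolution."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def depthwise_separable_params(c_in: int, c_out: int, k: int, bias: bool = True) -> int:
    """Parameter count of a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = c_in * k * k + (c_in if bias else 0)   # one k x k filter per input channel
    pointwise = c_in * c_out + (c_out if bias else 0)  # 1 x 1 conv mixes the channels
    return depthwise + pointwise


# Illustrative layer size (an assumption, not taken from the paper).
standard = conv_params(256, 256, 3)
separable = depthwise_separable_params(256, 256, 3)
print(standard, separable, round(separable / standard, 3))
```

For this assumed layer the separable form needs roughly 12% of the parameters of the standard convolution. The ratio in a real network depends on the channel widths and kernel sizes actually used.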
Methods
Datasets
Non-salient target segmentation model (NTSM)
Difference association (DA) module
Feature hierarchy pyramid attention (FHPA) module
Loss function
Evaluation metrics
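As a reference point for the overlap metrics reported in the Results tables (Sen, Spe, Dice), the sketch below computes them from pixel-wise confusion counts using their standard definitions. It assumes binary masks flattened to lists of 0/1. The function names are hypothetical and the paper's own implementation may differ. The 95% Hausdorff distance (95HD) is omitted because it requires boundary distances rather than counts.

```python
def confusion_counts(pred, target):
    """Return (tp, fp, fn, tn) for two equal-length binary masks."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, target):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def sensitivity(tp, fn):
    """Fraction of lesion pixels that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn, fp):
    """Fraction of background pixels that were left unlabeled."""
    return tn / (tn + fp) if (tn + fp) else 0.0


def dice(tp, fp, fn):
    """Dice overlap between prediction and ground truth."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0  # two empty masks agree perfectly


# Toy masks (made up for illustration).
pred = [1, 1, 0, 0, 1, 0, 0, 0]
target = [1, 0, 0, 0, 1, 1, 0, 0]
tp, fp, fn, tn = confusion_counts(pred, target)
print(sensitivity(tp, fn), specificity(tn, fp), dice(tp, fp, fn))
```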
Results
Implementation details
Comparison with state-of-the-art methods
Dataset: OMD

Model | Sen (%)↑ | Spe (%)↑ | Dice (%)↑ | 95HD (mm)↓ |
---|---|---|---|---|
U-Net (2015) [30] | 40.51 | 98.93 | 44.71 | 23.16 |
Attention UNet (2018) [31] | 57.33 | 99.06 | 63.54 | 15.36 |
nnU-Net (2021) [32] | 70.81 | 99.49 | 73.72 | 9.89 |
UNeXt (2022) [24] | 66.44 | 99.28 | 63.44 | 10.18 |
MALUNet (2022) [25] | 60.25 | 99.18 | 63.81 | 14.18 |
CRnet (2022) [15] | 60.47 | 99.25 | 65.59 | 15.08 |
EGE-UNet (2023) [26] | 67.97 | 99.01 | 65.61 | 13.96 |
TransAttUNet (2023) [33] | 62.03 | 98.01 | 67.77 | 13.78 |
NTSM (ours) | 71.00 | 99.56 | 76.86 | 8.88 |

Dataset: ISIC

Model | Sen (%)↑ | Spe (%)↑ | Dice (%)↑ | 95HD (mm)↓ |
---|---|---|---|---|
U-Net (2015) [30] | 93.81 | 99.38 | 78.15 | 1.96 |
Attention UNet (2018) [31] | 93.88 | 99.36 | 93.17 | 1.62 |
nnU-Net (2021) [32] | 95.95 | 99.09 | 93.40 | 1.24 |
UNeXt (2022) [24] | 87.77 | 99.15 | 89.77 | 1.42 |
MALUNet (2022) [25] | 92.24 | 99.55 | 93.52 | 1.24 |
CRnet (2022) [15] | 93.08 | 99.64 | 94.28 | 1.68 |
EGE-UNet (2023) [26] | 93.04 | 99.40 | 93.01 | 1.61 |
TransAttUNet (2023) [33] | 94.52 | 99.28 | 93.17 | 1.50 |
NTSM (ours) | 93.64 | 99.56 | 94.31 | 0.97 |
Model | Params (M)↓ | FLOPs (G)↓ | Memory (M)↓ | OMD-Dice (%)↑ | ISIC-Dice (%)↑ |
---|---|---|---|---|---|
FCN-8 (2017) [34] | 134.28 | 466.39 | 378.26 | 60.37 | 82.15 |
nnU-Net (2021) [32] | 126.56 | 466.23 | 353.69 | 73.72 | 93.40 |
Mask2Former (2022) [35] | 215.23 | 473.85 | 826.02 | 63.04 | 93.02 |
OneFormer (2023) [36] | 372.15 | 775.05 | 1500.10 | 66.78 | 93.46 |
NTSM (ours) | 71.88 | 421.72 | 253.64 | 76.86 | 94.31 |
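The efficiency gap between NTSM and its nnU-Net backbone in the table above can be made concrete as relative reductions. The sketch below reuses only the Params, FLOPs, and Memory values from the nnU-Net and NTSM rows. The helper name `relative_reduction` is our own.

```python
# Values copied from the nnU-Net and NTSM rows of the efficiency table.
nnunet = {"params_m": 126.56, "flops_g": 466.23, "memory_m": 353.69}
ntsm = {"params_m": 71.88, "flops_g": 421.72, "memory_m": 253.64}


def relative_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is smaller than `before`."""
    return 100.0 * (before - after) / before


reductions = {
    key: relative_reduction(nnunet[key], ntsm[key]) for key in nnunet
}
for key, value in reductions.items():
    print(f"{key}: {value:.1f}% lower than nnU-Net")
```

With these inputs the reductions come out at roughly 43.2% for parameters, 9.5% for FLOPs, and 28.3% for memory.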
Discussion
Ablation experiment (a): Verify the effectiveness of the modules in the non-salient target segmentation model (NTSM)
Backbone | Module | Sen (%)↑ | Spe (%)↑ | Dice (%)↑ | 95HD (mm)↓ | Params (M)↓ | FLOPs (G)↓ | Memory (M)↓ |
---|---|---|---|---|---|---|---|---|
U-Net | None | 40.51 | 95.93 | 44.71 | 23.16 | 13.39 | 124.17 | 51.17 |
U-Net | DA | 55.38 | 99.31 | 62.55 | 17.76 | 13.40 | 124.70 | 51.22 |
U-Net | FHPA | 51.43 | 99.32 | 62.57 | 16.87 | 2.03 | 84.30 | 7.95 |
U-Net | DA + FHPA | 66.38 | 98.44 | 63.02 | 16.51 | 2.03 | 84.82 | 8.00 |
nnU-Net | None | 70.81 | 99.50 | 73.72 | 9.89 | 126.56 | 466.23 | 353.69 |
nnU-Net | DA | 71.62 | 99.44 | 75.43 | 9.10 | 126.58 | 473.00 | 353.86 |
nnU-Net | FHPA | 71.46 | 99.48 | 75.41 | 9.76 | 71.87 | 414.96 | 253.48 |
nnU-Net | DA + FHPA | 71.00 | 99.56 | 76.86 | 8.88 | 71.88 | 421.72 | 253.64 |
Ablation experiment (b): Verify the effectiveness of the submodule of the DA module
Backbone | Submodule | Sen (%)↑ | Spe (%)↑ | Dice (%)↑ | 95HD (mm)↓ |
---|---|---|---|---|---|
U-Net | None | 40.51 | 98.93 | 44.71 | 23.16 |
U-Net | LCD | 54.25 | 98.79 | 55.91 | 20.56 |
U-Net | LSA | 47.04 | 99.24 | 57.30 | 19.57 |
U-Net | DA (LCD + LSA) | 55.38 | 99.31 | 62.55 | 17.76 |
nnU-Net | None | 70.81 | 99.50 | 73.72 | 9.89 |
nnU-Net | LCD | 71.70 | 99.52 | 75.23 | 9.52 |
nnU-Net | LSA | 71.18 | 99.45 | 74.94 | 9.78 |
nnU-Net | DA (LCD + LSA) | 71.62 | 99.44 | 75.43 | 9.10 |
Ablation experiment (c): Verify the effectiveness of the FHPA module
Backbone | Module | Dice (%)↑ | Params (M)↓ | FLOPs (G)↓ | Memory (M)↓ |
---|---|---|---|---|---|
U-Net | None | 44.71 | 13.39 | 124.17 | 51.17 |
U-Net | FHPA (Enc 1) | 52.87 | 9.01 | 120.58 | 34.49 |
U-Net | FHPA (Enc 3) | 57.95 | 4.77 | 96.27 | 18.37 |
U-Net | FHPA (Enc 3 + Dec 1) | 62.57 | 2.03 | 84.30 | 7.95 |
nnU-Net | None | 73.72 | 126.56 | 466.23 | 353.69 |
nnU-Net | FHPA (Enc 1) | 74.28 | 108.35 | 465.12 | 320.35 |
nnU-Net | FHPA (Enc 3) | 75.11 | 71.93 | 446.14 | 253.67 |
nnU-Net | FHPA (Enc 3 + Dec 1) | 75.41 | 71.87 | 414.96 | 253.48 |

Backbone | Module | Dice (%)↑ | Params (M)↓ | FLOPs (G)↓ | Memory (M)↓ |
---|---|---|---|---|---|
U-Net | None | 44.71 | 13.39 | 124.17 | 51.17 |
U-Net | DGA | 58.17 | 4.31 | 100.42 | 16.58 |
U-Net | FHPA | 62.57 | 2.03 | 84.30 | 7.95 |
nnU-Net | None | 73.72 | 126.56 | 466.23 | 353.69 |
nnU-Net | DGA | 74.87 | 77.91 | 434.09 | 276.41 |
nnU-Net | FHPA | 75.41 | 71.87 | 414.96 | 253.48 |