16.10.2024 | Abdominal Radiology
One novel transfer learning-based CLIP model combined with self-attention mechanism for differentiating the tumor-stroma ratio in pancreatic ductal adenocarcinoma
Authors:
Hongfan Liao, Jiang Yuan, Chunhua Liu, Jiao Zhang, Yaying Yang, Hongwei Liang, Haotian Liu, Shanxiong Chen, Yongmei Li
Published in:
La radiologia medica | Issue 11/2024
Abstract
Purpose
To develop a transfer learning-based contrastive language-image pretraining (CLIP) model, combined with a self-attention mechanism, that predicts the tumor-stroma ratio (TSR) in pancreatic ductal adenocarcinoma (PDAC) from preoperative enhanced CT images. The aim is to characterize tumor biology for risk stratification and to guide feature fusion in artificial intelligence-based model representation.
Material and methods
This retrospective study included 207 patients with PDAC from three hospitals. Pathologists assessed TSR on surgical specimens, and patients were divided into high-TSR and low-TSR groups. The study developed a novel CLIP-adapter model that integrates the CLIP paradigm with a self-attention mechanism to make better use of features from multi-phase imaging, thereby enhancing the accuracy and reliability of TSR predictions. For comparison, a clinical-variable model, a traditional radiomics model, and deep learning models (ResNet50, ResNet101, ViT_Base_32, ViT_Base_16) were also constructed.
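The abstract does not give architectural details, so the following is only a minimal sketch of the general idea, under stated assumptions: one embedding per CT phase (arterial and venous) is fused with single-head self-attention, and the fused vector is scored against two text-prompt embeddings in the CLIP style (cosine similarity followed by a softmax). All function names, the dimension `d`, the temperature, and the random vectors standing in for encoder outputs are hypothetical and are not taken from the study.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_fuse(arterial, venous, w_q, w_k, w_v):
    """Fuse two per-phase feature vectors with single-head self-attention.

    arterial, venous: (d,) embeddings, one per CT phase (assumed inputs).
    w_q, w_k, w_v: (d, d) projections (random here; learned in practice).
    Returns the (d,) fused vector and the (2, 2) attention weights.
    """
    tokens = np.stack([arterial, venous])            # (2, d): one token per phase
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])          # scaled dot-product scores
    weights = softmax(scores, axis=-1)               # each row sums to 1
    attended = weights @ v                           # (2, d)
    return attended.mean(axis=0), weights            # average the two phase tokens

def clip_style_probs(image_feat, text_feats, temperature=0.07):
    """Softmax over cosine similarities between one image feature and text prompts."""
    img = image_feat / np.linalg.norm(image_feat)
    txt = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    return softmax(txt @ img / temperature)

rng = np.random.default_rng(0)
d = 8                                                # toy embedding size (assumption)
arterial, venous = rng.normal(size=d), rng.normal(size=d)
w_q, w_k, w_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))

fused, weights = self_attention_fuse(arterial, venous, w_q, w_k, w_v)
text_feats = rng.normal(size=(2, d))                 # stand-ins for "high TSR" / "low TSR" prompts
probs = clip_style_probs(fused, text_feats)
print("attention weights:\n", weights)
print("class probabilities (high TSR, low TSR):", probs)
```

In a real pipeline the phase embeddings and prompt embeddings would come from pretrained CLIP image and text encoders, and the projections would be trained; the sketch only shows how attention-based fusion and prompt matching fit together.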
Results
The models showed significant efficacy in predicting TSR in PDAC. The CLIP-adapter model based on multi-phase feature fusion outperformed the same model based on either single phase (arterial or venous). The CLIP-adapter model also outperformed the traditional radiomics and deep learning models; CLIP-adapter_ViT_Base_32 performed best, achieving the highest AUC (0.978) and accuracy (0.921) in the test set. Kaplan–Meier survival analysis showed longer overall survival in patients with low TSR than in those with high TSR.
Conclusion
The CLIP-adapter model designed in this study provides a safe and accurate method for predicting TSR in PDAC. The feature fusion module, which combines multi-modal (image and text) and multi-phase (arterial and venous) features, significantly improves model performance.