Introduction
Gastrointestinal stromal tumors (GISTs) are rare mesenchymal tumors of the gastrointestinal tract, with an estimated incidence between 10 and 15 cases per million persons per year [1, 2]. The most common tumor locations are the stomach (56%) and the small intestine (32%) [2]. Differentiating GISTs from other intra-abdominal tumors (non-GISTs) is highly important for early diagnosis and treatment planning [3]. Due to the rarity of GISTs, establishing the correct diagnosis can be challenging. Computed tomography (CT) is the imaging modality of choice in GIST diagnosis [4], but assessment through an invasive tissue biopsy is generally required [5]. A non-invasive and quicker method may aid in the early assessment of GISTs, allowing rapid transfer of such patients to specialized treatment centers.
Treatment planning of GISTs is based on their molecular profile. The mitotic index (MI) reflects the proliferative rate of GISTs, correlates with survival and risk of metastatic spread [6], and determines the use of adjuvant systemic treatment. Treatment decisions are also based on the GISTs’ mutational status. PDGFRA exon 18 mutated (Asp842Val) GISTs are resistant to imatinib [7]. GISTs with a c-KIT exon 11 mutation have shown a greater sensitivity to imatinib than those with a c-KIT exon 9 mutation [3]. The MI and these genetic mutations are currently assessed through an invasive tissue biopsy. Prediction of such molecular characteristics based on imaging could guide treatment planning while awaiting the results of a final tissue biopsy.
Radiomics relates imaging features to molecular characteristics in order to contribute to diagnosis, prognosis, and treatment decisions. Radiomics has shown promising results in risk stratification of GISTs [8-17], but has not been used to distinguish GISTs from non-GISTs nor to predict the molecular profile.
The aim of this study was to evaluate whether radiomics based on CT is capable of (1) differentiating GISTs from other intra-abdominal tumors resembling GISTs prior to treatment, i.e., the differential diagnosis, and (2) predicting the presence and type of mutation (BRAF, PDGFRA, and c-KIT) and the MI of GISTs, i.e., the molecular analysis, also called “radiogenomics”.
Discussion
Radiomics can distinguish GISTs from other intra-abdominal tumors with a performance similar to that of three radiologists. Radiomics could not predict the presence and subtype of c-KIT mutations or the MI.
GISTs are currently diagnosed manually by radiologists and confirmed through a tissue biopsy [4, 38, 39]. The ability to distinguish rare GISTs from non-GISTs on routine CT scans through radiomics could provide a quick method for the initial assessment of intra-abdominal tumors. Radiomics could aid quick referral of GIST patients from a peripheral hospital to a center of expertise, shorten the time to diagnosis by refining patient selection prior to biopsy, and help prevent both missed GISTs (i.e., false negatives) and unnecessary referral or even treatment of non-GISTs (i.e., false positives). To our knowledge, this is the first study to evaluate the GIST differential diagnosis across many locations through an automated radiomics approach on a large, multi-scanner dataset, and to compare the performance of the model with that of radiologists.
There were significant performance differences between the radiologists, and their agreement was poor, indicating high observer dependence. The advantages of the radiomics model are that it is automatic and observer independent, assuming the segmentation is reproducible (as indicated by the high DSC), and that it always gives the same prediction on the same image, thereby improving consistency over manual scoring.
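As an illustration of how segmentation agreement is commonly quantified, the DSC (Dice similarity coefficient) of two binary masks A and B is 2|A ∩ B| / (|A| + |B|). The snippet below is a minimal sketch under stated assumptions, not the implementation used in this study: the function name `dice_similarity`, the flat 0/1 mask representation, and the toy masks are all hypothetical.

```python
def dice_similarity(mask_a, mask_b):
    """Dice similarity coefficient of two binary masks (flat 0/1 sequences)."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must contain the same number of voxels")
    size_a = sum(1 for v in mask_a if v)
    size_b = sum(1 for v in mask_b if v)
    # Voxels labeled as tumor by both observers.
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    if size_a + size_b == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * overlap / (size_a + size_b)


# Hypothetical toy masks from two observers (1 = tumor voxel).
observer_1 = [0, 1, 1, 1, 0, 0]
observer_2 = [0, 0, 1, 1, 1, 0]
dsc = dice_similarity(observer_1, observer_2)
print(round(dsc, 3))
```

A DSC of 1 means identical segmentations and 0 means no overlap; in this toy case each observer marks three voxels and two are shared.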
In clinical practice, tumor location is highly relevant for distinguishing GISTs from non-GISTs, as GISTs typically grow in the stomach or small intestine [2]. In our study, tumor location was based on radiology reports, which are subjective and occasionally fail to report the true primary origin of the tumor [19]. Moreover, the tumor location distribution in our dataset may not be a correct representation of the overall population, e.g., only non-GISTs were located in the uterus. Despite the subjectivity of and potential bias in tumor location, we added location to the imaging model for a fair comparison with the radiologists. Although this led to a higher AUC, a model based on location, e.g., simply classifying all lesions in the uterus as non-GISTs, is not feasible and cannot be applied in the general population. The radiomics model instead predicts the likelihood of a lesion being a GIST purely based on the imaging appearance. Further research on location-matched datasets is required to investigate the value of location in the GIST differential diagnosis model.
In the literature, radiomics for risk classification or outcomes such as malignant potential or aggressive behavior for GISTs [8-17] has mostly been based on criteria such as the Armed Forces Institute of Pathology criteria, the modified National Institutes of Health consensus criteria of 2008, and the modified Fletcher classification system [3, 40-44]. These studies illustrate the clinical need for new methods to stratify GISTs and show the potential of radiomics for GISTs. Our first contribution with respect to the existing literature is the focus on the diagnostic trajectory of GISTs, to simplify the diagnostic process of this rare tumor type by predicting the differential diagnosis. Existing studies mainly focus on risk classification, which has a less apparent direct application in clinical practice and generally requires the GIST differential diagnosis to be established first [3, 40-43]. Second, our method determines the optimal radiomics pipeline by automatically evaluating a large number of radiomics algorithms and parameters, whereas existing studies typically report the results of a “hand-crafted,” manually optimized radiomics pipeline [8-17]. Moreover, through an extensive cross-validation scheme, all model optimization was performed on the training dataset, eliminating the risk of overfitting the model on the test set. Lastly, we evaluated the model’s robustness to segmentation and scanner variations.
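The principle of restricting all model optimization to the training data can be sketched with a generic nested k-fold split. This is only an illustration under stated assumptions: the exact cross-validation scheme is not described in this section, and the helper names `k_fold_indices` and `nested_cv_splits`, the fold counts, and the sample size are hypothetical.

```python
import random


def k_fold_indices(indices, k, rng):
    """Shuffle `indices` and split them into k near-equal folds."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_cv_splits(n_samples, outer_k=5, inner_k=3, seed=0):
    """Yield (train, test, inner_splits) for each outer fold.

    Inner (fit, validation) splits are drawn only from the outer training
    indices, so pipeline selection never sees the outer test samples.
    """
    rng = random.Random(seed)
    outer_folds = k_fold_indices(range(n_samples), outer_k, rng)
    for i, test_idx in enumerate(outer_folds):
        train_idx = [j for f, fold in enumerate(outer_folds) if f != i for j in fold]
        inner_folds = k_fold_indices(train_idx, inner_k, rng)
        inner_splits = []
        for v, val_idx in enumerate(inner_folds):
            fit_idx = [j for f, fold in enumerate(inner_folds) if f != v for j in fold]
            inner_splits.append((fit_idx, val_idx))
        yield train_idx, test_idx, inner_splits


# Hypothetical example: 20 patients, 5 outer folds, 3 inner folds.
splits = list(nested_cv_splits(n_samples=20))
for train_idx, test_idx, inner_splits in splits:
    print(len(train_idx), len(test_idx), len(inner_splits))
```

In such a scheme, candidate pipelines would be compared on the inner validation folds, and only the selected pipeline would be scored once on the corresponding outer test fold.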
Our model was not able to distinguish different genetic mutations or the MI of GISTs, which may be attributed to various factors. First, the dataset for the mutation analysis was relatively small (e.g., 90 patients in the MI analysis), which may have been too small for radiomics to learn from. Second, the use of different gene panels for the GIST mutational analysis over the years may have resulted in inaccuracies in the gold standard. Additionally, this might have led to an underestimation of mutation prevalence in the current cohort, as newer sequencing techniques use larger gene panels and have a higher sensitivity. Third, other (more complex or deep learning-based) radiomics methods may be required to discover more intricate features. Lastly, the negative results may simply suggest that molecular characteristics such as a c-KIT mutation are too subtle to detect solely based on portal venous phase CT imaging characteristics. Other CT phases or modalities (e.g., magnetic resonance imaging) could provide more useful information.
Our study has several limitations. First, there was heterogeneity in the acquisition protocols. There were two acquisition parameters (KVP and slice thickness) with statistically significant differences between GISTs and non-GISTs, but their individual predictive power was low. Hence, although a minor positive bias due to heterogeneity in acquisition protocols cannot be completely ruled out, the predictive performance cannot be attributed to this bias alone. Conversely, this heterogeneity may also have negatively affected the performance. Nevertheless, the radiomics model achieved a promising performance, similar to that of three experienced radiologists, suggesting high generalizability. Second, complete histologic data were only available for a subset of the patients. No data regarding clinical outcomes such as survival or recurrence were available for the GISTs. Finally, the current radiomics approach requires manual segmentation. While accurate, this process is time-consuming and potentially subject to observer variability, although the DSC indicated good agreement. Using only features with good or excellent reliability across the segmentations lowered the performance. This may indicate that some features have low reliability but high predictive power, so that removing them reduces performance. Alternatively, it may indicate overfitting of the model to observer-dependent characteristics of the segmentation and thus exploitation of a bias in the segmentations. Automatic segmentation methods may help to overcome this limitation.
Future work should focus on extending the dataset, which would increase statistical power, potentially improve performance as the model has more cases to learn from, and pave the way for more data-driven approaches such as deep learning. This may also yield sufficient samples to study the prediction of less common GIST mutations. Next, validation of our findings on an independent, external dataset is required. Eventually, this may be followed by a prospective clinical trial with harmonized acquisition protocols in which the performance, as well as the cost-effectiveness, is assessed.