Introduction
Gliomas account for approximately 70% of malignant central nervous system (CNS) tumors (Ostrom et al. 2017). According to the 2016 World Health Organization (WHO) classification of CNS tumors, adult diffuse gliomas consist of astrocytic tumors, oligodendrogliomas, and glioblastomas (WHO grade II–IV) (Louis et al. 2016). Genomic characterization has demonstrated that mutations in the isocitrate dehydrogenase (IDH) genes are associated with improved prognosis in patients with glioma (Cancer Genome Atlas Research et al. 2015; Hartmann et al. 2010; Parsons et al. 2008; Yan et al. 2009). The median overall survival of patients with IDH-mutated glioblastoma was 31 months, compared with 15 months for those without the mutation (Yan et al. 2009). Patients with IDH wild-type grade II–III glioma, which is molecularly and clinically similar to glioblastoma, had worse overall survival than those with IDH-mutated glioma of the same grade (Cancer Genome Atlas Research et al. 2015). Identifying the IDH genotype is therefore important in the management of gliomas.
At present, IDH mutation status is most commonly assessed by molecular assay following biopsy or surgical resection. Although molecular assays are informative, several factors limit their clinical use in evaluating treatment response and monitoring cancer progression (Rios Velazquez et al. 2017). These include the lack of regular biopsies or surgical resections at the end of each treatment course, the difficulty of evaluating intra-tumor heterogeneity, limited access to tumor samples, and failure to identify the molecular genotype when tumor tissue is of poor quality. In contrast, magnetic resonance imaging (MRI) is routinely used in the initial diagnosis and treatment response assessment of gliomas. Taking full advantage of the abundant information in these easily accessible images may provide an opportunity to overcome the limitations of molecular assays.
MRI features have been used to predict clinical outcomes and molecular subtypes, including IDH genotype, in glioma (Carrillo et al. 2012; Lee et al. 2015; Park et al. 2017). However, only a few imaging features have been used, and the identification of qualitative features is often inconsistent between observers. “Radiomics”, an emerging and promising field, hypothesizes that advanced analysis of medical images can capture hundreds of additional features that are not currently used and may be valuable in personalized medicine (Lambin et al. 2012). Several studies have investigated the potential of these high-dimensional, minable radiomic features to noninvasively facilitate tumor detection, subtype classification, therapeutic response assessment, and prognosis prediction in multiple cancers (Aerts et al. 2014; Fehr et al. 2015; Huang et al. 2016a, b; Li et al. 2016; Nie et al. 2016; Rios Velazquez et al. 2017; Zhang et al. 2017). For gliomas, radiomic features have also been applied to predict patient survival and molecular subtypes via machine learning methods (Macyszyn et al. 2016; Rathore et al. 2016; Yu et al. 2017; Zhang et al. 2017a, b, 2018). However, the effectiveness of different radiomics-based machine learning approaches in IDH genotype prediction for patients with diffuse glioma has yet to be assessed.
Using radiomic features provided in the TCGA/TCIA repositories (Bakas et al. 2017a, b), we evaluated and compared eight classical machine learning methods in terms of their stability and performance for noninvasive, pre-operative IDH genotype prediction. A total of 126 patients with diffuse glioma were included in the analyses. Feature selection and classification model training were performed on the training set, and the predictive performance of the model was independently tested on the validation set. Our aim was to identify the optimal and most reliable methods for IDH genotype prediction that make full use of MRI images.
Discussion
Precision oncology refers to customizing cancer care for individual patients. Such customization can maximize the benefits of prevention and treatment interventions while minimizing adverse effects. Its success relies on accurately categorizing patients on the basis of their prognostic characteristics and responses to a specific treatment. Because quantitative features extracted from medical images can enhance the understanding of tumor characteristics, some studies have explored the value of radiomic features in precision oncology (Aerts et al. 2014; Nie et al. 2016; Rios Velazquez et al. 2017). Rios Velazquez et al. showed that a radiomics-based machine learning model can be used to predict EGFR mutation status, an important biomarker for the treatment of lung cancer (Rios Velazquez et al. 2017). Nie et al. showed the potential of a radiomics-based machine learning model to predict pathologic response after pre-operative chemoradiation therapy for locally advanced rectal cancer (Nie et al. 2016). These studies suggest that highly accurate and reliable classification models can promote the success of radiomics in precision oncology. Furthermore, identifying the optimal machine learning methods is recommended for different clinical tasks (Lambin et al. 2017).
Parmar et al. compared 12 machine learning methods in terms of their prognostic performance and stability for overall survival (OS) prediction in patients with lung cancer. They identified Random Forest (AUC 0.66 ± 0.03) as the method with the highest prognostic performance and high stability (Parmar et al. 2015). Zhang et al. evaluated nine classification methods for the prediction of local failure and distant failure in advanced nasopharyngeal carcinoma. Random Forest (AUC 0.85 ± 0.01) and adaptive boosting (AUC 0.82 ± 0.01) were found to have the highest prognostic performance (Zhang et al. 2017). In another study, Parmar et al. investigated 11 machine learning methods for predicting OS in patients with head and neck cancer. Bayesian (AUC 0.67, RSD 11.28), Random Forest (AUC 0.61, RSD 7.36), and Nearest Neighbor (AUC 0.62, RSD 10.52) displayed high prognostic performance and stability (Parmar et al. 2015a, b). However, the optimal machine learning methods for IDH genotype prediction in patients with diffuse glioma have yet to be determined.
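The stability figures quoted in these comparisons can be reproduced with a short calculation. The sketch below assumes that RSD denotes the relative standard deviation of the AUC, that is, 100 × SD / mean; this reading is an assumption, although it agrees with the RF figures reported in this study (0.036 / 0.931 ≈ 3.87%). The function names are illustrative and do not come from any cited work.

```python
from statistics import mean, stdev


def rsd_percent(scores):
    """Relative standard deviation (%) of repeated scores.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    return 100.0 * stdev(scores) / mean(scores)


def rsd_from_summary(mean_value, sd_value):
    """Relative standard deviation (%) from an already reported mean and SD."""
    return 100.0 * sd_value / mean_value


# RF summary statistics reported in this study: AUC 0.931 +/- 0.036.
rf_rsd = rsd_from_summary(0.931, 0.036)
print(round(rf_rsd, 3))  # 3.867
```

A lower RSD means that the AUC varies less across repeated evaluations relative to its mean, which is why it serves as a stability measure alongside the AUC itself.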
In the present study, we investigated and compared eight radiomics-based machine learning methods to pre-operatively predict IDH genotype for diffuse gliomas. As described previously, the MRI images used for feature extraction were collected from eight institutions, which may make the model broadly applicable in clinical practice (Bakas et al. 2017c). Moreover, the features analyzed in this study were extracted from labels segmented through a semi-automatic approach (Bakas et al. 2017c), which can reduce inter-observer variation in delineation and produce more reproducible and stable features (Parmar et al. 2014). More stable features may result in a more reliable classification model.
As over 700 quantitative radiomic features were analyzed in the current study, feature selection was performed. Feature selection is a widely used strategy in data mining: it can help simplify the model, avoid the curse of dimensionality, and reduce over-fitting. Minimum redundancy maximum relevance (MRMR) was applied for feature selection in our analyses (Ding and Peng 2005). The top 5, 10, 15, 20, 25, 30, 35, and 40 selected features were then used to train the classifiers separately. Accuracy, AUC, and RSD were quantified to evaluate the predictive performance and stability of the eight classical machine learning methods. The average performance over all classifiers was highest when the top 15 selected features were used, and Random Forest (RF; AUC 0.931 ± 0.036, accuracy 0.885 ± 0.041, RSD 3.867) had the highest predictive performance and stability. RF, one of the most frequently used machine learning algorithms in clinical classification problems (Parmar et al. 2015; Rios Velazquez et al. 2017; Zhang et al. 2017a, b), reduces over-fitting by bootstrap sampling and by randomly selecting features at each split during training (Liaw and Wiener 2002). Our results suggest that RF should be preferred for predicting IDH genotype in patients with gliomas.
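The greedy idea behind MRMR can be illustrated with a minimal sketch. It is not the implementation used in this study: it assumes features that have already been discretized, uses the mutual information difference criterion (relevance minus mean redundancy), and runs on a toy data set whose feature names and values are hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mrmr_select(features, labels, k):
    """Greedily pick k features, maximizing relevance minus mean redundancy.

    features: dict mapping a feature name to a list of discretized values.
    labels:   list of class labels (here 1 = IDH-mutant, 0 = wild-type).
    """
    relevance = {name: mutual_information(vals, labels)
                 for name, vals in features.items()}
    selected = []
    remaining = sorted(features)  # sorted so that ties break deterministically
    while remaining and len(selected) < k:
        def score(name):
            redundancy = 0.0
            if selected:
                redundancy = sum(
                    mutual_information(features[name], features[s])
                    for s in selected
                ) / len(selected)
            return relevance[name] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: 8 patients, 3 binary features.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "volume": [0, 0, 0, 1, 1, 1, 1, 1],       # informative
    "volume_copy": [0, 0, 0, 1, 1, 1, 1, 1],  # duplicate, fully redundant
    "texture": [0, 1, 0, 0, 1, 1, 0, 1],      # weaker but complementary
}
top_two = mrmr_select(features, labels, k=2)
print(top_two)  # ['volume', 'texture']
```

In this toy case the duplicated feature is as relevant as the original, yet it is skipped in the second round because its redundancy with the feature already selected outweighs its relevance. This is the behavior that makes MRMR useful when many radiomic features are strongly correlated.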
In the current study, we also evaluated and compared the predictive value of different feature subcategories. Volume features have been shown to be associated with molecular subtypes in glioma. Park et al. showed that the IDH-mutant group had a smaller proportion of enhancing tumors in grade II and III gliomas (Park et al. 2017). Carrillo et al. demonstrated that the presence of non-contrast-enhancing tumor was related to IDH mutation in grade IV gliomas (Carrillo et al. 2012). Textural features, which can quantify intra-tumor heterogeneity by evaluating the gray-level intensity and position of the pixels within an image (Castellano et al. 2004; O’Connor et al. 2015), have also been applied to predict MGMT methylation status (Xi et al. 2017), EGFR expression (Li et al. 2018), and immune cell infiltration status (Narang et al. 2017) for gliomas. In our analysis, the feature subcategories volume, NET-hit, ET-hit, and ED-hit had high predictive performance (Fig. 4). Some patients with diffuse glioma did not demonstrate enhancement or edema on MRI scans. Models based on volume features or NET-hit features provide an opportunity to noninvasively and pre-operatively predict IDH genotype for these patients.
This study has some limitations. First, larger sample sizes and external validation are required to assess the generalizability of our model. In the current study, we repeated the training procedure 20 times, each time assigning different patients to the training and validation sets, and the predictive performance of the model was repeatedly and independently evaluated in the validation set. These approaches enabled a proper estimation of our model's generalizability. Second, recent studies have illustrated that diffusion-weighted imaging and magnetic resonance spectroscopy (MRS) have the potential to noninvasively identify IDH genotype for gliomas (Branzoli et al. 2017; Choi et al. 2016; Lee et al. 2015; Zhang et al. 2017a, b). The addition of imaging features from these modalities may improve the classification performance of our model. Third, the biological mechanisms underlying the correlation between these features and IDH genotype in diffuse gliomas are presently unclear. Further research is needed to explore these mechanisms.
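The repeated random assignment of patients described under the first limitation can be sketched as follows. Only the 20 repetitions come from the text. The stratification by genotype, the 70% training fraction, the fixed seed, and the toy cohort of 20 patients are assumptions made for illustration.

```python
import random


def stratified_split(labels, train_fraction, rng):
    """Split patient indices into training and validation sets per class."""
    train, validation = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train.extend(idx[:cut])
        validation.extend(idx[cut:])
    return sorted(train), sorted(validation)


def repeated_splits(labels, n_repeats=20, train_fraction=0.7, seed=0):
    """Generate n_repeats independent random training/validation assignments."""
    rng = random.Random(seed)  # fixed seed so the splits are reproducible
    return [stratified_split(labels, train_fraction, rng)
            for _ in range(n_repeats)]


# Hypothetical toy cohort: 12 IDH-mutant (1) and 8 wild-type (0) patients.
labels = [1] * 12 + [0] * 8
splits = repeated_splits(labels)
train_idx, validation_idx = splits[0]
print(len(splits), len(train_idx), len(validation_idx))  # 20 14 6
```

Each pair of index lists would then be used to fit a classifier on the training patients and score it on the validation patients, so that the mean and spread of the 20 scores summarize performance and stability.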
In summary, the present study investigated the role of MRI radiomic features in IDH genotype prediction and compared eight radiomics-based machine learning methods. RF with the top 15 selected features showed the highest predictive performance and stability (accuracy 0.885 ± 0.041, AUC 0.931 ± 0.036, RSD 3.87%). These radiomics-based models maximized the value of the information contained in the medical images. Identifying an optimal radiomics-based machine learning method to noninvasively and pre-operatively predict IDH genotype can be valuable in the initial diagnostic evaluation and treatment planning for patients with diffuse glioma.