Introduction
Thyroid carcinoma is the fifth most common cancer among women in the United States [1]. In recent decades, the incidence of thyroid carcinoma has increased dramatically in many countries [2]. A United States study that analyzed 10 years of data from 2007 to 2016 reported that the incidence of thyroid carcinoma among young people (15–39 years old) ranked in the top three [3]. Anaplastic thyroid carcinoma (ATC) accounts for only 1–2% of thyroid carcinomas [4], but it is the most aggressive and highly malignant type and is the main cause of death associated with thyroid malignancies. The median survival time of patients with ATC is only 5–6 months [5]. The quality of life of ATC patients is significantly reduced, and the persistent use of medical resources and the high mortality rate result in a heavy economic and social burden. Therefore, accurate prediction of ATC patient survival and an understanding of the drivers of these predictions are critical for clinically targeted therapy.
The known risk factors related to the prognosis of ATC include age, sex, race, marital status, insurance, socioeconomic status, level of education, tumor stage, tumor size, multifocality, surgery, radiotherapy and chemotherapy, among others [6–11]. Additionally, the AJCC 8th edition TNM staging performs better than the AJCC 7th edition in predicting the survival of ATC patients [12]. Traditional methods for predicting the survival of ATC patients rely on existing clinical and sociodemographic predictors and use Cox proportional hazards (Cox) regression analysis to establish nomograms [12–16]. Although the C-index reported for some of these models appears ideal, there is still a risk of overfitting. With the rapid development of precision medicine, machine learning (ML) has been widely applied in medical fields such as outcome prediction, diagnosis, medical image interpretation and treatment [17]. Applications of ML in thyroid carcinoma include diagnosis, nodule identification and risk factor analysis [18–21]. However, few studies have applied ML to prognostic analysis in ATC patients. ML does not need to assume the relations between input variables and output variables, and it takes into account all possible interactions and effect modifications between variables [22]. More importantly, ML is an efficient and accurate substitute for semi-parametric and parametric models.
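To make the C-index mentioned above concrete, the following minimal Python sketch computes Harrell's concordance index for right-censored survival data. It is an illustration only and not the implementation used in the cited studies; the function name and the toy data are hypothetical, and pairs with tied survival times are simply skipped.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (simplified sketch).

    times:       follow-up time per patient
    events:      1 if death was observed, 0 if censored
    risk_scores: model output, higher = higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times are skipped in this sketch
        # a is the patient with the shorter follow-up
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter follow-up was censored: pair not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk died earlier: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.7, 0.9, 0.4, 0.1]
print(harrell_c_index(times, events, risk))  # 0.8
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. Because the index is computed on whatever data it is given, a high value on the training data alone does not rule out overfitting.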
In this study, we aimed to compare Cox regression and ML algorithms for survival prediction among ATC patients. Selecting the most suitable predictive model could help clinicians intervene on risk factors in a timely manner and prescribe treatments appropriately, and could improve understanding of the decision-making process for assessing ATC.
Discussion
By comparing the prediction performance of different ML algorithms with that of the reference method (Cox regression), we found that Cox regression performed well as a conventional method for ATC survival prediction. Among the ML algorithms, the Logistic algorithm demonstrated the best performance. Combined with SHAP values, the Logistic algorithm identified key predictive factors and yielded a high-accuracy survival prediction model. In our study, we used the Cox regression model to identify the most influential predictors and to create a nomogram that predicts the risk of cancer outcomes for individual patients. The nomogram gives clinicians a user-friendly tool to assess this risk and to stratify patients into low- and high-risk groups, which is useful for clinical decision-making. Furthermore, we used the SHAP method to rank the importance of predictors and to differentiate their impact on the risk of cancer outcomes. This approach provides a visual and intuitive way to identify protective and risk factors and to guide clinical judgment and decision-making.
Our study addressed the limitations of ML in predicting ATC survival by including more candidate factors. We collected multifaceted disease-related predictors, such as baseline patient information, clinical diagnosis, medical therapy and surgical therapy, and we also extracted variables that may influence the development of the disease, such as economic condition and education. The 8th edition of the AJCC TNM staging criteria was applied for disease staging to improve performance. Our models showed high C-index values, indicating good generalization ability and clinical value, and they provided clear explanations of the predicted survival, which can help clinicians understand the decision-making process for assessing disease severity.
Unlike our study, other researchers have tended to apply Cox regression and Logistic regression to analyze risk factors and construct predictive models. Gui et al. [13] found, using multivariable Cox proportional hazards regression models, that the important predictors of survival in ATC were age, historic stage, tumor size, surgery and radiotherapy. In terms of prediction performance, the nomograms showed C-index values of 0.765 for OS and 0.773 for CSS. Based on preoperative and postoperative variables, Qiu et al. [16] constructed two prognostic nomograms, with C-indexes of 0.6783 and 0.7029. The data for the above studies were obtained from the SEER database. Meanwhile, a retrospective regional registry study of 149 patients with ATC showed, using multivariable Cox proportional hazards regression, that age, tumor size and distant metastasis status were independent variables affecting survival in ATC [27]. Traditional Cox regression is the most convenient way to solve most survival prediction problems because its results are easy to interpret. However, Cox regression models should be used with a minimum of 10 outcome events per predictor variable (EPV) [28].
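The EPV rule of thumb above is simple arithmetic, sketched below in Python. The function names and the example counts are hypothetical and are not taken from this study's cohort; the threshold of 10 is the one stated in the text.

```python
def events_per_variable(n_events, n_predictors):
    """EPV = number of outcome events / number of candidate predictors."""
    if n_predictors <= 0:
        raise ValueError("n_predictors must be positive")
    return n_events / n_predictors


def max_predictors(n_events, min_epv=10):
    """Largest number of predictors that still satisfies the EPV threshold."""
    return n_events // min_epv


# Hypothetical example: 120 observed deaths and 8 candidate predictors.
print(events_per_variable(120, 8))  # 15.0, which meets the minimum of 10
print(max_predictors(120))          # 12
```

Under this rule, a categorical predictor with several levels is usually counted once per estimated coefficient rather than once per variable, which this sketch leaves to the caller.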
ML is an efficient and accurate substitute for semi-parametric and parametric models, with the advantages of high computational efficiency and excellent performance. ML algorithms are not constrained by non-proportionality, multicollinearity or nonlinearity, which reduces prediction bias caused by modeling uncertainty. Unfortunately, their application in clinical practice is hindered by a lack of interpretability. SHAP addresses this disadvantage by explaining, in easily understood terms, how a model arrives at its output. There has been no targeted application of ML algorithms to predict the survival of patients with ATC. Here, we calculated subject-level survival curves by analyzing outcome variables in a binary model as well as in a time-to-event model, providing a better understanding of predicted survival. The results indicated that the models built by ML incorporated fewer predictors and performed no worse than traditional Cox regression. As a substitute for Cox regression, the Logistic algorithm combined with SHAP values showed advantages in clinical applications. However, it is important to note that the predictive efficacy of Cox regression for the survival of ATC patients was comparable with that of the ML algorithms, suggesting that the superiority of ML is not universal and is seen only when conventional methods reach their limits.
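As an illustration of how SHAP values explain a logistic model, the sketch below uses the closed form that holds for a linear model on the log-odds scale when features are treated as independent: the SHAP value of feature i is its coefficient times the difference between the patient's value and the background mean. The feature names, coefficients and data are hypothetical and are not the fitted model from this study.

```python
def log_odds(coefs, intercept, x):
    """Linear predictor of a logistic model (log-odds of the outcome)."""
    return intercept + sum(w * xi for w, xi in zip(coefs, x))


def linear_shap(coefs, background, x):
    """SHAP values on the log-odds scale for a linear model.

    Assumes independent features: phi_i = w_i * (x_i - mean_i),
    where mean_i is the feature mean over the background data.
    """
    n = len(background)
    means = [sum(row[k] for row in background) / n for k in range(len(coefs))]
    return [w * (xi - m) for w, xi, m in zip(coefs, x, means)]


# Hypothetical binary features and coefficients (not from the study).
features = ["distant_metastasis", "surgery", "age_over_65"]
coefs = [0.8, -1.2, 0.5]
intercept = -0.3
background = [[0, 1, 0], [1, 0, 1], [1, 1, 0], [0, 0, 1]]
patient = [1, 0, 1]

phi = linear_shap(coefs, background, patient)
# Rank features by the magnitude of their contribution for this patient.
for name, value in sorted(zip(features, phi), key=lambda p: -abs(p[1])):
    print(f"{name}: {value:+.2f}")
```

Positive values push the prediction toward the outcome and negative values away from it. The values sum to the difference between the patient's log-odds and the mean log-odds over the background data, which is the additivity property that makes SHAP rankings comparable across features.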
Deep learning is a branch of machine learning that requires less data engineering and achieves more accurate prediction when processing large amounts of data. Deep learning has been applied in many fields of medical practice, including image diagnosis, digital pathology and cancer prognosis [29]. Previous studies have shown that deep learning models outperform the traditional Cox regression model in survival prediction [30, 31]. We used a deep learning method, DeepSurv, to predict the survival of ATC patients. The DeepSurv algorithm performed better than Cox regression in the training set but showed no obvious advantage in the validation set. This indicates that applying deep learning to cancer prognosis remains challenging. The performance of a deep learning model depends on the amount of data [32]. When the amount of patient data is relatively small, suboptimal performance and overfitting are common.
Cox regression results showed that age, families below poverty, AJCC T 8th, AJCC M 8th, tumor size, surgery, radiotherapy and chemotherapy were important factors in predicting OS and CSS. The therapeutic approaches (surgery, radiotherapy and chemotherapy) were protective factors, whereas older age, a higher poverty rate, larger tumor size and more advanced stage indicated a poorer prognosis. Similarly, in the Logistic algorithm analysis, AJCC T 8th and AJCC M 8th were included as important factors in the survival prediction of ATC patients, which is consistent with previous research [33, 34]. By evaluating SHAP values, we found that AJCC M 8th was the most important predictive factor, which is consistent with a previous study [13]. In our study, AJCC N 8th edition staging was not included among the predictive factors. However, regional lymph node surgery was analyzed in the prediction of 6-OS and 6-CSS when using the Logistic algorithm. In addition, studies have shown that the log odds of positive lymph nodes (LODDS) has better predictive performance than AJCC N stage [35]. Radiotherapy and surgery, compared with the control group, improved patient outcomes, consistent with the findings of Gui et al. [13]. In addition, we found that chemotherapy was also a protective factor for the prognosis of ATC patients.
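For readers unfamiliar with LODDS, the sketch below shows the commonly used definition: the natural log of the ratio of positive to negative lymph nodes, with 0.5 added to each count to avoid division by zero. This is an assumption about the general formula; the cited study may use a different correction constant, and the function name and example counts are hypothetical.

```python
import math


def lodds(positive_nodes, examined_nodes, correction=0.5):
    """Log odds of positive lymph nodes.

    LODDS = ln((positive + c) / (negative + c)), with c = 0.5 by default.
    """
    if positive_nodes < 0 or examined_nodes < positive_nodes:
        raise ValueError("need 0 <= positive_nodes <= examined_nodes")
    negative_nodes = examined_nodes - positive_nodes
    return math.log((positive_nodes + correction) / (negative_nodes + correction))


# Hypothetical patients: same number of positive nodes, different nodes examined.
print(lodds(3, 4) > lodds(3, 20))  # True: a larger positive fraction gives a higher LODDS
```

Unlike a categorical N stage, LODDS is continuous and still separates patients who have no positive nodes but different numbers of nodes examined.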
This study has several limitations. First, it is a retrospective study with a small sample size, which may introduce bias. Larger prospective studies are needed to validate the efficacy of our models. Second, although we included more predictors than previous studies, such as economic status, education and marital status, we did not analyze the impact of immunotherapy and targeted therapy, which have been highlighted in recent progress in ATC treatment [23]. Finally, we did not compare performance with previously established predictive models because of differences in the variables analyzed. In the future, we will try to build a deep learning model to predict the prognosis of ATC and conduct hierarchical studies by analyzing more data and information.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.