Introduction
It is well established among medical professionals that terminally ill patients most need three things: truth, touch, and time [
1]. Specifically, they want their family and physicians to be truthful with them regarding their disease and its progress and treatments. Patients also want to be touched and be reminded that they are loved and valuable [
1,
2]. Most importantly, patients wish to have more time. They need time to accept their illness and losses and resolve various issues arising from their upcoming death [
1]. Maintaining quality of life as symptom free as possible is essential to optimizing their remaining time [
3]. Palliative chemotherapy (PC) has traditionally been the standard of care for patients with metastatic gastric and esophageal cancers. The treatment paradigm also includes HER2-targeted therapy and immunotherapy based on PD-L1 status. PC yields relatively modest survival benefits and carries the risk of significant side effects, including but not limited to hair loss, extreme weakness, nausea, and mucositis. Currently, there is no biomarker or other predictor of response to PC in patients with gastric and esophageal cancers [
4]. In other words, only time shows whether PC is beneficial. In practice, response to chemotherapy is most commonly evaluated by imaging after approximately two to three months of treatment. Clinical signs of benefit, such as improved appetite, weight gain, improved energy, and less dysphagia, are monitored alongside imaging surveillance. Although these signs are correlated with the effectiveness of PC, they do not guarantee an improvement in overall survival. Unfortunately, about
30% of the patients do not live more than three months after the initial diagnosis [
5‐
7]. Consequently, early prognosis prediction is critical for them. Many of these patients suffer from the toxicity of PC through their final days. Had they known their likely outcome ahead of time, they could have tried a potentially beneficial second-line treatment instead, or chosen to stop treatment and spend their remaining time more peacefully. These patients simply do not have enough time left for a trial-and-error approach to treatment.
Several studies have attempted to improve our understanding of the prognosis of gastric and esophageal cancer. These studies mainly examine the correlation between survival time and patient or tumor characteristics, including but not limited to age [
8], HER2 overexpression [
9], sex [
10], tumor size [
11], metastasis sites [
12‐
14], and laboratory variables [
15]. Although these studies offer insights on what factors may lead to poor prognosis, they do not provide
response predictions that oncologists or patients can use to decide whether to cease or continue PC. Few studies have attempted survival prediction for gastric and esophageal cancers, let alone for stage-IV disease. In this work, we investigate the use of machine learning to predict survival for patients with stage-IV gastric and esophageal cancers. Machine learning tools, because of their ability to capture complex non-linearity within data [
16], have proven capable in cancer risk prediction [
17], detection [
18,
19], classification [
20,
21], and prognosis prediction [
22,
23]. Nevertheless, only a handful of studies have leveraged machine learning to predict prognosis in gastric and esophageal cancers. These studies mainly predict long-term survival (i.e., 3- or 5-year survival) based on serum markers, immunomarkers, and clinicopathological parameters [
24‐
27]. The literature lacks studies focusing on patients with metastatic disease, even though prognosis prediction is more vital for these patients given their limited time. Moreover, none of the existing research examines the
response to chemotherapy. Specifically, the available survival prediction models use the value of biomarkers at diagnosis time and do not consider their evolution after chemotherapy treatment.
This work addresses a major gap in the literature. It targets one of the most vulnerable groups of cancer patients and aims to improve the quality of their remaining life by predicting the response to chemotherapy and its benefit in terms of survival extension. To this end, we investigate the possibility of predicting survival at the time of diagnosis and after two cycles of PC treatment. To the best of the authors’ knowledge, this is the first attempt in the literature to predict survival based on blood biomarkers for stage-IV patients in the early stages of treatment.
We find that, with information on metastasis sites and blood biomarkers at the time of diagnosis, machine learning models can predict whether a patient who goes through PC would survive beyond 6 months or 9 months with more than 75% accuracy. The accuracy of prognosis predictions increases to more than 85% when updated blood markers after two cycles of PC are included in the prediction models. Our hope is not only to leverage information about the evolution of blood markers for making early decisions on continuing, altering, or ceasing PC for end-stage gastric and esophageal cancers, but also to provide a blueprint for similar investigations in other types of cancers.
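As an illustration only, the prediction task described above can be framed as a binary classification problem. The sketch below is not the pipeline used in this work: the record layout, the field and function names, and the choice to add the change in each marker between diagnosis and the end of the second PC cycle as a feature are all assumptions made for the example. It also assumes that the survival time of every patient is known.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class PatientRecord:
    # Hypothetical record layout; field names are illustrative assumptions.
    survival_months: float              # observed survival after diagnosis
    baseline: Dict[str, float]          # blood markers at diagnosis
    after_two_cycles: Dict[str, float]  # same markers after two cycles of PC
    metastasis_sites: Set[str]          # e.g. {"liver"}


def survival_label(record: PatientRecord, threshold_months: float) -> int:
    """Return 1 if the patient survived beyond the threshold, else 0."""
    return int(record.survival_months > threshold_months)


def build_features(record: PatientRecord, markers: List[str],
                   sites: List[str], include_update: bool = True) -> List[float]:
    """Assemble one feature vector for a classifier.

    Baseline markers and one indicator per metastasis site are always used.
    With include_update=True, the markers after two cycles and their change
    from baseline are appended (the change feature is an assumption here).
    """
    features = [record.baseline[m] for m in markers]
    features += [1.0 if s in record.metastasis_sites else 0.0 for s in sites]
    if include_update:
        features += [record.after_two_cycles[m] for m in markers]
        features += [record.after_two_cycles[m] - record.baseline[m]
                     for m in markers]
    return features
```

Calling build_features with include_update=False corresponds to prediction at the time of diagnosis, and include_update=True to prediction after two cycles of PC. The labels for the two horizons are obtained with survival_label(record, 6) and survival_label(record, 9).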
Discussion
Gastric and esophageal cancers are the fifth and eighth most common cancers worldwide [
34,
35]. The prognosis of these cancers is favorable only in the early stages. However, more than a third of patients are diagnosed with stage-IV disease [
5] as these cancers are mostly asymptomatic during the early stages. Consequently, the overall 5-year observed survival rate for patients diagnosed with gastric and esophageal cancers is among the lowest for all cancers. Specifically, the 5-year survival rate for metastatic gastric and esophageal cancer is less than 5% [
5,
6].
Considering the short life expectancy of stage-IV patients with gastric and esophageal cancers, PC is established as the standard first-line treatment. PC is associated with modest survival benefits and patient quality-of-life improvement [
36]. However, the response to PC varies significantly among patients. Some patients not only fail to benefit from PC but also see their quality of life deteriorate significantly because of the substantial and cumulative side effects of chemotherapy [
37]. It is estimated that between 20% and 50% of terminally ill cancer patients undergo chemotherapy in the last thirty days of their lives without any clear benefit, and in many cases experience significant toxicities, financial costs, and decreased quality of life [
38]. Moreover, receiving PC has been shown to be associated with higher rates of cardiopulmonary resuscitation and mechanical ventilation in the last week of life. Also, patients receiving PC were more likely to die in an intensive care unit rather than in their preferred place [
39]. End-stage patients with gastric and esophageal cancers can, therefore, tremendously benefit from early predictions of their prognosis. Such a prediction can help the patients and their physicians in the timely termination of PC when it is not beneficial, thereby improving the patients’ end-of-life quality. Patients and their care team may also leverage early prediction of response to PC to switch to a possibly more effective second-line treatment.
Unfortunately, despite the significant implications of a reliable and easy-to-use response prediction model for patients with metastatic disease, few studies address this topic. First, the majority of available work on survival analysis investigates only the correlation between survival and potential biomarkers [
8‐
10,
12‐
15,
40], without providing rigorous survival predictions. Second, studies that do attempt to predict survival do not focus on patients with metastatic disease [
24‐
27]. The accuracy of a prediction model developed using data from patients in different stages is likely not sufficient for end-stage patients with shorter survival times. Moreover, none of the existing works has investigated the advantages of incorporating the changes in blood markers after a few cycles of PC in predicting the response to PC.
This work marks the first study predicting the response to PC for patients with metastatic gastric and esophageal cancer. Unique to this study, we use data from a comprehensive metabolic panel and complete blood count tests after two cycles of chemotherapy. We find that blood markers after two cycles of chemotherapy, including WBC, albumin, creatinine, and total bilirubin, are among the predictors of prognosis. Different machine learning tools are able to provide highly accurate predictions (with AUC above 85% shown in Fig.
6) given the same predictors. The consistency of results among different machine learning models adds confidence in the true predictive ability of the input variables.
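For readers less familiar with the AUC metric quoted above, the following minimal sketch shows one standard way to compute it: as the fraction of (positive, negative) patient pairs that a model ranks correctly, with ties counted as half. The function name and the example values in the usage note are illustrative assumptions and are not results from this study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise ranking.

    labels: 1 for patients who survived beyond the horizon, 0 otherwise.
    scores: model outputs, where a higher score means survival is more likely.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0   # pair ranked correctly
            elif p == n:
                correct += 0.5   # tie counts as half
    return correct / (len(pos) * len(neg))
```

For example, roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]) ranks three of the four positive-negative pairs correctly and returns 0.75. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation.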
One primary advantage of the developed prediction model is its dependence solely on the metastasis sites and blood markers. Blood laboratory tests are performed regularly for patients undergoing chemotherapy (e.g., normally before each session of chemotherapy). The blood markers are, therefore, easily accessible and can be readily used in prediction models. Recently, estimating survival based on radiomic features from CT scans or tumor genomic profiles has garnered more attention for more common types of cancers (e.g., lung and breast cancer) [
41‐
45]. This is despite the fact that neither radiomic features nor tumor genomics are readily available. Specifically, extracting radiomic features (which contain distinct attributes associated with attenuation, shape, size, location, intensity, and texture of tumors) involves two non-trivial steps: 1) segmenting the region of interest in CT images using manually delineated contours [
46] or (semi-)automatic packages such as 3D Slicer [
47], and 2) passing the region of interest into software programs (e.g., TexRAD [
48] and MaZda [
49]) that extract the desired features. As for tumor genomic profiles, profiling can cost up to $100,000 per patient [
50]. Consequently, the need for specialized expertise and the high monetary cost prohibit the wide use of prediction models based on radiomic or genomic features. This further highlights the importance of developing prediction models that are practical given the availability and cost of existing technologies. Nevertheless, future research could explore improving prediction accuracy using radiomic or genomic features, in the hope that extracting these features becomes less cumbersome and costly as technology advances.
We also want to call attention to the fact that obtaining cancer patient data is very hard and time-consuming. Public sources of detailed cancer patient data (i.e., data with information beyond gender, sex, and age of patients) are limited. In fact, to the best of the authors’ knowledge, there is no public dataset that includes the blood biomarkers of gastric or esophageal cancer patients during their course of treatment. For this reason, the majority of studies in the literature that focus on survival analysis or prediction use data from a single hospital, which limits the sample size. The small sample size precludes the use of some machine learning algorithms (such as deep learning) that require extensive training with large datasets. Moreover, the private status of the data makes validation or replication of the results impossible for other researchers. As a step toward addressing this issue, we share the anonymized data used in this work on GitHub (see data availability statement) and encourage other researchers in the field to do the same in order to maximize the utility of the conducted research. Integrating datasets from various hospitals would provide many opportunities to substantially advance the state of the art in cancer prognosis prediction.
Lastly, it must be noted that the data set used in this study lacked details of the chemotherapy regimens. It might be possible to further improve the results by considering this additional information. Moreover, relatively recent evidence supports combined chemotherapy and immunotherapy as the standard first-line treatment for stage-IV esophageal and gastric cancer in subsets of patients, based on PD-L1 status. Currently, we do not have enough data to evaluate the performance of prediction models in predicting survival for patients receiving chemoimmunotherapy. Nevertheless, our results show that it is possible to predict how long a patient would survive at very early stages of treatment. As a future research path, we will study data from various institutions and develop prediction models for different treatments (i.e., PC +/- immunotherapy). Such models, built on large multi-institution data for different types of treatment, can tremendously help patients and their care teams make informed decisions about their choice of treatment and its continuation or termination.