Background
Endometrial cancer (EC) is one of the three major gynecological malignancies and the fifth most common cancer among women (4.8% of female cancer cases) in the United States [
1]. An estimated 61,880 new cases (7% of all female cancer cases) and 12,160 deaths (4% of all female cancer deaths) were expected in 2019. In the past ten years, with the irregular use of hormones and changes in living environment and lifestyle, the prevalence and mortality of endometrial cancer in China and abroad have been increasing annually [
2]. Although most patients are diagnosed early, approximately 28% of patients are diagnosed with advanced disease. Moreover, patients with the same degree of progression can show different prognoses and treatment responses. Therefore, effective EC biomarkers are needed for assessing prognosis and identifying patients potentially at high risk.
Numerous biomarkers for EC have been identified, such as the SIX1 and HER-2 genes [
3,
4]. With advancements in high-throughput sequencing, researchers have established various patient genome databases to enable a more systematic understanding of genomic changes. Through database mining, thousands of biomarkers that may be associated with the prognosis of patients with tumors have been identified [
5,
6]. However, the predictive ability of single-gene biomarkers remains insufficient. Studies have shown that the evaluation of genetic traits, which involve multiple genes, may improve prognosis prediction [
7,
8]. Multigenic prognostic features from primary tumor biopsy can guide more specific treatment strategies. Recent studies have explored multiple-gene signatures in EC for assessing prognosis and identifying patients potentially at high risk [
9,
10].
In this study, genes were selected by performing gene set enrichment analysis (GSEA). To identify biomarkers, differential analysis typically compares expression between groups and focuses on genes whose expression levels are significantly altered. However, this method can easily exclude genes that do not show obvious expression differences but may provide important biological information or exhibit biological significance. As an emerging computational method, GSEA does not require a clear differential gene threshold or extensive experience; instead, it tests whether the members of a gene set show a coordinated shift in expression and thus reveals general trends in the data. Therefore, this approach strengthens the statistical link between gene expression and biological significance [
11].
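The idea of testing a whole gene set rather than thresholding individual genes can be illustrated with a minimal sketch. It computes a simplified, unweighted running-sum enrichment score (a Kolmogorov–Smirnov-like statistic) and is not the weighted statistic or permutation testing implemented in the GSEA software used in this study. The function name and the gene identifiers are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Simplified, unweighted running-sum enrichment score.

    ranked_genes: genes ordered from most up- to most down-regulated.
    gene_set: set of genes whose coordinated shift is being tested.
    Returns the running-sum value with the largest absolute deviation from 0.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")

    running, best = 0.0, 0.0
    for is_hit in hits:
        # Step up at a gene-set member, step down otherwise.
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list; no single gene needs to pass a cut-off.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
top_set_score = enrichment_score(ranked, {"g1", "g2"})     # set at the top
bottom_set_score = enrichment_score(ranked, {"g5", "g6"})  # set at the bottom
```

A score near +1 means the set clusters at the top of the ranking and a score near -1 means it clusters at the bottom; in practice, significance is assessed by permutation, which this sketch omits.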
Accordingly, in the present study, we aimed to explore data from The Cancer Genome Atlas (TCGA) to identify a new genetic signature for predicting the prognosis of EC. To this end, we used mRNA expression data from TCGA for 548 patients with EC. We identified 119 mRNAs significantly related to glycolysis and developed a nine-gene risk profile for effectively predicting patient outcomes. Notably, the risk factors associated with glycolysis can be used to assess the prognosis of high-risk patients independently. A novel glycolysis-related gene signature was thus identified and validated.
Methods
We extracted clinical data and the mRNA expression profiles of patients with endometrial cancer from TCGA (
https://cancergenome.nih.gov/) [
12]. The study included clinical information from 548 patients, covering age, stage, grade, radiation therapy, residual tumor, histological type, diabetes, new tumor events, and hypertension (Table
1).
Table 1 Clinicopathological parameters of patients with endometrial cancer in this study
Age |
≥ 66 | 236 | 43.2 | 47 |
< 66 | 310 | 56.8 | 40 |
Neoplasm cancer status |
With tumor | 79 | 15.5 | 48 |
Tumor free | 431 | 84.5 | 35 |
Residual tumor |
R0 | 376 | 94.5 | 47 |
R1 | 22 | 5.5 | 5 |
Stage |
I | 341 | 62.2 | 29 |
II–IV | 207 | 37.8 | 58 |
New event |
No | 485 | 88.5 | 54 |
Yes | 63 | 11.5 | 33 |
Grade |
G1 | 99 | 18.4 | 2 |
G2 | 122 | 22.7 | 14 |
G3 | 316 | 58.6 | 65 |
Histological type |
Endometrioid adenocarcinoma | 411 | 75 | 46 |
Serous adenocarcinoma/mixed | 137 | 25 | 41 |
Radiation therapy |
No | 517 | 94.3 | 84 |
Yes | 31 | 5.7 | 3 |
Diabetes |
No | 533 | 97.3 | 86 |
Yes | 15 | 2.7 | 1 |
Hypertension |
No | 517 | 94.3 | 85 |
Yes | 31 | 5.7 | 2 |
Gene set enrichment analysis
We performed GSEA (
http://www.broadinstitute.org/gsea/index.jsp) to determine whether the identified gene sets were significantly different between the EC and normal groups. Next, we analyzed the expression levels of 24,991 mRNAs in EC samples and in adjacent noncancerous tissues. Finally, we selected functions for subsequent analysis based on normalized p values (p < 0.05).
Data processing and risk-parameter calculation
The expression level of each mRNA was normalized by log2 transformation. Univariate Cox regression analysis was used to identify genes associated with overall survival (OS), which were then subjected to multivariable Cox regression to confirm the genes related to prognosis and obtain their coefficients. The selected mRNAs were then divided into a risky type (hazard ratio, HR > 1) and a protective type (0 < HR < 1). By linearly combining the expression values of the filtered genes weighted by their coefficients, we constructed a risk-parameter formula as follows: Risk parameter = ∑ (βn × expression of gene n), where βn is the multivariable Cox regression coefficient of gene n. Using the median risk parameter as a cut-off, the 548 patients were divided into high‐risk and low‐risk subgroups.
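The risk-parameter formula and the median split can be sketched as follows. This is a minimal illustration, not the code used in the study: the gene names, coefficients, and expression values are hypothetical placeholders, the inputs are assumed to be already log2-transformed, and assigning a patient whose score equals the median to the low-risk group is an assumption the text does not specify. The only outside fact used is that, in a Cox model, HR = exp(β).

```python
import math
from statistics import median

# Hypothetical multivariable Cox coefficients (not the values from this study).
COEFFICIENTS = {"GENE_A": 0.5, "GENE_B": -0.25}


def gene_type(beta):
    """Classify a gene by its hazard ratio, HR = exp(beta)."""
    hazard_ratio = math.exp(beta)
    if hazard_ratio > 1:
        return "risky"
    if hazard_ratio < 1:
        return "protective"
    return "neutral"


def risk_parameter(expression, coefficients):
    """Risk parameter = sum over genes of (beta_n * expression of gene n)."""
    return sum(beta * expression[gene] for gene, beta in coefficients.items())


def split_by_median(scores):
    """Label each patient 'high' or 'low' using the median score as cut-off.

    Assumption: a score equal to the median is labeled 'low'.
    """
    cutoff = median(scores.values())
    return {pid: "high" if score > cutoff else "low"
            for pid, score in scores.items()}


# Hypothetical log2-transformed expression values for four patients.
PATIENTS = {
    "p1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "p2": {"GENE_A": 4.0, "GENE_B": 2.0},
    "p3": {"GENE_A": 0.0, "GENE_B": 4.0},
    "p4": {"GENE_A": 2.0, "GENE_B": 0.0},
}

scores = {pid: risk_parameter(expr, COEFFICIENTS) for pid, expr in PATIENTS.items()}
groups = split_by_median(scores)
```

With these placeholder values the scores are 0.0, 1.5, -1.0, and 1.0, the median is 0.5, and patients p2 and p4 fall into the high-risk group.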
Specimens and patients for quantitative real-time PCR (qRT-PCR)
A total of 20 EC tissues and 20 normal endometrial tissues were obtained from patients at the Department of Gynecology and Obstetrics, Shengjing Hospital of China Medical University, China. Normal tissues were obtained from patients who underwent hysterectomy for diseases unrelated to the endometrium. All patients provided informed consent, and this study was approved by the Ethics Committee of Shengjing Hospital of China Medical University. Histological diagnosis and grade were assessed by experienced pathologists in accordance with the FIGO 2009 criteria. No patient received systemic treatment preoperatively.
RNA extraction and qRT-PCR
Total RNA was extracted from tissues using TRIzol reagent (Vazyme, Nanjing, China). PrimeScript RT-polymerase (Vazyme) was used to reverse-transcribe the mRNAs of interest into cDNA. qRT-PCR was performed using SYBR-Green Premix (Vazyme) with specific PCR primers (Sangon Biotech Co., Ltd, Shanghai, China). Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was used as an internal control. The 2^−ΔΔCt method was used to calculate fold-changes. Primer sequences are listed in Additional file
1: Table S1.
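For readers unfamiliar with the 2^−ΔΔCt calculation, a minimal sketch is given here. The function name and the Ct values are hypothetical; the reference gene corresponds to the internal control (GAPDH) and the calibrator to normal tissue.

```python
def fold_change(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method.

    dCt  = Ct(target gene) - Ct(reference gene), computed per tissue
    ddCt = dCt(sample) - dCt(control)
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_sample - delta_control)


# Hypothetical Ct values: the target gene crosses threshold two cycles
# earlier in tumor tissue than in normal tissue, at equal reference Ct.
tumor_vs_normal = fold_change(24.0, 18.0, 26.0, 18.0)  # 4.0
```

A value above 1 indicates higher expression in the sample than in the control, and a value below 1 indicates lower expression.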
Statistical analysis
We used Kaplan–Meier survival curves and the log‐rank test to estimate the significance of the risk parameter. We performed multivariate Cox analysis and data stratification analysis to test whether the risk parameter was independent of the clinical features, including age, grade, stage, new event, residual tumor, and neoplasm cancer status, which were used as covariates. p < 0.05 was considered statistically significant. Statistical analyses were performed using GraphPad Prism 7 software (GraphPad, Inc., La Jolla, CA, USA) and SPSS 20.0 software (SPSS, Inc., Chicago, IL, USA).
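The analyses in this study were run in GraphPad Prism and SPSS. Purely as an illustration of what a Kaplan–Meier curve estimates, the product-limit estimator can be sketched in a few lines. The function name and the follow-up data are hypothetical, and the log-rank test is not included.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times: follow-up time for each patient.
    events: 1 if death was observed at that time, 0 if censored.
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= removed
    return curve


# Hypothetical follow-up (arbitrary time units); the second patient is censored.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
# curve == [(1, 0.75), (3, 0.375), (4, 0.0)]
```

Computing one such curve for the high-risk group and one for the low-risk group gives the two curves that the log-rank test then compares.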
Discussion
Recent studies showed that clinicopathological features such as age and metastatic diagnosis are not sufficient to precisely predict the outcome of patients with cancer. Thus, an increasing number of mRNAs have been identified as biomarkers of tumor progression or prognosis, and the clinical significance of the biomarkers has been evaluated [
14]. For example, Nadaraja et al. [
15] confirmed that low expression of ARAP1 is an independent prognostic biomarker of shorter progression-free survival in older patients with ovarian high-grade serous adenocarcinoma receiving first-line platinum-based antineoplastic therapy. Similarly, multivariate Cox proportional hazards regression analysis verified that patients with cervical cancer who had high tumor protein p73 expression had better outcomes, and thus this protein was considered a prognostic indicator in patients with cervical cancer [
16]. However, these biomarkers were still not sufficient for independently predicting patient prognosis. In particular, single-gene expression levels can be affected by multiple factors, preventing these markers from being used as reliable and independent prognostic indicators. Thus, statistical models that combine markers for multiple related genes, weighting the predictive effect of each constituent gene, have been used to improve prediction. Such models are significantly more accurate than single biomarkers in assessing the prognosis of patients with tumors [
17,
18], which has led to their widespread use.
The rapid development of high-throughput genetic sequencing technology has established a foundation for large biological data research [
19]. Large amounts of genomic data were extracted from individual specimens to identify new diagnostic, prognostic, or pharmacological biomarkers [
20]. In recent studies, a new prognostic signature was constructed by using microarray and RNA-sequencing data for gene expression levels or mutations. A Cox proportional hazards regression model was used for identification and verification [
21,
22]. In the current study, we identified 10 functions showing significant differences in GSEA. As described above, rather than wide-range exploration, we selected the top‐ranking function to filter genes related to patient survival prediction. Univariate and multivariate Cox regression analyses were performed to determine the prognostic value of the combination of nine genes for patients with EC. This selected risk profile may be a more targeted and powerful prognostic assessment for predicting positive clinical outcomes and may be a more effective classification tool for patients with EC compared to other known prognostic assessment markers.
In this study, bioinformatics methods were used to explore the characteristics of mRNA risk factors and their clinical significance, and a new method for mining potential prognostic markers was explored. This study complements the previous understanding of EC and provides a foundation for future EC research. We used the EC dataset in TCGA to collect glycolysis-related genes and compare data from normal and EC tissues. Kaplan–Meier survival estimates revealed that patients with a low risk parameter had a better prognosis. The detection and calculation of the risk parameter in patients with EC therefore have important clinical implications. However, because TCGA database lacks patient metastasis and recurrence information, we could only use OS to assess patient prognosis, which is one limitation of our research. Additionally, in stratified analysis, the risk parameter could predict the prognosis of patients with EC in all subgroups except for the subgroup aged < 66 years. The reason for this difference is unclear and requires further examination.
In addition, we applied the nine-gene signature and the same analysis method to liver cancer and colon cancer to obtain and verify the corresponding risk parameter (Additional file
3: Figure S2; Additional file
4: Figure S3). The results showed that the risk parameter based on the nine genes is not an independent prognostic indicator for liver cancer or colon cancer, suggesting that the prognostic value of the nine-gene signature is specific to EC.
Tumors are characterized by uncontrolled cell proliferation, which not only eliminates control of the cell cycle but also promotes cellular energy metabolism and finally leads to tumor cell growth and differentiation. Cellular energy is mainly derived from sugar metabolism, and most energy is supplied by ATP. In the 1920s, the German biologist Otto Warburg discovered abnormalities in energy metabolism in hepatoma cells. Even when oxygen is present, tumor cells mainly rely on glycolysis for metabolism and consume large amounts of glucose accompanied by lactic acid production. This phenomenon of abnormal glucose metabolism was named aerobic glycolysis or the Warburg effect [
23]. Studies have shown that tumor cells can precisely regulate ATP synthesis by regulating substrate uptake and enzymes related to glycolysis, enabling them to adapt to the nutrient microenvironment, meet the energy and nutrient requirements for malignant proliferation, and proliferate rapidly. Moreover, cancer metabolic reprogramming, which is closely associated with the Warburg effect, plays an important role in maintaining the interaction between oxygen-sensing transcription factors and the nutrient-sensing signal pathway [
24]. This indicates that aerobic glycolysis involves a complex mechanism of action. Tumor cell proliferation proceeds at a pace exceeding cellular energy supply, and thus excessive consumption of oxygen and nutrients by the cells can cause the tumor microenvironment to become hypoxic, low in glucose, and acidic, which is more pronounced in solid tumors [
25]. Although not all tumors exhibit the Warburg effect, cellular energy abnormalities are widely recognized as one of the characteristics of tumor cells. After more than 90 years of continuous exploration and research, the Warburg effect has been found to occur in many malignancies, such as lung cancer, breast cancer, colon cancer, and gastric cancer. Recent studies showed that aerobic glycolysis plays an important role in EC occurrence and development. Metabolic profiling of EC cells revealed higher rates of glycolysis and lower glucose oxidation, and tumor cells may rely on GLUT6-mediated glucose transport and glycolytic–lipogenic metabolism for survival [
26]. Highly differentiated EC showed significantly lower GLUT1 and GLUT3 expression than poorly differentiated tumors [
27]. Several studies have predicted the survival of patients with EC using genes associated with cellular glycolysis. For example, high mobility group protein 1 suppression effectively inhibits the development and progression of EC [
28]. The expression of lactate dehydrogenase 5 in EC is an independent prognostic indicator strongly associated with poor prognosis [
29]. However, glycolysis-related gene markers for predicting EC prognosis have not been established. Using bioinformatics methods, we determined the genetic characteristics associated with cellular aerobic glycolysis (CLDN9, B4GALT1, GMPPB, B4GALT4, AK4, CHST6, PC, GPC1, and SRD5A3) and demonstrated their prognostic value in EC.