Background
Colorectal cancer (CRC) remains the third leading cause of cancer-related deaths worldwide [1]. Nearly 1.8 million patients are newly diagnosed with CRC and 1 million patients die of it every year [2, 3]. Despite continuous efforts in prevention, screening, and management, the incidence of CRC still increased by 38% from 2007 to 2017 [2]. In addition, patients with the same clinical and pathologic conditions can show contrasting clinical outcomes, even when treated similarly [4]. This diversity in prognosis might be due to the inherent genetic heterogeneity of CRC.
The pathogenesis of CRC is still not fully understood. However, a growing number of studies have shown that the epithelial-mesenchymal transition (EMT) plays an important role in invasion, metastasis, and chemoresistance [5–9]. Although the mechanisms of EMT have been extensively studied in CRC, evidence for the prognostic value of EMT-related genes (ERGs) remains limited and inconclusive.
Considering the strong relationship between EMT and tumor pathogenesis, this study aimed to identify ERGs relevant to cancer diagnosis, management, and prognosis. We first screened for ERGs differentially expressed between tumorous and nontumorous tissues, and then used Cox proportional hazards regression analysis to select prognosis-related genes from 244 EMT-associated genes in the CRC cohort of The Cancer Genome Atlas (TCGA). The resulting genes were subjected to least absolute shrinkage and selection operator (LASSO) regression to establish an optimal risk model, which was then validated in an independent Gene Expression Omnibus (GEO) CRC cohort. CRC patients with high EMT risk scores had shorter overall survival (OS) than patients with low risk scores. Differences in key signaling pathways between the high- and low-risk groups were explored using gene set enrichment analysis (GSEA). Finally, we constructed a nomogram that predicts an individual's survival probability by integrating clinical characteristics with the prognostic gene signature.
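The risk-score step of this workflow can be illustrated with a minimal sketch. It assumes the form commonly used for LASSO-Cox signatures, in which a patient's score is the sum of the signature genes' expression values weighted by the model coefficients, and it assumes a median cutoff for the high- and low-risk groups; neither detail is stated in this section. The gene names, coefficients, expression values, and function names below are hypothetical placeholders, not the fitted values of this study.

```python
from statistics import median

# Hypothetical coefficients for illustration only (not the fitted model).
COEFFICIENTS = {"GENE_A": 0.20, "GENE_B": 0.15, "GENE_C": -0.10}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Linear predictor: sum of coefficient * expression over signature genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def split_by_median(scores):
    """Label a patient 'high' if the score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Toy expression profiles (hypothetical values on an arbitrary log scale).
patients = {
    "P1": {"GENE_A": 5.0, "GENE_B": 4.0, "GENE_C": 2.0},
    "P2": {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": 6.0},
    "P3": {"GENE_A": 3.0, "GENE_B": 3.0, "GENE_C": 3.0},
    "P4": {"GENE_A": 6.0, "GENE_B": 5.0, "GENE_C": 1.0},
}

scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = split_by_median(scores)
```

In this toy cohort, `groups` assigns P1 and P4 to the high-risk group and P2 and P3 to the low-risk group. A survival comparison between the two groups would follow as a separate step.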
Discussion
CRC remains a major threat to human health, and the mechanisms underlying its pathogenesis are still unclear; exploring new diagnostic and therapeutic strategies is therefore important. An increasing number of studies have shown that EMT plays an important role in the development and progression of CRC [20]. Recently, mRNA signatures based on specific biological characteristics, such as metabolism [21] and the cell cycle [22], have become research hotspots.
In this study, we collected transcriptome data and the corresponding clinical information from the TCGA and GEO databases. From these data, we identified the ERGs differentially expressed between CRC samples and nontumorous samples. Further analysis identified the genes associated with prognosis. Finally, a prognostic model for CRC patients was constructed. Interestingly, the major differentially expressed ERGs were enriched in several cancer-related pathways, including the Hippo signaling pathway, the ERK1 and ERK2 cascade, and negative regulation of response to DNA damage stimulus. Notably, the IL-17 pathway has been reported to participate in autoimmune pathology, hypersensitivity, host defense, and tissue repair [23]. Consistent with previous findings [24, 25], our KEGG enrichment analysis suggested that the IL-17 pathway might be involved in the EMT process. Moreover, IL-17 has been reported to upregulate PD-L1 protein expression in HCT116 and LNCaP cells [26]. Therefore, targeting this pathway might not only inhibit tumor metastasis but also enhance the killing effect of immune cells on tumors.
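To make the notion of pathway enrichment concrete, the following minimal sketch computes an over-representation p-value with a one-sided hypergeometric test. This section does not state which statistic or software was used for the enrichment analysis, so the test, the function name, and the counts in the example are assumptions chosen for illustration only.

```python
from math import comb


def enrichment_p_value(n_background, n_pathway, n_selected, n_overlap):
    """Upper-tail hypergeometric probability P(overlap >= n_overlap).

    n_background: genes in the background set
    n_pathway:    background genes annotated to the pathway
    n_selected:   genes in the list of interest (e.g. differentially expressed)
    n_overlap:    selected genes that are annotated to the pathway
    """
    upper = min(n_pathway, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 background genes, 5 in the pathway,
# 4 selected genes, 3 of which fall in the pathway.
p = enrichment_p_value(20, 5, 4, 3)
```

A small p-value indicates that the pathway contains more of the selected genes than expected by chance. In practice, p-values for many pathways would also be corrected for multiple testing.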
Eleven genes were used to establish the model equation for risk assessment. Among them, three candidate genes (FOSL1, PLS3, and SNAI1) have been reported to promote CRC cell migration and invasion. FOSL1 plays a central role in EMT and is highly expressed in solid cancers, especially in metastatic CRC. In vitro studies showed that blocking the expression of FOSL1 could diminish the migration of tumor cells [27]. Mimori et al. confirmed that PLS3 induced EMT via transforming growth factor (TGF)-β signaling, followed by the acquisition of invasive ability in CRC cells. Furthermore, overexpression of PLS3 in CRC cells significantly increased the expression levels of EMT-related transcription factors (TWIST, SNAI1, SLUG, SMAD4, and ZEB1), EMT markers (vimentin, FN1, and N-cadherin), and TGF-β, enhancing the invasiveness of CRC cells [28]. In addition, previous studies have demonstrated that high expression of PLS3 in peripheral blood was independently associated with poor prognosis and recurrence [29]. Wang et al. found that SNAI1 was not detected in normal colorectal epithelia, whereas it was upregulated in tumor tissues from lymph node (LN)-positive patients [30]. Similar studies have found that SNAI1 was upregulated in CRC and might therefore have potential in the control of metastasis and serve as a target for chemopreventive agents [31]. Data from Gene Expression Profiling Interactive Analysis (GEPIA) revealed that high expression of TRAP1 was correlated with a good prognosis in CRC. However, TRAP1 has been observed to be significantly upregulated in CRC patients with lymph node metastasis compared with those without lymph node metastasis [32]. Using RT-qPCR on CRC samples at different tumor stages, Scorilas et al. found that CLU mRNA expression levels increased significantly as CRC tumors progressed from tumor node metastasis (TNM) stage I to IV [33]. Further in vivo and in vitro experiments focusing on TRAP1 and CLU are still needed to explore their roles in CRC.
Of note, contrary to our findings, Zhou et al. reported that decreased expression of IGFBP3 promoted tumor metastasis in CRC [34]. Another study indicated that silencing IGFBP3 in two human CRC cell lines, SW480 and Caco2, reduced proliferation, colony formation, and migration, and that IGFBP3 expression increased with tumor growth and advancing stage of CRC [35]. Our analysis showed that increased IGFBP3 expression was associated with a poor prognosis in CRC patients, in agreement with the latter study but not with the former. Given these inconsistent results, further experiments are still required. CCL19, a potential immune stimulator, has been observed to be increased in lung cancer, where its expression was associated with high TNM stage and vascular invasion [36]. CCL19 enhances parenchymal central nervous system (CNS) retention of lymphoma cells (LCs), thereby promoting central nervous system lymphoma (CNSL) formation [37]. Xu et al. found that CCL19 suppressed angiogenesis in CRC by promoting miR-206 [38]. However, further study is required to understand its mechanisms in the metastasis of CRC.
FSTL3 was upregulated by the lncRNA DSCAM-AS1/miR-122-5p axis and promoted the proliferation and migration of non-small cell lung cancer cells [39]. Moreover, FSTL3 served as a surrogate marker in breast cancer and was the only variable that could distinguish a benign breast mass from a malignant one [40]. One report indicated that astrocytic HAND1 was unique to metastatic gastrointestinal stromal tumor (GIST) and might work as a transcriptional amplifier of the oncogenic GIST program [41]. Few studies have examined these two genes in CRC, and further research is required. SCG2 and PCOLCE2 have been predicted to be associated with the prognosis of CRC, but in-depth investigations of these two genes in CRC are rarely reported [42, 43]. Their roles in tumors, especially in CRC, remain to be explored.
So far, most cancer-related genes identified through bioinformatics methods have been analyzed individually, which cannot comprehensively reflect the process of carcinogenesis. In contrast, we generated a multigene signature, focused on ERGs, that predicts the prognosis of individual CRC patients. Nevertheless, this study has some limitations. First, we analyzed data from public databases, so data quality could not be fully guaranteed. Second, the study would be more valuable if these genes were further examined in experiments with CRC cells and animal models. Finally, most of the data were obtained from the United States or Europe; because of this limited origin, they might not represent populations worldwide. Therefore, future research is needed to validate our findings.