Background
According to the latest GLOBOCAN report [1], colorectal cancer (CRC) is the third most commonly diagnosed cancer worldwide (10.2%) and has the second-highest mortality rate (9.2%). Approximately 145,600 new colorectal cancer cases occur each year in the United States, of which 101,420 are colon cancer and the remainder are rectal cancer [2]. In recent years, colon cancer mortality has continued to rise in many countries with limited resources and health infrastructure, particularly in South America and Eastern Europe [3]. Colon adenocarcinoma (COAD) is the primary pathological type of colon cancer. Surgery combined with postoperative chemotherapy is currently the main treatment for COAD. Although the survival of patients with COAD has improved with the continuous advancement of surgical technology, postoperative recurrence and chemotherapy resistance remain two major obstacles to long-term survival [4-6].
With the development of high-throughput omics, various techniques, such as whole-genome sequencing, epigenomics, and proteomics, have been applied to study COAD [7-10]. Increasing evidence has shown that COAD is not a single uniform disease but a molecularly heterogeneous disease comprising a series of genetic changes [11]. Tumor heterogeneity can alter the tumor growth rate, invasive ability, drug sensitivity, prognosis, and other characteristics, making it one of the main obstacles to tumor treatment [12, 13]. Therefore, dividing patients with COAD into different risk groups based on gene expression profiles helps to predict the risk of tumor progression, metastasis, and recurrence and is a necessary prerequisite for proper individualized treatment [14-16].
There is increasing evidence that the immune system plays an important role in the occurrence and development of cancer [17-19]. For example, Salem et al. [20] found that disrupting the cell surface receptor glycoprotein-A repetitions predominant (GARP) on activated regulatory T (Treg) cells reduces immune tolerance and the development of colon cancer. In recent years, a method based on the relative ranking of gene expression levels was proposed to avoid the shortcomings of data standardization and scaling in gene expression data processing, and it has achieved reliable results in various studies [21, 22]. In the present study, we selected immune genes that are significantly associated with the prognosis of COAD. Next, we integrated these genes to construct an immune-related gene pair (IRGP) risk model and verified its feasibility as a prognostic marker for COAD.
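The pairing idea can be illustrated with a minimal sketch. It assumes that each gene pair receives a score of 1 when the first gene is expressed at a lower level than the second gene in the same sample, and 0 otherwise; this direction convention, the function name gene_pair_scores, and the expression values are illustrative assumptions, not details taken from this study.

```python
from itertools import combinations


def gene_pair_scores(sample, genes):
    """Score every unordered gene pair within a single sample.

    Assumed convention (not confirmed by the text): a pair (a, b) scores 1
    when the expression of a is lower than that of b, and 0 otherwise.
    """
    return {
        (a, b): 1 if sample[a] < sample[b] else 0
        for a, b in combinations(genes, 2)
    }


# Hypothetical expression values; gene symbols are borrowed from the text
# purely for illustration.
genes = ["CXCL14", "IL7", "STC2"]
sample = {"CXCL14": 8.1, "IL7": 2.4, "STC2": 5.0}

scores = gene_pair_scores(sample, genes)
for pair, value in scores.items():
    print(pair, value)
```

Because each score depends only on the order of two values within one sample, the resulting 0/1 profile can be compared across samples without a shared scale.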
Discussion
Colon cancer is the most common type of gastrointestinal cancer and has high morbidity and mortality. Approximately 95% of colon cancers are colon adenocarcinoma (COAD). In recent years, immunotherapy has been a research hotspot across major tumor types. In COAD, studies of the high-level microsatellite instability (MSI-H) population have been performed successively since 2015. The results of the Keynote 016, Keynote 164, Checkmate 142, and NICHE clinical trials all indicate the extraordinary efficacy of immunotherapy [36-39]. Patients with MSI-H have a better prognosis than those with microsatellite stability (MSS). However, the MSI-H population accounts for only approximately 10% of COAD, and most patients still lack an effective prognostic indicator. Thus, new prognostic biomarkers are urgently needed to predict the survival of patients with colon adenocarcinoma.
To make the prognostic prediction robust, we adopted an analysis method that does not need to account for technical bias between different platforms. The newly established prognostic model is based on ranking and pairwise comparison of relative gene expression values; thus, data preprocessing, such as scaling and normalization, is not required. This method has produced reliable results in many studies [40, 41].
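The reason that scaling and normalization become unnecessary can be shown with a short sketch. It assumes that platform differences act on a sample as an order-preserving transformation (for example, a positive rescaling or a log transform); the values, the helper name first_is_lower, and the simulated "platforms" are hypothetical.

```python
import math


def first_is_lower(expr, gene_a, gene_b):
    """Return 1 if gene_a is expressed at a lower level than gene_b, else 0."""
    return 1 if expr[gene_a] < expr[gene_b] else 0


# Hypothetical raw measurements for one sample.
raw = {"CXCL14": 35.0, "IL17RB": 120.0}

# Two simulated views of the same sample: a rescaled version and a
# log-transformed version. Both preserve the order of the two genes.
rescaled = {gene: value * 0.01 for gene, value in raw.items()}
logged = {gene: math.log2(value + 1.0) for gene, value in raw.items()}

results = [
    first_is_lower(expr, "CXCL14", "IL17RB")
    for expr in (raw, rescaled, logged)
]
print(results)
```

The comparison gives the same answer for all three versions, which is why a pair-based score does not depend on the measurement scale. The argument holds only for transformations that keep the within-sample order of the two genes.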
In this study, we identified an immune-related gene pair model to predict overall survival in colon adenocarcinoma. The prognostic model comprises 17 immune-related gene pairs containing 26 unique immune-related genes (IRGs). Most genes in this model are cytokines and cytokine receptors, which play a vital role in the adaptive immune response. Among these IRGs, evidence supports that the overexpression of IL17RB can enhance the invasion and metastasis of thyroid cancer cells [42]. STC2 overexpression is associated with a poor prognosis in patients with nasopharyngeal carcinoma (NPC) and can be used as a predictor of NPC responses to radiation [43]. The increase in IL-7 in colorectal cancer (CRC) is related to metastatic disease and tumor location [44]. Decreased CXCL14 expression indicates a poor prognosis and causes metastasis in colon cancer [45]. GRP signaling alters the invasion of colon cancer through heterochromatin protein 1Hsβ and can improve the prognosis of patients with colon cancer [46]. Moreover, regulatory T cells (Tregs) and M0 macrophages are related to the poor clinical prognosis of many patients with cancer [47, 48]. Dendritic cells are associated with cancer immunity and a favorable prognosis [49]. At the same time, the immune cell types M0 macrophages, M1 macrophages, monocytes, neutrophils, CD8 T cells and follicular helper T cells in the high-risk group of GSE39582 are all related to tumor progression and poor prognosis [50-53]. These findings are consistent with our results. In this study, we also found that several expression signatures of genetic perturbations, such as increased stem cells, increased breast cancer ductal invasion, a multicancer invasiveness signature, increased advanced vs early gastric cancer and increased mammary stem cells, were related to the IRGP model. These results were verified by corresponding experiments [54-58], confirming their importance in tumor development and cell growth. These findings indicate that the IRGP model may play an essential role in tumor invasiveness and progression in COAD.
This study differs from previously published studies [59] in several respects. First, the IRGP model was established based on the TCGA database. Second, our strategy for establishing the prognostic model was different: to screen out immune-related gene pairs that are significantly related to overall survival (OS) in patients with colon cancer, we used univariate Cox regression analysis before determining the final model using Lasso regression analysis. Finally, we conducted GSEA in the training and validation cohorts to further analyze the specific differences between the high- and low-risk groups. We found that the high-risk group genes were significantly enriched in tumor cell invasion and growth.
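Once a model of this kind has been fitted, assigning a patient to a risk group could look like the following sketch. It assumes that the risk score is the weighted sum of the 0/1 gene-pair scores, with weights standing in for the coefficients of the Lasso-penalized Cox model, and that a fixed cutoff separates the groups. The pair labels, coefficient values, and cutoff are hypothetical and are not results of this study.

```python
def risk_score(pair_scores, coefficients):
    """Weighted sum of 0/1 gene-pair scores.

    The weights are placeholders for Lasso Cox coefficients (assumption).
    """
    return sum(coefficients[pair] * pair_scores[pair] for pair in coefficients)


def risk_group(score, cutoff):
    """Label a patient as high or low risk relative to an assumed cutoff."""
    return "high" if score > cutoff else "low"


# Hypothetical coefficients for three gene pairs retained by the model.
coefficients = {"pair_A": 0.42, "pair_B": -0.31, "pair_C": 0.18}

# Hypothetical gene-pair scores for one patient.
patient = {"pair_A": 1, "pair_B": 0, "pair_C": 1}

score = risk_score(patient, coefficients)
group = risk_group(score, cutoff=0.25)
print(round(score, 2), group)
```

In practice the cutoff would be chosen on the training cohort and then applied unchanged to the validation cohorts; how it was chosen here is not specified in this section.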
Similar to all RNA-seq and microarray analyses, our study had limitations. First, the training dataset used to build the immune model was obtained from a retrospective study that included fresh frozen samples; the stability and performance of the model in formalin-fixed and paraffin-embedded (FFPE) samples remain uncertain. Therefore, it may be necessary to add more datasets with different sample attributes for more extensive verification. Second, because the prognostic model was based on TCGA and other databases, applying it requires proficiency in bioinformatics. Additionally, gene expression profiling by RNA-seq or microarray platforms is costly and has a long turnaround time. Therefore, this method is challenging to popularize in daily clinical practice.