Background
Malignant tumors are characterized by unlimited proliferation, evasion of growth suppression, and evasion of immune destruction [
1], all of which are pathogenically related to the tumor microenvironment (TME) [
2]. A healthy microenvironment inhibits carcinogenesis and metastasis, whereas a cancerous microenvironment may promote neoplastic development [
3]. The TME consists of a complex network of various intracellular and extracellular components that play an indispensable role in cancer development and progression.
As components of the TME, immune and inflammatory cells have been shown to be closely associated with carcinogenesis. Inflammation has also been reported to be an important risk factor contributing to cancer development [
4‐
6]. It is thought that chronic inflammation, tumor-related inflammatory responses, and inflammation in the tumor environment in the context of intestinal dysfunction contribute to the carcinogenesis of intestinal malignancies [
7‐
9]. Immuno-inflammatory cell dynamics persist at the site of chronic inflammation, which has been proposed as the cradle for cancer development and progression [
6,
10,
11]. Therefore, the association between inflammation and immune cells may reflect both carcinogenesis and patient prognosis [
12].
The role of immune cell infiltration, and of the differentially expressed genes associated with it, in remodeling the colorectal cancer microenvironment has been of growing interest in the medical and scientific communities. To gain a more fundamental understanding of the molecular mechanism of TME remodeling in colon cancer progression, we propose here a computational bioinformatics pipeline for identifying candidate genes with regulatory functions in tumorigenesis from The Cancer Genome Atlas (TCGA).
Methods
Working samples
RNA-seq transcriptome data for 524 colon cancer samples, including 42 normal samples and 482 tumor samples, together with the corresponding clinicopathological information, were downloaded from the TCGA database (
https://portal.gdc.cancer.gov/). We employed the ESTIMATE algorithm to calculate the ImmuneScore, StromalScore, and ESTIMATEScore for each sample.
Survival analysis
After sorting the clinical data downloaded from the TCGA database, complete survival information was obtained for 455 cases, with survival times ranging from 0 to 12 years. The Kaplan–Meier method was used to plot survival curves, and a log-rank test was used to compare survival between the two groups. A p value < 0.05 was considered statistically significant.
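As an illustration of the Kaplan–Meier method named above, the following minimal Python sketch computes the product-limit survival estimate from (time, event) pairs. It is not the code used in the study, which was run in R; the function name `kaplan_meier` and the toy data are hypothetical, and the log-rank comparison is not shown.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct death time (product-limit estimate).

    times  -- follow-up time for each patient
    events -- 1 if death was observed at that time, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # deaths and censorings both leave the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            # Multiply by the conditional probability of surviving past t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical toy data: follow-up in years, 1 = death observed, 0 = censored
toy_times = [1, 2, 2, 3, 5]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
```

Censored patients reduce the number at risk without producing a step in the curve, which is why the estimate only changes at observed death times.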
Differential expression analysis
All patients were divided into high- and low-score groups based on the median values of the ImmuneScore and StromalScore. The linear models for microarray data (limma) package was then used for differential gene expression analysis. In comparing the two groups, genes with an absolute log2 fold change greater than 1 and a p value below 0.05 after false discovery rate (FDR) correction were considered differentially expressed. The differentially expressed genes were plotted as heat maps using the Heat map package of R software.
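The grouping and thresholding steps described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names (`split_by_median`, `benjamini_hochberg`, `select_degs`) and the toy values are hypothetical, the Benjamini-Hochberg procedure is assumed as the FDR correction (the text does not name the method), samples exactly at the median are assigned to the low group by assumption, and the moderated statistics computed by limma are not reproduced.

```python
from statistics import median


def split_by_median(scores):
    """Label each sample 'high' or 'low' relative to the median score.

    Assumption: samples exactly at the median are placed in the 'low' group.
    """
    cutoff = median(scores.values())
    return {s: ("high" if v > cutoff else "low") for s, v in scores.items()}


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_degs(results, lfc_cutoff=1.0, fdr_cutoff=0.05):
    """results: {gene: (log2_fold_change, raw_p)} -> sorted genes passing both cutoffs."""
    genes = list(results)
    fdr = benjamini_hochberg([results[g][1] for g in genes])
    return sorted(
        g for g, q in zip(genes, fdr)
        if abs(results[g][0]) > lfc_cutoff and q < fdr_cutoff
    )


# Hypothetical toy inputs (placeholder sample and gene names)
toy_scores = {"S1": 10.0, "S2": 20.0, "S3": 30.0, "S4": 40.0}
toy_groups = split_by_median(toy_scores)
toy_results = {
    "GENE_A": (2.3, 0.001),
    "GENE_B": (0.4, 0.002),
    "GENE_C": (-1.8, 0.03),
    "GENE_D": (1.5, 0.6),
}
toy_degs = select_degs(toy_results)
```

In the toy input, GENE_B fails the fold-change cutoff and GENE_D fails the FDR cutoff, so only genes passing both criteria are retained.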
GO and KEGG enrichment analysis
The genes obtained through the differential expression analysis were further analyzed with the R software using the clusterProfiler, enrichplot, and ggplot2 packages to identify those that were significantly enriched [
13]. Significance thresholds were set at 0.05 for both the p and q values.
Differential analysis of scores with clinical stages
Clinicopathological data of the colon cancer samples were obtained from TCGA and analyzed with the R software. A Wilcoxon rank-sum or Kruskal–Wallis rank-sum test was used to assess significance.
Construction of PPI network
The STRING database was used to predict a protein-protein interaction (PPI) network, which was reconstructed with the Cytoscape v3.6.1 software. Only nodes whose interaction confidence exceeded 0.95 were used to construct the network.
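The confidence filter described above can be sketched as follows. The edge tuples, gene labels, and scores are hypothetical placeholders rather than STRING output, and the helper names `filter_edges` and `node_degrees` are illustrative only.

```python
def filter_edges(edges, min_confidence=0.95):
    """Keep interactions whose confidence score exceeds the cutoff."""
    return [(a, b, s) for a, b, s in edges if s > min_confidence]


def node_degrees(edges):
    """Count how many retained interactions each node takes part in."""
    degrees = {}
    for a, b, _ in edges:
        degrees[a] = degrees.get(a, 0) + 1
        degrees[b] = degrees.get(b, 0) + 1
    return degrees


# Hypothetical edges: (gene, gene, confidence score between 0 and 1)
toy_edges = [
    ("GENE_A", "GENE_B", 0.99),
    ("GENE_A", "GENE_C", 0.96),
    ("GENE_B", "GENE_D", 0.90),
    ("GENE_C", "GENE_D", 0.95),
]
kept_edges = filter_edges(toy_edges)
kept_degrees = node_degrees(kept_edges)
```

Because the comparison is strict, an edge scoring exactly 0.95 is dropped, and nodes left without any retained edge (GENE_D here) fall out of the network.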
Cox regression analysis
Univariate Cox regression was performed with the R software. The top 24 genes, ranked by ascending Cox regression p value, were plotted.
Gene set enrichment analysis
The KEGG pathway gene set (C2.cp.kegg.v7.1.symbols.gmt) was acquired from the Molecular Signatures Database (MSigDB) as the target set. Whole transcriptomes of all tumor samples underwent gene set enrichment analysis (GSEA) using the gsea-3.0 software from the Broad Institute. Gene sets with NOM p < 0.05 and FDR q < 0.06 were carried forward to the next round of analyses.
Immune cell infiltration
The CIBERSORT computational algorithm was applied for estimating the abundance of immune cell infiltration in all tumor samples. Candidate tumor samples with p < 0.05 were identified for more detailed analysis.
Statistical analysis
All statistical analyses were performed with the R software (version 3.5.2). A Student’s t-test was used to compare the differences between the two variables and a two-tailed p < 0.05 was considered statistically significant.
Discussion
The relationship between immune cell infiltration and cancer development has been widely reported in literature [
14‐
16]. However, the correlation between immune cell infiltration and tumor prognosis remains controversial for colorectal cancer. Studies have reported both positive and negative associations between tumor-associated neutrophils and poor prognosis in colon cancer [
17,
18]. Some studies have shown that regulatory T cells (Tregs) are associated with a better prognosis in colorectal cancer (CRC) [
18,
19], while others have identified Tregs as a risk factor for CRC [
20,
21]. Other immune cells have also been reported [
22,
23]. Moreover, some studies reported that adipocytes among the stromal cells can induce epithelial-mesenchymal differentiation of tumors, subsequently promoting tumor metastasis [
24]. By contrast, some factors secreted by stromal cells might also regulate tumor cell metastasis, apoptosis, and other processes [
25,
26]. However, our study did not demonstrate an association of stromal cells with prognosis, stage, or TNM classification in CRC patients, and it also did not support an impact of immune cells on the tumor microenvironment. Ye L et al. performed a similar analysis with 1008 colon cancer samples from both the TCGA and GEO databases and suggested an association between immune cell infiltration and the prognosis of colon cancer [
22]. Possible explanations for this inconsistency include differences in database selection and sample size. In addition, immune and stromal cells comprise numerous cell types, so the influence of the immune microenvironment on prognosis may vary among these cell types. More importantly, colorectal cancer cells are themselves diverse and might respond differently to any given immune microenvironment.
In addition to survival analysis, TNM classification is commonly applied in the clinic to assess tumor progression. Some studies have suggested that this classification does not account for immune status, so it may not be an effective predictor of response to treatment [
27]. Nevertheless, some studies have also found that tumor-associated neutrophils, regulatory T cells, and tumor-associated macrophages are associated with undifferentiated colorectal cancer with advanced TNM classification [
28,
29]. Our study found that the degree of immune cell infiltration is a strong biomarker for distinguishing stage II from stage III/IV, pN0 from pN1, and pM0 from pM1. We also found that even though the immune microenvironment is influenced by both stromal and immune cells, no difference existed in the pT, pM, and pN classifications, except between stage II and stage IV. This suggests that components of the tumor microenvironment have distinct and specialized functions.
Changes in the tumor microenvironment are determined by genes [
30]. To identify the genes associated with remodeling of the cancerous microenvironment during colon cancer progression, we propose here a pipeline that combines several available bioinformatics tools. This pipeline analyzes the differentially expressed genes that track with differences in the infiltration of stromal and immune cells. In short, GO enrichment analysis coupled with KEGG enrichment analysis identifies the genes relevant to immune-related factors in the immune microenvironment of colorectal cancer. Then, STRING coupled with Cytoscape assembles the involved genes into a PPI network, which is combined with univariate Cox regression analysis to predict the most likely candidate genes.
We first analyzed the GO data and found that most of the differentially expressed genes were related to the activation of T cells, migration of leukocytes and regulation of lymphocytes. Through KEGG enrichment analysis, the candidate genes were mainly related to the cytokine receptor interaction and chemokine signaling pathway, which are also immune-related pathways. Taken together, the regulatory role of genes in the immune microenvironment was confirmed. This finding is also consistent with the views of David Tamborero and other scholars [
30]. Finally, the PPI network and univariate Cox regression analysis were used to identify two genes, TGFB1 and SERPINE1. The comparison between cancer and normal samples showed no significant difference in TGFB1 expression; by contrast, SERPINE1 expression differed significantly between cancer and normal samples, although it had no significant effect on prognosis [
31]. The inhibitory effect of SERPINE1 expression on tumor cell apoptosis has been previously reported [
32,
33]. However, the relationship between SERPINE1 and immunity has been studied much less. We used GSEA to analyze the relationship between SERPINE1 expression and cancerous pathways and found that high SERPINE1 expression can promote tumor and immune-related pathway activation. This suggests that SERPINE1 can influence the occurrence and development of colon cancer by regulating the tumor immune microenvironment.
SERPINE1, known as the Serine Protease Inhibitor family E member 1 or plasminogen activator inhibitor-1 (PAI-1), has been proposed as the key player for carcinogenesis and poor prognosis [
32‐
34]. In previous studies, SERPINE1 promoted peripheral neo-angiogenesis, regulated endothelial homeostasis, and interacted with inflammatory factors [
33,
35,
36], suggesting that SERPINE1 may be related to the tumor microenvironment. However, the role of SERPINE1 in the tumor microenvironment with immune-related cells has not been reported in previous studies.
The role of SERPINE1 in the process of skin fibrosis has also been reported [
37]. Studies have shown that SERPINE1 plays multiple critical roles as a mediator of infiltration, adhesion, and activation of mast cells and fibroblasts in fibrogenesis. In the process of renal fibrosis, the decrease of SERPINE1 expression is also associated with the decrease of neutrophils and macrophages [
38,
39], suggesting that SERPINE1 can act as a chemokine that interacts with other immune cells. Therefore, to test whether SERPINE1 can act on other immune cells in the tumor immune microenvironment, CIBERSORT was used to assess the relationship between SERPINE1 expression and immune cell infiltration. Here, we identified 10 immune cell types with the most obvious differences and further analyzed the correlation between the proportions of 11 tumor-infiltrating immune cell types and SERPINE1 expression. By taking the intersection of these two groups, we finally identified eight immune cell types, including neutrophils, mast cells, and macrophages, which have been reported in other diseases. This group also contained CD8 T cells, gamma delta T cells, NK cells, and dendritic cells.
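The intersection step described above amounts to a set operation, sketched here in Python. The cell-type labels are hypothetical placeholders and do not reproduce the study's results; `shared_cell_types` is an illustrative helper name.

```python
def shared_cell_types(differential, correlated):
    """Return cell types found by both analyses, sorted for stable output."""
    return sorted(set(differential) & set(correlated))


# Hypothetical placeholder labels, not the cell types reported in the study
differential_hits = ["cell_type_1", "cell_type_2", "cell_type_3"]
correlated_hits = ["cell_type_2", "cell_type_3", "cell_type_4"]
shared = shared_cell_types(differential_hits, correlated_hits)
```

Requiring a cell type to appear in both the difference analysis and the correlation analysis is a conservative choice: a type supported by only one of the two analyses is excluded.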