Discussion
Our study constructed a functional network of COAD based on sample-specific network theory. The results showed that the nodes of the functional network, which we denote functional genes, have potential roles in discriminating tumor samples from normal samples, in COAD subtyping, and in prognosis. The edges of the functional network, which we call functional interactions, could serve as prognostic biomarkers for COAD.
Enrichment analysis of the 1063 functional genes revealed key biological processes and pathways that could play roles in the pathogenesis and progression of cancer (Figure S6). Among the top 5 most enriched GO terms (Figure S6A), rRNA processing was the most enriched and involved 42 functional genes that were upregulated in COAD compared with normal samples. Upregulation of rRNA processing genes has been linked to CRC, where it can lead to overproduction of mature ribosomal structures [26]. The next three most enriched GO terms, “negative regulation of transcription from RNA polymerase II promoter”, “positive regulation of transcription from RNA polymerase II promoter”, and “positive regulation of transcription, DNA-templated”, play important roles in regulating transcription. The term “G1/S transition of mitotic cell cycle” contained 25 functional genes, 23 of which were significantly upregulated in COAD, including CDK1, CDK2, CDK4, CDK6, and CDK7; their upregulation can cause uncontrolled proliferation, and they may serve as promising targets in cancer therapy [27]. The top 5 most enriched KEGG pathways are shown in Figure S6B. Among them, pathways in cancer, proteoglycans in cancer, and cell cycle are associated with cancer. Deregulation of the Hippo signaling pathway has been found in CRC, and the interaction between Hippo and Wnt signaling plays a crucial role in CRC development [28]. The Wnt signaling pathway plays important roles in CRC and could be a potential target of revolutionary therapeutic treatments for CRC [29]. The literature therefore supports the importance of the five most enriched GO terms and KEGG pathways of the 1063 functional genes.
Furthermore, the 1063 functional genes were enriched in five known cancer gene sets, namely the curated gene sets for pathways in cancer, colorectal cancer, the Cancer Gene Census, pan-cancer driver genes, and cancer driver genes, which further suggests that the 1063 functional genes play important roles in COAD.
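As a rough illustration of how over-representation of a gene list in a known gene set can be assessed, the sketch below computes a one-sided hypergeometric p-value. The text does not state which test was used, so the choice of test, the function name `hypergeom_enrichment_p`, and the toy counts are assumptions for illustration and are not the study's data.

```python
from math import comb


def hypergeom_enrichment_p(background: int, set_size: int,
                           selected: int, overlap: int) -> float:
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `background` genes, of which `set_size` belong to the gene set."""
    max_overlap = min(set_size, selected)
    total = comb(background, selected)
    # Sum the probability mass of every overlap at least as large as observed.
    tail = sum(
        comb(set_size, k) * comb(background - set_size, selected - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts (not from the study): 20 background genes, 5 of them in
# the gene set, 4 selected genes, 3 of which fall in the gene set.
p = hypergeom_enrichment_p(background=20, set_size=5, selected=4, overlap=3)
print(f"p = {p:.4f}")
```

In practice the p-values from several gene sets would also need a multiple-testing correction, which this sketch leaves out.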
Literature searches were conducted to further investigate the functions of the 20 functional genes with the highest node degree; 11 of them were found to be related to CRC (Table S8). Four of these 11 genes (CCND1, WNT2, MET, and HDAC2) were contained in the five known cancer gene sets (Table S8). Specifically, cyclin D1 (CCND1) polymorphisms were associated with CRC [30]; WNT2, a member of the WNT gene family, is involved in a signaling pathway that can promote colorectal cancer progression [31]; MET (MET Proto-Oncogene) may act as a prognostic biomarker for CRC [32]; and HDAC2 (Histone Deacetylase 2) was found to be a potential target in CRC [33]. The literature searches also showed that our method could identify new biomarkers not contained in the five known cancer gene sets. For example, UBE2I, the small ubiquitin-like modifier (SUMO) E2 ligase, was reported to be a critical factor in sustaining the transformed growth of KRAS mutant colorectal cancer cells, which suggests that UBE2I could be a drug target for the treatment of KRAS mutant colorectal cancers; the LIM protein JUB was reported as a novel target for the therapy of metastatic CRC because it is a tumor-promoting gene that can promote epithelial-mesenchymal transition (EMT) [34]; ubiquitin-conjugating enzyme E2S (UBE2S) was reported as a potential target for CRC therapy because it plays an important role in determining the malignant properties of human CRC cells [35]; the atypical cyclin CNTD2, which can promote colon cancer cell proliferation and migration, was reported as a new prognostic factor and drug target for CRC [36]; TRIB3 (Tribbles Pseudokinase 3) may act as a prognostic biomarker for CRC [32]; BOP1 (BOP1 Ribosomal Biogenesis Factor) is responsible for colorectal tumorigenesis [37]; and GTPBP4 (GTP Binding Protein 4) is involved in the metastasis of CRC [38]. These results indicate that the identified functional genes include not only known cancer genes but also other important genes related to CRC.
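The ranking of genes by node degree mentioned above can be sketched as follows. The edge list uses placeholder gene labels (`G1` to `G4`), and the function name `top_degree_nodes` and the value of `k` are hypothetical; nothing here reproduces the actual functional network.

```python
from collections import Counter


def top_degree_nodes(edges, k):
    """Return the k (node, degree) pairs with the highest degree
    in an undirected edge list."""
    # Treat (a, b) and (b, a) as the same edge and ignore self-loops.
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree = Counter()
    for edge in unique_edges:
        for node in edge:
            degree[node] += 1
    # Sort by descending degree, then by name so that ties are deterministic.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))[:k]


# Placeholder edges for illustration only; the last pair duplicates G2-G3.
toy_edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3"), ("G3", "G2")]
print(top_degree_nodes(toy_edges, k=2))
```

Breaking ties alphabetically is an arbitrary choice made here to keep the output reproducible.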
Gene expression data can be used to classify tumor and normal samples, which may suggest targeted therapy options. We classified COAD tumor and normal samples using the expression data of the functional genes. The 185 core functional genes discriminated tumor from normal samples with high prediction accuracy in both the TCGA dataset and an independent validation dataset, which suggests that the functional genes are potential diagnostic biomarkers for COAD.
Six subtypes of COAD were detected by consensus clustering of the expression profiles of the 185 core functional genes: subtype c1 (n = 38), subtype c2 (n = 138), subtype c3 (n = 99), subtype c4 (n = 85), subtype c5 (n = 38), and subtype c6 (n = 56).
For subtype c1, 318 DEGs (161 up-regulated and 157 down-regulated) were identified; they were enriched in many important pathways, such as DNA replication, cell cycle, mismatch repair, and the p53 signaling pathway, which suggests that subtype c1 has an abnormal cell cycle process and a dysregulated p53 signaling pathway. In addition, subtype c1 was characterized by frequent copy loss of TCF7L2, which was recently reported to promote migration and invasion of human colorectal cancer cells [39]. Frequent copy loss of RPL18P1 was also found in subtype c1 and could likewise play an important role. Consequently, our findings suggest focusing on the cell cycle, p53, TCF7L2, and RPL18P1 when searching for therapeutic drugs for subtype c1.
For subtype c2, 120 DEGs (101 up-regulated and 19 down-regulated) were detected as representative genes, which were enriched in the mTOR signaling pathway and the MAPK signaling pathway. Both are among the most frequently implicated cellular pathways in cancer. In addition, Todd M.P. et al. demonstrated that the combination of a PI3K/mTOR inhibitor and a MAPK inhibitor can enhance anti-proliferative effects against CRC cell lines [40], and Wang H. et al. reported that targeting mTOR suppresses colon cancer growth [41], which suggests that mTOR and MAPK could be therapeutic targets for subtype c2.
For subtype c3, 139 DEGs (108 up-regulated and 31 down-regulated) were identified as representative genes, which were enriched in spliceosome, antigen processing and presentation, the estrogen signaling pathway, and the mRNA surveillance pathway. The spliceosome pathway was reported as a target for anticancer treatment [42] and displayed phase-shifted circadian expression in CRC [43]. Downregulated antigen processing and presentation were reported in CRC [44]. The estrogen signaling pathway was reported as a target for colorectal cancer [45]. The mRNA surveillance pathway detects and degrades abnormal mRNAs; nonsense-mediated mRNA decay (NMD), one of the mRNA surveillance pathways, has been reported as a target for colorectal cancers with microsatellite instability [46]. In addition, many patients in subtype c3 showed the MSI-H feature and had frequent OBSCN, MYCBP2, RYR2, and TTN mutations. MSI-H was reported to be a potential prognostic and therapeutic factor for COAD [47], which suggests that it could play an important role for the MSI-H patients in subtype c3. OBSCN, RYR2, and TTN mutations, which have been reported as drivers [48], could be biomarkers for subtype c3. Moreover, MYCBP2 was reported as a potential therapeutic target for CRC [49], which could inform the treatment of subtype c3. Therefore, for patients in subtype c3, spliceosome, antigen processing and presentation, the estrogen signaling pathway, NMD, MSI-H, OBSCN, MYCBP2, RYR2, and TTN could be potential therapeutic targets.
For subtype c4, 82 DEGs (6 up-regulated and 76 down-regulated) were found as representative genes, which were enriched in viral carcinogenesis and spliceosome, among others. Viral carcinogenesis is a factor that induces DNA damage and virus integration [50] and may be involved in the etiology of CRC [51]. Hence, viral carcinogenesis and spliceosome could be potential targets for subtype c4.
For subtype c5, 181 DEGs (51 up-regulated and 130 down-regulated) were detected and were enriched in the RIG-I-like receptor signaling pathway, FoxO signaling pathway, autophagy-animal, insulin signaling pathway, toxoplasmosis, and focal adhesion, among others. Among the enriched pathways, RIG-I-like receptor signaling plays important roles in colon cancer [52]; the FoxO signaling pathway has been reported as a therapeutic target in cancer [53]; autophagy was reported as a promising target for CRC [54]; and the insulin signaling pathway could be a potential target for CRC therapy [55]. Consequently, these pathways could be targets for subtype c5.
For subtype c6, 163 DEGs (22 up-regulated and 141 down-regulated) were identified as representative genes and were enriched in glycosaminoglycan biosynthesis-heparan sulfate, tight junction, circadian rhythm, and ECM-receptor interaction, among others. Glycosaminoglycans have therapeutic value in cancer [56]; the tight junction protein claudin-2 has been reported as a potential target for CRC therapy [57]; circadian rhythm plays roles in the pathogenesis of CRC [58]; and ECM-receptor interaction may play a critical role in CRC metastasis [59]. In addition, copy loss of ARHGEF28, BIN2P2, and SLC25A5P9 was frequent in subtype c6, which suggests that they may be potential biomarkers. Therefore, glycosaminoglycans, claudin-2, circadian rhythm, ECM-receptor interaction, ARHGEF28, BIN2P2, and SLC25A5P9 could inform the treatment of subtype c6.
Taken together, these findings suggest that distinct subtypes of COAD could be treated with specific targeted therapies (Table 1).
Among the 12 functional genes associated with the prognosis of COAD, high expression of TPM2, STMN2, CHMP4C, DUSP14, and GRIA3 was associated with poorer survival, whereas low expression of WDR1, CPT2, KDM1A, NFE2L1, TBL3, TGFBR3, and FGFR2 was associated with worse survival. Some of the 12 functional genes have been linked to COAD or other diseases in existing research. For example, TPM2 was reported to be implicated in CRC [60]; STMN2 might be involved in beta-catenin/TCF-mediated carcinogenesis in human hepatoma cells [61]; CHMP4C was identified as a novel molecular target gene for ovarian cancer [62]; GRIA3 may act as a mediator of tumor progression in pancreatic cancer [63]; WDR1 was reported as a therapeutic target in lung cancer [64]; CPT2 was identified as a potential diagnostic biomarker of colon cancer [65]; somatic deletion of KDM1A plays a role in advanced colorectal cancer stages [66]; NFE2L1, also called Nrf1, was found to be associated with high-risk diffuse large B cell lymphoma [67]; Gatza et al. reported that TGFBR3 promotes colon cancer progression [68]; and FGFR2 was shown to promote gastric cancer progression [69]. Therefore, the 12 functional genes probably play important roles in COAD and could be potential prognostic biomarkers for COAD. There was no obvious correlation among the expression levels of the 12 genes (Figure S7). To find the best combination, we performed LASSO Cox regression on the 12 functional genes to select the most informative gene set for prognosis (Figure S8). Seven functional genes (CHMP4C, WDR1, CPT2, DUSP14, NFE2L1, TBL3, and TGFBR3) were eventually selected. In the Cox analysis, the best model with the seven genes had a p-value of 3.00 × 10⁻⁴, which was better than that of the single-gene models.
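A common way to compare survival between a high-expression and a low-expression group is the two-group log-rank test. The text does not say how the groups were defined or which test produced the reported associations, so the sketch below is only an illustration under that assumption; the function name `logrank_test` and the toy follow-up data are hypothetical.

```python
from math import erfc, sqrt


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. `events_*` are 1 for an observed death and
    0 for censoring. Returns (chi-square statistic, p-value) with 1 df."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e == 1})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        at_risk = [row for row in data if row[0] >= t]
        n = len(at_risk)                                   # all subjects at risk
        n_a = sum(1 for row in at_risk if row[2] == 0)     # at risk in group A
        deaths = [row for row in at_risk if row[0] == t and row[1] == 1]
        d = len(deaths)                                    # deaths at time t
        d_a = sum(1 for row in deaths if row[2] == 0)      # deaths in group A
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the group A death count at time t.
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))


# Toy follow-up times (arbitrary units), all events observed; not study data.
chi2, p = logrank_test([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1])
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

For real analyses an established survival library would be preferable, since it also handles stratification and reports confidence intervals.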
The 13 functional interactions that could be potential prognostic biomarkers offer a new perspective for cancer prognosis. LASSO Cox regression was also performed on the 13 functional edges, and seven functional interactions (ESR1_E2F1, ARRDC4_HECTD3, SPTBN2_SPTAN1, SOX9_UBE2I, CBX8_HOXA9, PPM1G_STMN2, and E2F1_KDM1A) were selected as the most informative edge set for prognosis, with a p-value of 4.00 × 10⁻⁶. Notably, Narayanan S.P. et al. found that KDM1A plays a role in cell proliferation by regulating the E2F1 signaling pathway in oral cancer [70], and the interaction of CBX8 with HOXA9 was found to play an important role in MLL-AF9-induced leukemogenesis [71], which suggests that these interactions may also play important roles in COAD.
The main limitations of this study are as follows: the biomarkers and subtypes detected here need to be validated with more external datasets and biological experiments, and the role of the functional network as a whole needs to be further explained.