To further narrow the genes screened, we used data from GEO, TCGA, HPA, and KEGG to perform a systems-level analysis of 71 mutated genes (Fig.
1c), aiming to identify novel markers that affect CRC initiation, progression, and patient survival. We hypothesized that genes enriched in normal tissue may be tissue-specific and that their loss could promote CRC tumorigenesis. To find colorectal-enriched genes, we analyzed gene expression levels in normal tissues using the HPA and TCGA databases. A gene was defined as colorectal-enriched in this study if its expression in normal colon or rectum ranked within the top 7 of 37 tissues. Eleven genes were not detected in colon or rectum, whereas
BMP5,
REP15,
ATP8B1,
ELF3, and
RASSF6 were relatively highly expressed in colorectal tissue and significantly downregulated in colon or rectum adenocarcinomas (Additional file
1: Table S8). We next analyzed genes differentially expressed during early events of CRC (normal tissue to adenoma or polyp) using GEO datasets (GSE8671, GSE71187, and GSE41258). Thirteen genes were differentially expressed in at least 2 series, of which 7 were downregulated (
BMP5, PDE2A, ZNF175, CTSA, SVEP1, ATP6V0D2, and
AHNAK) (Additional file
1: Table S9). Kaplan-Meier survival analysis identified 33 genes that may affect patient survival, including two favourable prognostic markers (
REP15,
ATP8B1) and one unfavourable prognostic marker (
GPSM1) reported in the HPA database [
11]. In addition, high expression of
BMP5 and
RASSF6 was correlated with longer patient survival (Additional file
1: Table S10). To find disease-related genes, we further queried the KEGG database together with the literature; 19 genes were involved in pathway networks, including
APC, TCF7L2 (Wnt signaling),
BMP5 (TGFβ signaling), and
RASSF6 (Hippo signaling) (Additional file
1: Table S11). Taken together,
BMP5 satisfied all of the criteria tested in our multi-omics analysis. We confirmed that
BMP5, a gene that has not previously been investigated in sCRC, was the top-ranked gene in our study. In the cBioPortal database, 30.4% of
BMP5-mutated samples carried no
APC mutation, and 17.4% of
BMP5-mutated samples carried no
APC, KRAS, or TP53 mutation. It is well established that most sCRCs arise through the chromosomal instability pathway, which is characterized by
APC mutations. This result suggests that, in addition to well-studied driver genes, alteration of
BMP5 may play a role in the oncogenesis of sCRC.
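Two of the screening rules described above can be sketched in code: the rank-based definition of colorectal enrichment (expression in colon or rectum within the top 7 of 37 tissues) and the requirement that a gene be differentially expressed in at least 2 GEO series. The Python below is an illustrative sketch only, not the pipeline used in the study. The function names, the input layout (a mapping from tissue name to expression value), and the rule that undetected genes never count as enriched are all assumptions made for illustration.

```python
from collections import Counter


def is_colorectal_enriched(expression_by_tissue, top_n=7, targets=("colon", "rectum")):
    """Return True if colon or rectum ranks within the top_n tissues by expression.

    expression_by_tissue is a hypothetical mapping {tissue name: expression value}.
    """
    # Order tissues from highest to lowest expression.
    ranked = sorted(expression_by_tissue, key=expression_by_tissue.get, reverse=True)
    top = set(ranked[:top_n])
    # Assumption: a gene that is not detected (expression of 0) is never enriched.
    return any(t in top and expression_by_tissue[t] > 0 for t in targets)


def recurrent_genes(series_hits, min_series=2):
    """Return genes reported as differentially expressed in at least min_series datasets.

    series_hits is a hypothetical mapping {series ID: iterable of gene symbols}.
    """
    # Count each gene at most once per series.
    counts = Counter(gene for genes in series_hits.values() for gene in set(genes))
    return {gene for gene, n in counts.items() if n >= min_series}
```

A candidate would then pass the screen only if it satisfies each criterion, for example by intersecting the set of enriched genes with the set returned by `recurrent_genes`.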
Subsequent Sanger sequencing in an expanded cohort showed that
BMP5 was mutated in 7.7% of patients and that 37.5% of these mutations were LoF. The distribution of the 8 mutations identified in BMP5 is shown in Fig.
1d, Table
1, and Additional file
2: Figure S2. Notably, we also found a missense mutation, p. D183G, in patient In-2. All missense mutations identified were predicted to be possibly damaging and pathogenic by the PolyPhen-2 and FATHMM-MKL algorithms [
12,
13]. The affected residues of BMP5 are highly conserved evolutionarily (Additional file
2: Figure S3), suggesting that these mutations are rare in normal samples but highly penetrant in sCRC. Examination of publicly available databases revealed that
BMP5 mutation is also found in several other tumor types (Fig.
1e). The frequency of truncating BMP5 mutations was highest in CRC, whereas copy number amplification was found in all other tumor types but not in CRC. These results show that the pattern of BMP5 alteration in CRC differs from that in other tumor types.
Table 1
BMP5 somatic mutations identified in exome sequencing and expanded deep sequencing cases
In-1 | NONSENSE | 2 | Chr6:55684616 | C > T | p. R174* |
In-2 | MISSENSE | 2 | Chr6:55684588 | A > G | p. D183G |
Ex-50 | NONSENSE | 2 | Chr6:55684601 | G > T | p. E179* |
Ex-36 | MISSENSE | 2 | Chr6:55684504 | A > T | p. N211I |
Ex-92 | MISSENSE | 2 | Chr6:55684467 | G > T | p. K223D |
Ex-93 | NONSENSE | 4 | Chr6:55638913 | C > T | p. R321* |
Ex-70 | MISSENSE | 4 | Chr6:55639009 | G > A | p. V289M |
Ex-2 | SYNONYMOUS | 6 | Chr6:55623866 | A > G | p. G384G |
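
As a quick consistency check, the rows of Table 1 can be tallied by mutation type to recompute the proportion of LoF mutations. This is a minimal sketch: the rows are transcribed from the table, the function names are hypothetical, and treating only nonsense mutations as LoF is an assumption made here for illustration.

```python
from collections import Counter

# Rows transcribed from Table 1 as (sample, mutation type, protein change).
TABLE_1 = [
    ("In-1", "NONSENSE", "p.R174*"),
    ("In-2", "MISSENSE", "p.D183G"),
    ("Ex-50", "NONSENSE", "p.E179*"),
    ("Ex-36", "MISSENSE", "p.N211I"),
    ("Ex-92", "MISSENSE", "p.K223D"),
    ("Ex-93", "NONSENSE", "p.R321*"),
    ("Ex-70", "MISSENSE", "p.V289M"),
    ("Ex-2", "SYNONYMOUS", "p.G384G"),
]


def mutation_type_counts(rows):
    """Count how many mutations fall into each mutation type."""
    return Counter(mutation_type for _, mutation_type, _ in rows)


def lof_fraction(rows, lof_types=("NONSENSE",)):
    """Return the fraction of mutations whose type is in lof_types.

    Assumption: only nonsense mutations are counted as loss-of-function.
    """
    if not rows:
        raise ValueError("rows must not be empty")
    n_lof = sum(1 for _, mutation_type, _ in rows if mutation_type in lof_types)
    return n_lof / len(rows)
```

With the rows above, `lof_fraction(TABLE_1)` evaluates to 0.375 (3 nonsense mutations out of 8), which agrees with the 37.5% LoF proportion reported in the text.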