Background
Bladder cancer (BCA) is the ninth most common cancer in the world [
1], and it is estimated that the number of new BCA cases in the United States will reach 81,400 in 2020, with approximately 17,980 deaths [
2]. Notably, the clinical course of BCA varies with the degree of tumor invasion. In patients with BCA, 70% of cases are non-muscular invasive BCA (NMIBC), characterized by a high recurrence rate and low mortality. The remaining 30% are muscular invasive BCA (MIBC), which is prone to early metastasis, and approximately half of these cases are fatal [
3,
4]. Cystoscopy is the current gold standard to monitor BCA. However, this method involves a highly invasive examination and has a sensitivity of 85% for the diagnosis of exophytic tumors [
5]. It is worth noting that the postoperative pathology of transurethral bladder tumors does not always accurately reflect the tumor stage, due to limitations such as the experience of the operator and destruction of tissue caused by energy instruments. Studies have shown that the proportion of transurethral surgical procedures that underestimate tumor stage could be as high as 25% [
6,
7]. Further, repeated cystoscopy is used for continuous follow-up, which makes BCA one of the most expensive malignant tumors to treat [
5,
8]. At the same time, in the absence of a new targeted therapy-based strategy, despite surgery and cisplatin-based chemotherapy, it is clear that the existing treatment modality has reached its peak, with only a slight improvement in patient survival but more side effects [
9]. As the survival time of patients with NMIBC progression to MIBC is poor [
10], it is very important to better understand the biological pathogenesis of BCA, to more accurately monitor its progression, predict patient prognosis, and choose appropriate treatments.
Therefore, the existing diagnosis, prognosis, and monitoring systems for BCA patients need to be improved. Epigenetics comprise heritable but reversible modifications that can alter gene expression without changing the original DNA sequence. The maintenance of epigenome function is the basis of normal gene expression. Changes in epigenome function will affect the basic processes of cell proliferation, differentiation, and death, which might lead to cancer [
11]. Therefore, the diagnosis, the evaluation of prognosis and the prediction of response to treatment can prospectively count on cancer biomarkers based on epigenetics [
12]. DNA methylation was the first epigenetic modification found in humans in the early 1980s [
13]. Among the epigenetic mechanisms, DNA methylation is the best studied, and abnormal CpG island methylation was proven to be related to the occurrence and development of many cancer types, including BCA [
14]. At the same time, several studies also support this argument. Specifically, 90% sensitivity and 93% specificity were observed when evaluating the methylation of
Twist1 and
NID2 [
15]. The methylation status of
SOX-
1,
IRAK3, and
LI-
MET genes also showed better ability to predict progression than cystoscopy [
16]. Uromark is described as a novel next-generation sequencing-based biomarker, based on 150 CpGs, with a sensitivity of 98% and a specificity of 97% for monitoring BCA [
17]. Further, in terms of prognosis, hypermethylated
TIG1, GSTP1 and
APC in BCA patients were found to be associated with poor survival outcomes, showing 93% specificity and 80% sensitivity [
18].
However, it is recognized that the specific methylated sequence of the gene promoter region has not been identified yet. Therefore, the objective of this study was to identify DNA methylation profiles in BCA from the TCGA database and to identify biologically and clinically relevant molecular subsets. Our classification scheme could help to identify new BCA molecular subtypes and prognostic model based on methylation site to accurately subdivide BCA patients and improve clinical prognostic assessments and personalized treatment.
Methods
Data selection and pre-processing
For this study, Samples from the TCGA database containing 437 BCA methylation data are downloaded from UCSC Cancer Browser (
http://xena.ucsc.edu/,2020-02-23). RNA-sequencing data from 433 BCA samples were downloaded from TCGA (
https://cancergenome.nih.gov/, 2020-02-23), and among these, 407 samples were associated with clinical data. CpGs with missing data in more than 70% of the samples were excluded from the analysis. Based on the polymorphic CpGs and cross-reaction probe, the CpGs of the cross-reaction probe in the genome was also removed. The k-nearest neighbor imputation method in SVA R software package was used to estimate other unrecognized probes [
19]. We also removed unstable genomic sites with CpGs and single nucleotide polymorphisms in sex chromosomes. We only studied the CpGs in the promoter region because DNA methylation in the promoter region (2 kb upstream of the transcriptional initiation site to 0.5 kb downstream) significantly affects gene expression. Finally, the BCA samples were randomly grouped as 203 training samples and 204 testing samples. The TNM staging system is based on the extent of the tumor (T), the extent of spread to the lymph nodes (N), and the presence of metastasis (M). For bladder cancer, T describes how far the main tumor has grown through the bladder wall and whether it has grown into nearby tissues. N indicates any cancer spread to lymph nodes near the bladder. M indicates if the cancer has spread to distant sites. Once a person’s T, N, and M categories have been determined, usually after surgery, this information is combined in a process called stage grouping to assign an overall stage. Stage grouping range from stages I through IV. Stage I to stage IV represents increasing malignant degree of bladder cancer, the earliest stage cancers are called stage I, and stage IV means advanced cancers.
Using Cox proportional hazard regression model to determine CpGs of prognosis
First, a univariate Cox proportional hazard regression model was established based on CpGs, T stage, N stage, M stage, grade, age, sex, stage, and survival data (P < 0.001). The significant CpGs obtained from the univariate Cox proportional hazard regression model were used to analyze the multivariate Cox proportional hazard regression mode (P < 0.001). Finally, CpGs that were significantly modulated in multivariate Cox regression analyses were selected as characteristic CpGs.
Establishment of molecular subtypes using consensus clustering
Based on the most variable CpG sites, the K-means clustering algorithm in ConcensusClusterPlus R packet [
20] was used to perform consistent clustering to identification BCA subgroup. The pre-specified dataset was classified into k clusters via the algorithm. Here, 80% of the tumor samples were selected in each iteration, and the K-means algorithm and Euclidean square distance measure served as the grouping of Kappa 2 and 20. We used more than 100 iterations to determine the stability of each cluster. At the same time, the maximum number of clusters which have at least 90% cluster consensus was selected.
The optimal number of clusters was defined based on the delta area map and the cumulative distribution function. Color gradients were used to describe common values from 0 to 1 (white to dark blue). Items belonging to the same cluster in the matrix should be adjacent to each other. Therefore, the diagonal blue blocks on the white background in the color-coded heat map correspond to the matrix of perfect consensus. The best cluster number need to meet the following requirements: high consistency, small coefficient of variation, and the area under the cumulative distribution function (CDF) curve not increasing remarkably. The number of categories was defined by the relatively insignificant change in the area under the CDF curve. The heat map corresponding to consensus clustering was generated with the pheatmap R packet.
Survival and clinical characteristics analyses
Construction of the overall survival (OS) curve of BCA subsets defined by the DNA methylation profile was performed via the Kaplan–Meier method. Significant differences among clusters were evaluated by the log-rank test. A comprehensive analysis of the relationship between clinical information and biological characteristics of clusters based on DNA methylation was performed through a Chi square test. R Bioconductor survival package was used for survival analysis. P < 0.05 was considered statistically significant.
Gene Ontology enrichment and KEGG pathway analysis
We use the clusterProfiler software package in R [
21] for Gene Ontology (GO) enrichment and KEGG analysis to obtain GO terms (biological processes, molecular function, and cellular component) and KEGG pathways. Significance was defined as P < 0.05.
Establishment and verification of the prognostic prediction model
According to the coxph function of the survival package in R, combined with the methylation characteristics and prognostic information of the three CpGs, a Cox proportional hazard model was constructed. For this model, we use the formula as: risk score = 2.4139 × cg18312429 + 2.0075 × cg19826026 + 2.4836 × cg27562023. The ROC packet in R was used to establish a receiver operation characteristic (ROC) curve. To verify the reliability as well as stability of this model, 204 testing samples were analyzed using this prediction model.
Generation and validation of a predictive nomogram
In this study, univariate and multivariate Cox regression analyses were performed to determine independent prognostic factors to construct the nomogram. All independent prognostic factors obtained by multivariate COX regression analysis were selected to construct a combined prognostic model to evaluate the probability of 1-year, 3-year, and 5-year OS in patients with BCA. To evaluate the consistency between the actual survival rate and the predicted survival rate, a calibration curve can be drawn in which the 45° line represents the best prediction and the coincidence represents the better predicted value of the model.
Discussion
The classical view of cancer evolution is that a series of genetic changes promotes the transition from the early precancerous stage to invasive cancer and affects the incidence of metastatic diseases. During carcinogenesis, oncogenes can be activated to promote cell division or inhibit cell death. At the same time, tumor suppressor genes can be inactivated in a way that can promote abnormal cell proliferation. Therefore, both the functional gain of proto-oncogene mutations and loss-of-function mutations in tumor suppressor genes might cause cancer through uncontrolled cell growth and defective apoptosis [
22]. DNA methylation involves the addition of a methyl group at the cytosine 5′ carbon position of CpG dinucleotides in the genome, which is an important element of epigenetic regulation of gene expression [
23]. Since the 1990s, increasing number of studies have recognized that heritable changes regulated by epigenetics might also play a vital role in the evolution of all types of human cancer [
24]. We now know that epigenetic changes occur through specific events, including early widespread loss of normal DNA methylation and an increased number of focal gains in gene promoters [
11].
BCA is a malignant tumor of the bladder that originates from the transformation of transitional intraepithelial urothelial cells, and is therefore also known as urothelial carcinoma or transitional cell carcinoma. Although transitional cell carcinoma of the bladder ranks fourth among men [
2], the mechanism of the occurrence and development of urothelial carcinoma is still not completely clear. The development of modern techniques for genome-wide DNA methylation detection enables a more in-depth analysis of BCA methylation. Wolff et al. confirmed that the majority of DNA methylation changes occured in the early stage of BCA which were conserved in carcinoma in situ, non-invasive tumors and invasive tumors, and were located on the CpG island [
25]. Compared to the urothelium from a healthy bladder, the hypermethylation of
ZO2,
MYOD, and
CDH13 were also detected in the urothelium with a normal appearance in patients with BCA, suggesting that epigenetic ‘field defects’ might be one of the reasons for the loss of epithelial integrity. Changes in DNA methylation comprise an early driver of cancer, and epigenetic changes involving DNA methylation might result in subsequent genome changes, which create a permissible environment for the onset and recurrence of BCA [
25,
26]. In an interesting study, the gene methylation pattern of secondary bladder recurrence of primary upper urinary tract cancer was tested and it was confirmed that the methylation rate of some genes increased with the increase in the number of recurrences, which might be a predictor of postoperative recurrence [
27]. Further, the methylation status of
GP5 and
ZSCAN12 can effectively be used to distinguish between high-grade and low-grade BCA [
28]. At the same time, the methylation level of genes can effectively identify the degree of invasion of bladder tumors [
25]. It is worth noting that based on TCGA data, the level of methylation and expression of
SOWAHC is associated with prognosis [
29].
HOXA9 promoter methylation has also been shown to be associated with an increase in recurrence and progression in NMIBC. Importantly, it was also proved to be related to cisplatin resistance in BCA cells [
30,
31]. In distinguishing whether patients with BCA have lymph node metastasis, a three-gene methylation panel was shown to predict the progression of metastasis and allow patients to benefit from lymphadenectomy and neoadjuvant chemotherapy [
32].
In the enrichment analysis, the three most significant pathways identified were peroxisome, base excision repair, and fatty acid biosynthesis pathways. At present, it is believed that the peroxisome pathway mainly inhibits tumor proliferation, metastasis, and invasion by activating the expression of
PTEN, c-
myc, and
p27 [
33,
34]. The peroxisome pathway has also been proved to be closely related to the occurrence and development of bladder cancer, and its expression is significantly increased in bladder cancer; moreover, its expression is higher in high-grade and invasive bladder cancer than in low-grade and superficial tumors [
35]. Base excision repair plays a key role in maintaining genome stability, integrity, and preventing carcinogenesis, and DNA destruction may lead to gene rearrangement, translocation, amplification, and deletion [
36]. Hence, defects in these genes may lead to higher susceptibility to multiple cancers [
37]. Notably, a study involving 801 bladder cancer patients and 801 matched controls found that genetic variations in the BER pathway gene regulate the risk of bladder cancer [
38]. Enrichment analysis showed that the genes related to methylation sites were highly related to the biological metabolism of fatty acids. Previous studies have shown that fatty acid metabolism plays an important role in maintaining the growth, migration, and invasion of bladder tumor cells [
39]. Related metabonomic analysis and studies show that regulating fatty acid metabolism has a broad application prospect in the treatment and diagnosis of BCA [
40,
41].
We used CpG sites to identify seven different prognostic subtypes of BCA, which could predict survival of the disease, as well as the TNM classification, grade, stage, and age distribution of prognosis among the seven molecular subtypes. Thus, this classification method results in molecular stratification that is suitable for a single tumor, which has an important impact on treatment decisions and accurate diagnoses. According to the seven molecular subtypes, if the important CpG sites were classified into category 7, the patients were found to have poor staging (more inclined to Stage IV), higher grade, a higher probability of lymph node metastasis, and poor prognosis. This is of great significance for early intervention and to actively encourage patients to receive treatment. Therefore, the radical resection of BCA and early lymph node dissection can be performed more actively. If the sequence of the CpG sites are classified into category 5, the risk of tumor metastasis is higher. Therefore, radiotherapy and chemotherapy should be actively performed to reduce the incidence of metastasis. However, when the sequence of the CpGs are classified as category 2, the tumor has lower invasiveness and the patient has a higher 5-year survival rate, indicating a better prognosis. This classification can motivate doctors to reconsider the individualized treatment of patients, conduct close clinical follow-up, and minimize overtreatment, which would help to reduce pain for the patients. In summary, our study of these seven subtypes at the DNA methylation molecular level indicated that this system could be used to more accurately classify BCA and guide clinicians in the diagnosis, treatment, and prognosis of different epigenetic subtypes.
With the accumulation of knowledges on the biology, function, and regulatory mechanisms of epigenetic modifications in cancer, and considering the limited efficacy of chemotherapy and immunotherapy, clinical success has ushered in an era of epigenetic therapy aimed at reactivating genes that are improperly silenced during carcinogenesis [
42]. One of the first identified histone lysine methyltransferase inhibitors specific for G9a (EHMT2) is BIX-01294, which was considered to inhibit the proliferation of BCA cell lines [
43]. In recent years, CM272 has been considered a novel double inhibitor of G9a/DNMT1, and it has significant antitumor effects on BCA in vivo and in vitro [
44]. Demethylation drugs can inhibit the proliferation, migration, and invasion of BCA cells and have been shown to enhance the sensitivity of cisplatin-resistant cells [
31,
45]. Using these concepts, DNA demethylation reagents are expected to be used in new therapeutic applications and might play an important role in cancer treatment. Therefore, the cancer epigenome as a target for cancer treatment still represents a vital opportunity for clinical exploitation, and exciting progress is expected in the next few years.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.