Background
Acute myeloid leukemia (AML) is the most common acute leukemia in adult patients and is an aggressive haematological malignancy. It is a haematological malignancy that is promoted by various factors, including environmental exposure and several abnormality in cellular and molecular level [
1]. Despite decades of effort and achievements, the detailed mechanism of AML remains unclear. Studies have demonstrated that several biological factors may be involved in the pathogenesis and clinical outcomes of AML. As determining patient prognosis is important for individualized treatment, it is essential to comprehensively investigate biological factors that may be associated with AML outcomes and to select valuable biomarkers for prognosis prediction.
MicroRNAs (miRNAs), small noncoding RNA species approximately 22 nucleotides in length, have been demonstrated to play a predominant role in translational and post-transcriptional regulation. miRNAs participate in a wide range of biological processes, and their expression levels are frequently abnormal in human cancers [
2]. MiRNAs are involved in the pathogenesis of AML and influence the differentiation, proliferation of haematopoietic cells and clinical treatment response [
3‐
5]. Expression levels of miRNA have been identified as potential biomarkers for predicting prognosis in various human cancers, including AML [
6‐
9]. However, most studies have focused on one or few miRNAs, while synthesizing several miRNA signatures to comprehensively predict AML outcome has yet to be done.
In most cases of AML, patients at the same clinical stage may experience completely different clinical outcomes. Therefore, there is a critical need to explore novel biomarkers that facilitate the clinical prediction of AML prognosis. In 2013, The Cancer Genome Atlas (TCGA) Research Network reported an analysis of the genomes of 200 AML cases, along with RNA and microRNA sequencing data. They found that miRNA signatures were associated with unfavourable, intermediate, and favourable cytogenetic risk categories [
10]. Herein, we performed comprehensive analysis using data from TCGA, which contains200 AML patients and expression data for705 miRNAs. We sought to identify potential molecular signatures to predict non-M3 AML prognosis. In addition, bioinformatics analyses were performed to evaluate functions of the identified miRNAs. Our results aid in further characterizing mechanisms of non-M3 AML pathogenesis.
Methods
AML dataset in the TCGA database
MiRNA-seq data for 705 miRNAs and clinical data for AML patients were downloaded from the TCGA data portal (
https://portal.gdc.cancer.gov/), an interactive data system for researchers to query, download, upload, and analyse harmonized cancer genomic data sets. Since M3 subtype has different biologic features and favourable outcome from other AML subtypes, we only included 179 non-M3 patients in the present analysis. Patients were treated in accordance with National Comprehensive Cancer Network (NCCN) guidelines (
www.nccn.org). Many unfavourable risk and intermediate risk patients underwent allogeneic transplantation at some point in their disease course. RNA was extracted from whole blood, and patient samples were collected prior to treatment. Patients without miRNA expression data or clinical information were excluded. As a result, there were 147 non-M3 AML patients included in our study. Clinical demographics (e.g., gender, age, risk category) and survival data (overall and event-free survival) for these patients were also obtained from the TCGA dataset.
Multivariate survival analysis
We used clinical information, including age, gender, white blood cell (WBC) count to build a multivariate Cox proportional hazard ratio model. Clinical parameters were selected as significant using 0.05 as the cut off. miRNA with missing expression data > 75% were deleted for the following analyses. A multivariate Cox model was used to evaluate the association between expression level of each miRNA and prognosis of a patient. Subsequently, false discovery rate (FDR) adjustment was conducted, and significant miRNAs were filtered out with a cut off of 0.05. As a result, we screened out three miRNAs that were significantly related to survival time in AML as potential biomarkers of prognosis. Subsequently, One-way ANOVA test was performed to investigate whether there was significant difference of miRNA expression level among different sub-groups of AML.
Analyses of PCA and the association between miRNA expression and AML survival
To determine whether selected miRNAs were reliable indices for clinical prediction of AML prognosis, we conducted principal component analysis (PCA) to obtain a value for each patient based on the expression level of these miRNAs. In addition, the median of all the PCA values was used as a cut off to separate patients into one of two groups. We performed the log-rank test to investigate whether there was a significant difference between the survival of the two groups, and we used Kaplan-Meier plots to visualize survival curves.
Hierarchical clustering analysis
Patients were divided into two groups according to their survival status (i.e., non-censored patients were recognized as the poor-outcome group, while censored patients denoted the good-outcome group). We performed unsupervised hierarchical clustering analysis using the Cluster/Tree View program to give a general overview of the expression of the selected miRNAs in the good-outcome and poor-outcome groups. A heat map was created with expression levels of the three miRNAs and grouping information from AML patients to visualize the relationship between miRNA expression level and overall survival (OS).
Integrated gene expression
We also downloaded RNA-sequencing data for AML patients from the TCGA database. A total of 135 cases with both coding gene expression levels and miRNA expression data were included in subsequent correlation analyses. We performed a Pearson correlation for each miRNA and all the coding genes and then selected the top 100 coding genes with the highest Pearson correlation coefficient. The top 100 genes were considered potential targets of the miRNAs because their expression levels were most correlated with those of the miRNAs.
Gene ontology (GO) analysis
We used the online software DAVID (Database for Annotation, Visualization and Integrated Discovery,
http://david.abcc.ncifcrf.gov/home.jsp) to analyse the molecular functions and biological pathways of the coding genes most correlated with the selected miRNAs (top100 of Pearson correlation coefficient). Names of the correlated coding genes were input into DAVID, and the results showed an annotation of input genes utilizing Gene Ontology (GO) terms (
http://www.geneontology.org/).
Discussion
In past decades, numerous studies have investigated biomarkers for the prognosis of cancers. Among various kinds of biomarkers, miRNAs have attracted increasing attention due to their stability in tumour tissue and the ease with which they can be detected in peripheral blood [
11‐
13]. The pathogenesis and prognosis of many cancers has implicated the deregulation of miRNAs, which play vital roles in regulating the expression of coding genes involved in a wide range of biological processes [
14,
15]. Therefore, several studies have investigated the clinical value of miRNAs for the prediction of cancer outcomes [
9,
16,
17]. Koolivand et al. observed that miR-155 might function as an oncomir in AML and can be a prognosis biomarker for AML [
18], while Xu H et al. have reported role for miR-135a in predicting poor prognosis in AML [
19]. We inferred from these studies that miRNAs can be valuable biomarkers for the prediction of AML prognosis. However, most current studies have focused on single miRNAs, while systematic analyses of association between whole miRNA expression profiles and AML outcomes remain relatively rare.
In the present study, we obtained data for miRNAs and coding gene expression microarrays in AML patients from the TCGA database. Clinical demographics, including age, gender, risk category, FAB morphology category, as well as OS time and event status, were also downloaded. Analyses showed that patient age and risk category were significantly related to survival status. Previous studies have demonstrated that elderly patients have poorer AML outcomes, partly due to their poorer tolerance to chemotherapy treatment and their reluctance to accept intensive regimens [
20,
21]. Our results also showed that the percentage of living patients was remarkably lower in the older group than in the younger group. In addition, we found that the frequency distribution of living patients was significantly different among groups of AML risk categories, which were mainly distinguished by genetic abnormalities [
22]. Our results are in accordance with the current consensus that risk stratification of AML is an important prognostic factor [
23,
24].
Subsequently, three miRNAs were identified to be significantly associated with AML survival. We subsequently performed PCA analyses to extract data for the expression levels of the three miRNAs, representing them as one PCA value for one each patient. Subsequently, analysis suggested that PCA values can separate AML into two groups with significantly different prognoses. Compared with most studies that identified single miRNAs associated with AML outcome [
5,
25], our results are more comprehensive and illustrate the clinical value of a panel of three miRNAs that predict AML survival. In further studies, a scoring system will be constructed from expression level of these three miRNAs to predict prognosis of AML in a quantitative manner.
Recently, Chuang MK identified a 3-microRNA scoring system for prognostication of AML [
6], and their list of miRNA signatures (i.e., miR-9, miR-155, miR-203) was different from ours, which were derived from the TCGA study [
10]. We speculated this difference was partly due to different populations (i.e., Chuang’s study was conducted in a Taiwanese population). Furthermore, differential analysis methods could also be a reason for the discrepancy.
In addition to this, Gao HY et al. have reported another work identifying some miRNAs associated with AML prognosis, using TCGA data. We have compared our present result with Gao HY’s report [
26] and found that only miR-362was in common. However, patients of all AML subtypes were included in Gao’s study, while only non-M3 patients were included in ours because AML-M3 patients have remarkable different biologic features and good outcome from other AML subtypes. Besides, in Gao’ study, they drawn KM curve for each miRNA and miRNAs exhibiting significant differences were selected as candidate factors, while we performed multivariate Cox’s analysis to identify potential miRNAs associated with AML survival. Therefore, we inferred that different study populations as well as different analyse method may be the reason for differences between Gao’s study and ours. On the other hand, we have noticed that miR-362 was identified in both our and Gao’s study, with the same variation trend (i.e. higher expression level predicted poorer prognosis). After a literature reviewing, we found Jin J et al. have also reported high miR-362 expression is associated with poorer OS in AML [
27]. There observations provided strong evidence for the outcome predicting value of miR-362 in AML.
With respect to the other miRNAs identified to be valuable predictors of AML outcome, their biological role in cancer prognosis was also investigated through a literature review. As for miR-181a-2, a multiple centre study have demonstrated that higher miR-181a expression was significantly associated with better outcome of AML patients [
28]. In xenograft mouse models, ectopic miR-181a expression have also shown to inhibits tumor growth [
29]. This miRNA also plays determinant role in the sensitization of leukemic resistant cells to DNR and NK cell [
30]. Besides, miR-181a has shown to sensitize resistant leukemia HL-60/Ara-C cells to Ara-C by inducing apoptosis [
31]. While for miR-25, one study conducted by Wang Y et al. has reported that it was significantly associated with overall survival of 53 AML patients and its expression was associated with a good outcome [
32]. These studies provide further evidence for biological role of miR-181a-2 and miR-25 in AML progression.
Subsequently, bioinformatic analyses were performed to further reveal potential molecular functions and biological pathways for these three miRNAs. Firstly, we tested the reliability of relationship between miRNAs and correlated coding genes by conducting bioinformatics analyses using Targetscan, and further biological experiment will be conducted to confirm the association between other miRNAs and coding genes correlated with them. Then GO analysis revealed molecular activities of the genes most correlated with each miRNA. We found the correlated coding genes for the three miRNAs were relatively concentrated in RNA binding, chromatin binding and protein binding. Therefore, we can infer the biological function of the selected three miRNAs more rationally. In addition, pathway analyses also revealed that correlated coding gene expression for the identified miRNAs participated in transcriptional misregulation in cancer. Some of the potentially related pathways identified were interesting, such as mineral absorption. Previous studies have already shown that abnormalities in mineral homeostasis are involved with acute leukemia [
33], but a detailed mechanism for this has not been demonstrated. It should be also noted that correlated genes of miR-181a-2 and miR-25 enriched in the same pathways, indicating a similar mechanism of the two miRNAs in pathogenesis of AML. Recently, Chang H et al. identified robust survival sub-pathways in AML using sample-matched miRNA and gene expression data sets from the TCGA database. Our results are different from their study partly because of the divergent study design, and the pathways we revealed may be a valuable clue for AML exploration. The results of the bioinformatics analyses provided us potential clues for further investigation of the biological function of the three identified miRNAs.
Although our present study have identified potential target genes and related pathways of these 3 miRNAs by bioinformatic approaches, in-depth mechanical approaches are still needed to verify our bioinformatic result. This is a limitation of our study. However, some of the top correlated coding genes have already been reported to be mutated in AML (e.g. CCND2, STMN1 and TFEB), which provided indirect evidence for our prediction that the selected 3 miRNAs may participate in AML by targeting these “top genes” [
34‐
36]. In addition, the case numbers of TCGA data downloaded were too small to be split into training and validation cohorts. This is a limitation of our present study. In the future, we will collect more AML samples to create a validation cohort for our results.