Background
Lung adenocarcinoma (LUAD) is the most common histological subtype of non-small cell lung cancer (NSCLC) in females (smokers or non-smokers) and in non-smoking males. The incidence of LUAD has increased markedly over the past few decades in many countries, including China [1, 2]. Most adenocarcinomas first occur in the outer region of the lungs, with a tendency to spread to the lymph nodes and beyond. Despite advances in diagnosis and treatment, lung cancer mortality has increased, and mortality rates are amongst the highest of any cancer type.
Following advances in genomics, proteomics and molecular pathology, many candidate biomarkers with potential clinical value have been identified [3]. Further development of genomic biomarkers is expected to improve patient stratification and lead to more personalized treatment. MicroRNAs (miRNAs, miRs) are small, non-coding RNAs of 18–25 nucleotides, and are thought to regulate gene expression post-transcriptionally by causing mRNA degradation and/or repressing mRNA translation [4]. MiRNAs are frequently dysregulated in cancer, and may function as both oncogenes and tumor suppressors [4, 5]. Several prognostic and predictive miRNA markers have been identified for NSCLC [6–11]. However, owing to the small datasets used, the heterogeneous nature of the disease, the pre-selection of miRNAs and variations in the approaches to data pre-processing, these sets of miRNA markers are inconsistent.
The purpose of this study is to identify specific miRNA markers closely associated with the survival of LUAD patients from a large dataset of significantly altered miRNAs, and to assess the prognostic value of this miRNA expression profile for overall survival (OS) in patients with LUAD.
Discussion
In this study, we identified 16 miRNAs correlated with OS of LUAD patients in different clinical classes, from the 111 most significantly altered miRNAs in LUAD tissues compared with normal lung tissues. A linear combination of eight miRNAs (miR-31, miR-196b, miR-766, miR-519a-1, miR-375, miR-187, miR-331 and miR-101-1) was validated as an independent predictor of LUAD patient survival. This signature demonstrated significant prognostic performance in both the entire LUAD cohort and the early-stage subgroup, particularly in the group of non-smokers and reformed smokers (more than 15 years). Our results suggest that there is a potential role for miRNAs in the molecular pathogenesis, clinical progression and prognosis of LUAD, and highlight the potential of miRNA profiling to improve clinical prognosis in patients with LUAD.
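To make the idea of a linear-combination signature concrete, the following minimal Python sketch computes a risk score as a weighted sum of miRNA expression values and splits patients at the cohort median. The coefficient values, the median cutoff and the function names are illustrative assumptions; they are not the fitted model or the cutoff used in this study.

```python
from statistics import median

# Hypothetical coefficients: placeholders only, NOT the fitted values
# from this study. Only the eight miRNA names come from the text.
EXAMPLE_COEFFICIENTS = {
    "miR-31": 0.2, "miR-196b": 0.1, "miR-766": 0.1, "miR-519a-1": 0.1,
    "miR-375": -0.1, "miR-187": -0.1, "miR-331": 0.1, "miR-101-1": -0.2,
}


def risk_score(expression, coefficients):
    """Linear combination: sum of coefficient * expression over the signature."""
    return sum(coefficients[name] * expression[name] for name in coefficients)


def stratify(scores):
    """Label each score 'high' or 'low' relative to the cohort median (assumed cutoff)."""
    cutoff = median(scores)
    return ["high" if score > cutoff else "low" for score in scores]


if __name__ == "__main__":
    # Toy patient with every miRNA at expression level 1.0 (made-up data).
    patient = {name: 1.0 for name in EXAMPLE_COEFFICIENTS}
    print(risk_score(patient, EXAMPLE_COEFFICIENTS))
    print(stratify([0.4, 1.2, -0.3, 2.0]))
```

In practice the coefficients would be estimated from a training set (for example by Cox regression) and the cutoff fixed before the signature is applied to a test set.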
LUAD constitutes about 30–40% of NSCLC and is a global public health problem, representing the most common cause of cancer-related death [1]. Owing to the immense heterogeneity observed in LUAD patients in multiple respects (pathological, molecular, clinical, radiological and surgical), the development of individualized cancer treatment and the prediction of patient outcome have been huge challenges [24]. In the past decade, several molecular markers and models have been proposed or developed within specific NSCLC subgroups. In particular, the identification of driver mutations in the epidermal growth factor receptor (EGFR) and anaplastic lymphoma kinase (ALK) genes introduced a new era of targeted therapy in LUAD [25, 26]. Treatment choice and monitoring of patient outcome based on the analysis of mutations in other key biomarkers, including HER2, PIK3CA, BRAF, NUTM1, MET, ROS1, FGFR1, KRAS and PTEN, may also have a potentially powerful clinical impact [27–29]. Furthermore, gene expression profiling by microarrays or RT-PCR has also been used to classify or predict prognosis in patients with lung cancer. Owing to the large numbers of genes and the low prevalence of mutations, it may be more effective to use miRNA rather than gene expression profiles to classify various cancer subtypes [30]. MiRNAs are small, conserved non-coding regulatory RNAs in humans, and they play important roles in carcinogenesis. Each miRNA may post-transcriptionally regulate hundreds of downstream genes by targeting the 3′ untranslated region of specific messenger RNAs for degradation or translational repression [5, 31]. While still in the early stages of clinical development, miRNA expression profiling of primary tumors has already demonstrated significant promise in clinical stratification and monitoring of therapy [32].
Several groups have identified miRNA signatures capable of predicting clinical outcome in NSCLC patients. In one miRNA profiling study based on a cohort of 357 stage I NSCLC patients, a miRNA expression signature containing 27 miRNAs was identified that was capable of accurately predicting which stage I LUAD patients may benefit from more aggressive therapy [10]. A study of 112 NSCLC patients (57 squamous cell carcinoma [lung SCC] and 60 LUAD, stage I–III, Asian patients) identified a five-miRNA signature (miR-221, let-7a, miR-137, miR-372 and miR-182*) as an independent predictor of cancer relapse and survival [7]. Another study, which screened serum miRNAs using Solexa sequencing followed by a self-validated study of 303 patients, identified miR-486, miR-30d, miR-1 and miR-499 as non-invasive predictors of OS in NSCLC [6]. Boeri et al. also found that higher levels of miR-429 correlated with worse disease-free survival in lung cancer [33]. A recent study confirmed three novel miRNAs (miR-662, miR-192 and miR-192*) as prognostic for distant relapse in operable lung SCC [34]. In addition, miR-708 was shown to be associated with poor survival in LUAD from patients who had never smoked [11]. On the basis of these studies, miRNA profiling has already demonstrated significant potential as a prognostic indicator in lung cancer. However, there was little overlap between the miRNAs identified as prognostic predictors of disease progression or outcome in these various studies, indicating that comprehensive validation of the miRNAs identified in these screens is necessary.
These inconsistencies may be caused, at least in part, by fundamental methodological differences in the pre-selection of candidate miRNAs. In this study, TMM normalization and the GLM method (which account for the sampling properties of RNA-seq data and the batch effect, respectively) were used to obtain differentially expressed miRNAs between tumor and normal tissues. Moreover, we obtained the candidate miRNAs from a list of miRNAs differentially expressed between LUAD and normal samples. This method ensured that the miRNAs in the prognostic signature had both statistically altered expression in LUAD and a prognostic impact on survival. However, miRNAs associated with OS and those related to the occurrence of LUAD may not completely overlap, which may be another reason for the discrepancy in the miRNAs identified between various studies. The discrepancy may also be due to differences in sample size, individual patients, study populations or the platforms used. Since miRNA expression profiles differ strongly between LUAD and lung SCC [8], the LUAD-specific target miRNAs identified in this study may have further potential application in predicting the clinical outcome of patients with LUAD and in revealing targets for the development of therapy.
In this study, we selected as potential prognostic miRNAs only those related to clinical outcome that were common to the non-overlapping subclasses of the same class. For this reason, several of the miRNAs previously identified as being associated with OS in lung cancer were not obtained, since they were only significant within a single subclass in the TCGA cohort. Among the eight miRNAs, miR-31 has been validated as a marker for lymph node metastasis in lung cancer [35]. MiR-31 has been shown to act as an oncogenic miRNA by targeting specific tumor suppressors, including the large tumor suppressor 2 (LATS2) and PP2A regulatory subunit B alpha isoform (PPP2R2A) [36], and its high expression has been associated with poor survival in lung SCC [37]. In contrast, in a study of 164 NSCLC patients, low miR-375 expression in plasma was associated with worse OS [9]. Downregulation of miR-375 in tissues was also significantly associated with poor outcome in patients with esophageal SCC [38]. MiR-101 expression has been shown to be significantly associated with pathological stage and lymph node involvement, and it might play an important role as a prognostic biomarker and therapeutic target in NSCLC [39] by directly targeting enhancer of zeste homolog 2 (EZH2) [40]. To our knowledge, no associations with OS in lung cancer have been reported for the remaining five miRNAs. MiR-196b has been identified as a biomarker capable of distinguishing lung SCC and LUAD [41]. It also demonstrates potential prognostic value for disease progression in gastric cancer and glioblastoma [42, 43]. Although there was no obvious evidence of an association between miR-196b and OS in lung cancer, Annexin A1, one of several validated miR-196b target genes, has been identified as a pro-invasive and prognostic factor in LUAD [44]. Ectopic expression of miR-187 was reported to lead to a significantly more aggressive phenotype in breast cancer cells and clear cell renal cell carcinoma [45, 46]. Deregulation of miR-519a-1, regulated by phospho (p)-ΔNp63α, in head and neck SCC cells led to the subsequent modulation of several target mRNAs involved in apoptotic processes, including TP73, YES1, PARP1, HIPK2, ATM, CDKN1A, CASP3, DDIT4, BCL2, BCL2L2 and YAP1 [47]. Similarly, overexpression of miR-766 was shown to significantly inhibit the expression of the pro-apoptotic genes caspase-3 and Bax in acute promyelocytic leukemia cells [48]. Previous studies have also shown that miR-331-3p, a member of the miR-331 family, may be involved in cell cycle control by targeting the 3′ untranslated region of the cell cycle-related molecule E2F1 [49]. The ORA in this study also revealed a significant enrichment of miRNA targets involved in the NSCLC and SCLC KEGG pathways. Genes involved in apoptosis and regulation of the cell cycle, the categories that were enriched within the target genes of our eight miRNAs, are implicated in LUAD tumorigenesis and represent potential therapeutic targets [50]. Several genes involved in these pathways, such as AKT2, TP53 and TNF, have been identified as key biomarkers of LUAD prognosis [51–53]. Our in silico pathway enrichment analysis based on the predicted target mRNA genes suggested that variation in miRNA expression might affect critical pathways involved in LUAD progression. Since all target prediction algorithms generate a certain fraction of both false positives and false negatives, further research is warranted.
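An over-representation analysis of this kind is commonly carried out with a one-sided hypergeometric test on the overlap between the predicted target genes and the genes annotated to a pathway. The sketch below illustrates that calculation only. The choice of test, the function name and the toy numbers are assumptions, since the exact procedure is not restated in this section.

```python
from math import comb


def enrichment_p_value(population, pathway_genes, targets, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    population    -- number of genes in the background set
    pathway_genes -- number of background genes annotated to the pathway
    targets       -- number of predicted miRNA target genes drawn from the background
    overlap       -- number of targets that fall in the pathway
    """
    total = comb(population, targets)
    upper = min(pathway_genes, targets)
    tail = sum(
        comb(pathway_genes, k) * comb(population - pathway_genes, targets - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers for illustration only (not taken from the study).
    print(enrichment_p_value(population=10, pathway_genes=5, targets=2, overlap=2))
```

A small p-value indicates that the pathway contains more predicted targets than expected by chance. Because many pathways are tested, a multiple-testing correction would normally follow.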
Lung cancer in non-smokers has recently been recognized as a distinct disease entity, owing to the striking demographic, clinicopathological and molecular differences between lung cancer in never-smokers and ever-smokers [54, 55]. Due to its prominence in Asian countries and increasing trend in most developed countries [56], investigations and clinical trials should be undertaken to determine the underlying causes and factors affecting progression of non-smoking-related lung cancer.
Several studies have linked smoking to poor outcomes among patients with lung cancer [57–59]. However, in the TCGA LUAD cohort, there was no significant difference in OS between the smoking and non-smoking groups (median survival time: 42.9 months vs. 49.7 months). Intriguingly, we found that the eight-miRNA signature exhibited superior performance in predicting the 5-year survival of patients with lung cancer who had never smoked or who had ceased smoking more than 15 years ago. To examine the difference in AUCs, we compared the clinical characteristics of the smoking and non-smoking groups. The only clinicopathological feature significantly correlated with smoking history was age: smoking was more common among younger patients in the TCGA LUAD cohort. About 72.4% (126 of 174) of TCGA LUAD patients diagnosed at a young age (≤65 years) were current smokers or reformed smokers of less than 15 years. However, there was no significant association of young age with poor OS, and we did not find a better AUC in the young age group. This suggests that the miRNA profiles of smoking- and non-smoking-related lung cancer may be fundamentally different, which requires further study. Previous reports have shown that some of the eight miRNAs identified in this study, such as miR-31 and miR-101, are potentially deregulated by cigarette smoke in lung cancer [60]. This prognostic miRNA signature classifier for non-smoking-related LUAD may help clinicians to pinpoint those LUAD patients at high risk of unfavorable OS.
There are a number of limitations to this study. A major limitation was the lack of available information regarding adjuvant therapy and EGFR mutation status, which defines distinct molecular subsets of resected LUAD and also predicts whether tumors are sensitive to EGFR tyrosine kinase inhibitors [53]. Such information is required to further study the interaction between the prognostic effect of these factors and the miRNA signature. A further limitation was that the TCGA LUAD cohort had a relatively short follow-up period (median follow-up of 15 months) and a high censoring rate, which may affect the reliability of the Kaplan-Meier estimates. There are also limitations in obtaining all the data from a single source and randomly assigning samples to training and testing sets for the development and assessment of the prognostic model. Independent external validation sets with long-term follow-up would provide a more reliable and realistic assessment of the performance of this miRNA signature.
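The effect of censoring on Kaplan-Meier estimates can be seen from the estimator itself: each censored patient leaves the risk set without contributing an event, so later steps of the curve rest on fewer patients. The following minimal sketch implements the standard product-limit estimator. The function name and the toy follow-up data are hypothetical and are not drawn from the TCGA cohort.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times  -- follow-up time for each patient (e.g. months)
    events -- 1 if death was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Toy data: the patient at t=2 is censored (made-up values).
    print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

With short follow-up and heavy censoring, the late part of the curve is determined by very few patients, which is why long-term survival estimates from such data carry wide uncertainty.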
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
XLL designed the study, performed data analysis and drafted the manuscript. YRS participated in the collection and analysis of data. ZHY and XXX verified the bioinformatics analysis. BSZ conceived the study and participated in its design and coordination. All authors read and approved the final manuscript.