LUAD is one of the most common cancer types and threatens human health worldwide. The development of targeted therapies, especially those targeting EGFR [3] and ALK [5], has improved the treatment of some LUAD patients; however, the high heterogeneity of LUAD limits the benefits of these therapies to a small number of patients. In this study, we identified prognostically meaningful lung adenocarcinoma subtypes that are independent of EGFR and ALK mutations, together with their mutational and expression profiles. We provide an alternative way to classify LUAD into subtypes that show differential transcription profiles and remarkable differences in prognosis, and we propose promising prognostic biomarkers for these subtypes.
With the development of high-throughput biological and chemical technologies, a great deal of omics data has accumulated to help describe the molecular mechanisms of different types of cancer. Owing to the omics data measured for LUAD cohorts [15, 34, 38], a large number of significantly mutated, prognosis-relevant or differentially expressed genes (e.g., EGFR, TP53, KRAS) can be identified for LUAD. However, given the high heterogeneity and complicated molecular patterns of LUAD, focusing only on these few hallmark genes is insufficient; a more comprehensive view of the molecular mechanism of LUAD is essential. Omics data provide a valuable resource for identifying potential prognosis-relevant genes. However, traditional survival analysis tends to yield a large number of statistically significant genes, among which many indirectly associated ones are mixed in. In this study, we not only identified the prognosis-relevant genes, most of which are independent of the LUAD hallmark mutations (e.g.,
EGFR, KRAS, ALK), but also constructed the potential causal regulatory structure among these genes. This allowed us to identify which genes are more likely to play master roles in influencing the prognosis of LUAD patients at the transcriptional level, providing new promising therapeutic targets for heterogeneous LUAD patients. Based on these master genes, we also identified two potential LUAD subtypes. The poor survival rate of one subtype may be related to mutations in
SMARCA4, KEAP1, TP53 and COL11A1. Low expression of SMARCA4 has been reported to be significantly associated with poor prognosis and can serve as a predictive biomarker of increased sensitivity to platinum-based therapies [39]. Here, significant mutations of SMARCA4 were also related to the poor survival rate of one LUAD subtype (Fig. 4), and these mutations may lead to decreased expression of SMARCA4 (see Additional file 1). Similarly, KEAP1 [40], TP53 [41] and COL11A1 [33] have all been reported to play roles in LUAD. The co-occurrence of these SMGs in the poor-survival subtype implies that the difference in prognosis between the two subtypes results not from one specific gene but from a collection of meaningful genes. Targeting a single SMG is therefore unlikely to be sufficient to halt disease progression; alternative treatment targets, such as key downstream elements, should also be considered. Accordingly, a collection of drugs such as Estriol, Ethinyl and Folic acid [
28] was identified as promising for the identified poor-prognosis subtype. These drugs can target genes that may contribute strongly to the survival differences between the two identified subtypes, such as GAPDH, CCNA2 and PSMD2. Meanwhile, the molecular mechanism underlying the two subtypes is associated with multiple downstream pathways, e.g., the mTOR signaling pathway and lysosome.
An important issue in omics-based cancer studies is whether the reported results can be reproduced in other independent cohorts despite cancer heterogeneity and sample biases. Here, based on the expression profiles of the master genes, the two identified subtypes were consistent across multiple independent cohorts, confirming the robustness of the subtypes, which showed significant differences both molecularly and clinically. The robustness of the subtypes also implies that the causal regulatory network-based method helps identify the most influential genes. These results provide an alternative way to classify LUAD patients and offer a valuable reference for selecting the most beneficial treatments for specific types of LUAD.
A limitation of this study is that most of the calculated relationships are supported only at the statistical level. It is unavoidable that false positives are mixed into these statistical relations, e.g., the causal regulatory effects. Nevertheless, these findings provide a valuable data resource that may promote the discovery of molecular mechanisms underlying LUAD in a less time- and resource-consuming way. In future research, more effort will be put into validating these potential relations.
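
The false-positive concern above can be made concrete with a small sketch of the Benjamini-Hochberg step-up procedure, a standard way to bound the expected fraction of false positives among many simultaneous tests. This is an illustration only, not a description of the statistical pipeline used in this study: the function name `benjamini_hochberg`, the threshold `alpha`, and the example p-values are all hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Flag which hypotheses are rejected at false discovery rate `alpha`.

    Illustrative sketch only: the name, the default alpha, and the inputs
    are assumptions, not values taken from the study.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Hypothetical per-gene p-values from a survival screen.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
flags = benjamini_hochberg(example_p, alpha=0.05)
print(sum(flags), "of", len(example_p), "tests pass the FDR threshold")
```

In this toy input, seven p-values fall below 0.1 but only the two smallest survive the correction. This shows why lists of nominally significant genes should be treated as candidates pending validation.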