Background
The integration of next-generation sequencing with other high-throughput techniques has provided an excellent opportunity to study the molecular alterations occurring in cancer [1, 2]. In particular, gene expression profiling platforms have been widely used to identify cancer biomarkers. Over the last few decades, it has become widely accepted that a single alteration cannot cause cancer; rather, cancer is the result of a wider sequence of genetic and genomic events occurring during the progression from normal epithelial tissue to metastatic disease [3, 4]. For this reason, methodologies based on gene signatures, i.e., lists of genes sharing a common pattern of expression among multiple tumor types, are currently recognized as a more biologically meaningful approach to understanding the biology of cancer [5].
In the present study, we identified 105 onco-signatures associated with the more frequent mutational events shared among the various cancer types. As proof of principle, we evaluated the power of the derived onco-signatures to classify TCGA breast cancer patients into relevant groups with distinct biology and clinical outcome. In addition, we identified two different metabolic subtypes of Luminal tumors based on 28 breast cancer-specific prognostic onco-signatures.
Here, we propose a novel methodological framework to identify commonly shared onco-signatures in cancer that can contribute to the understanding of the role of alterations in tumor disease and the identification of novel molecular mechanisms useful for developing precision therapeutic strategies.
Discussion
In this study, we presented a novel analysis pipeline to identify robust onco-signatures potentially able to predict disease outcome in cancer patients. Using a pan-cancer discovery set of 9107 primary tumor samples together with the respective matched mutational data, and a list of known cancer-related genes, we identified 105 onco-signatures, each composed of a group of distinct marker genes. To investigate the predictive power of the 105 onco-signatures in breast cancer, we constructed a Cox proportional hazards regression model using the TCGA BRCA gene expression dataset, identifying 28 BRCA survival-associated onco-signatures. Next, by performing a gene-set enrichment analysis followed by an unsupervised hierarchical cluster analysis of the normalized enrichment scores (NESs), we identified four discrete breast cancer groups of clinical relevance. Our approach successfully stratified the Basal-like breast tumors but not the Luminal tumors, which showed high diversity in overall survival across the different clusters. Confirmation of the prognostic differences among the Luminal cancers distributed across the four identified groups prompted in-silico molecular analyses to discover the associated genetic variables. These analyses showed profound differences between the two most extreme Luminal phenotypes (i.e., Cluster 2 and Cluster 4) with respect to gene expression, CNV status, and activation of oncogenic signatures.
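The survival screening step described above can be illustrated with a minimal sketch. It assumes that each onco-signature is summarized as one score per patient and tested on its own in a univariate Cox model; the function names, the Breslow handling of ties, the Wald test, and the toy data are all assumptions made for illustration and are not taken from the pipeline itself. In practice, an established implementation such as `coxph` in the R `survival` package or `CoxPHFitter` in the Python `lifelines` package would normally be preferred.

```python
import math


def cox_partial_loglik(beta, times, events, score):
    """Log partial likelihood, gradient and information for one covariate.

    Ties are handled with the Breslow approximation (an assumption of this sketch).
    """
    loglik = grad = info = 0.0
    for t_i, e_i, x_i in zip(times, events, score):
        if not e_i:  # censored patients contribute only through the risk sets
            continue
        s0 = s1 = s2 = 0.0
        for t_j, x_j in zip(times, score):
            if t_j >= t_i:  # patient j is still at risk at time t_i
                w = math.exp(beta * x_j)
                s0 += w
                s1 += w * x_j
                s2 += w * x_j * x_j
        mean = s1 / s0
        loglik += beta * x_i - math.log(s0)
        grad += x_i - mean
        info += s2 / s0 - mean * mean
    return loglik, grad, info


def fit_univariate_cox(times, events, score, tol=1e-8, max_iter=50):
    """Fit one-covariate Cox regression by Newton-Raphson with step halving."""
    beta = 0.0
    loglik, grad, info = cox_partial_loglik(beta, times, events, score)
    for _ in range(max_iter):
        if abs(grad) < tol or info <= 0.0:
            break
        step = grad / info
        new = cox_partial_loglik(beta + step, times, events, score)
        halvings = 0
        # Halve the step while it would decrease the log partial likelihood.
        while new[0] < loglik - 1e-12 and halvings < 30:
            step /= 2.0
            halvings += 1
            new = cox_partial_loglik(beta + step, times, events, score)
        beta += step
        loglik, grad, info = new
    se = 1.0 / math.sqrt(info) if info > 0.0 else math.inf
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided Wald test
    return {"beta": beta, "hazard_ratio": math.exp(beta), "se": se, "p_value": p_value}


# Hypothetical toy cohort: follow-up time, event indicator (1 = death), signature score.
times = [5, 8, 12, 3, 9, 15, 20, 7]
events = [1, 1, 0, 1, 1, 0, 1, 1]
score = [1.2, 0.3, -0.5, 0.8, 0.9, -1.0, -0.7, -0.2]

fit = fit_univariate_cox(times, events, score)
print(fit)
```

Under these assumptions, a signature would be retained as survival-associated when its hazard ratio differs from 1 at a chosen significance level, ideally after correction for testing 105 signatures.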
Differential gene expression analysis between the two groups of interest provided additional details on their molecular status. Cluster 4 Luminal tumors showed up-regulation of genes linked to mitochondrial respiration and oxidative phosphorylation. In contrast, Cluster 2 Luminal tumors displayed enrichment of genes involved in the development of central nervous system components and in extracellular matrix organization. Looking at the genomic imbalances related to the two clusters, we also noted that, at the genomic level, Cluster 4 tumors exhibited a higher frequency of amplifications and deletions than Cluster 2 samples, although at the gene level these imbalances affected a higher number of genes in Cluster 2. Interestingly, we found that the Cluster 2 tumors were enriched in amplifications of genes involved in several metabolic processes. On the other hand, the genes deleted in Cluster 2 are involved in immune-cell suicide mechanisms. Although it is not possible to define the impact of these molecular events, some findings drew our attention. Currently, cancer is considered both a proliferative disorder and a metabolic disease [37, 38]. It is well known that different breast cancer subtypes have distinct bioenergetic and metabolic phenotypes, which are associated with different survival outcomes [38]. For example, Luminal-like tumors present a higher mitochondrial respiratory rate than the more metastatic Basal-like cancers, which instead require an intensive glycolytic flux together with a reduction in OXPHOS processes [38]. In line with our findings, it is possible to hypothesize that an altered mitochondrial metabolism in Cluster 2 may be linked to an unfavorable survival outcome for the Luminal tumors enriched in this group, conferring on them a growth advantage. In addition, since recent studies have demonstrated that mitochondrial metabolic processes are modulated by the tumor cell microenvironment [39–42], the specific deletion of genes involved in immune-cell death pathways that we found in Cluster 2 could reveal a potential cross-talk between mitochondrial dysfunction and the cancer immune microenvironment of this cluster.
Differential miRNA expression analysis provided additional details on the biology of the two subgroups of Luminal tumors under comparison, identifying hsa-miR-135a-5p as the most up-regulated microRNA in the cluster with the longer overall survival, i.e., Cluster 4. It has been demonstrated that mammary tumors display altered expression of microRNAs, many of which function as oncogenes or tumor suppressors and modulate a variety of biological processes such as cell proliferation, migration, invasion, metastasis, apoptosis, differentiation, and cellular metabolism [43, 44]. Dysregulation of miR-135a-5p has been described in several cancer types [45–47]. Studies on the biological function of miR-135a in cancer have shown that it can play both oncogenic and antitumor roles depending on the cancer type, although in breast cancer miR-135a-5p overexpression has been reported to inhibit epithelial–mesenchymal transition (EMT) by acting through the Wnt/β-catenin signaling pathway [28, 48]. In addition, other investigations have also linked mitochondrial activity to EMT in breast cancer, suggesting that the down-regulation of CDH1 and CTNNB1 in triple-negative breast tumors is correlated with a significant decrease in mitochondrial respiration [49]. Interestingly, our findings support the notion that miR-135a-5p is a potential tumor suppressor in breast cancer and could be used as a prognostic marker for this pathology.
Recent studies have described a distinct route for the downregulation of miRNAs, named target-directed microRNA degradation (TDMD), which induces the direct degradation of miRNAs [30–33, 50–52]. Although there are so far few studies on the TDMD mechanism, Simeone et al. [34] performed the first computational prediction of TDMD inducers in mammalian genomes and made their predictions available through the TDMDfinder web tool (http://213.82.215.117:9999/TDMDfinder/index.php). To investigate what could explain the up-regulation of miR-135a-5p in Cluster 4, or alternatively its down-regulation in Cluster 2, we queried the TDMDfinder tool, which predicted two high-confidence TDMD targets for hsa-miR-135a-5p. When we evaluated the expression levels of the two potential TDMD inducers in our cohorts, we found that they were predominantly higher in Cluster 2 than in Cluster 4. In addition, we found that the biological functions of the two predicted TDMD genes were associated with ontologies linked to neuronal tissues, where this mechanism was originally described as being particularly active, and with pathways frequently altered in human tumors [30, 33]. Taken together, these results suggest that the TDMD mechanism may be operative in Cluster 2.
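A comparison of inducer expression between two clusters, like the one described above, can be summarized with a rank-based statistic. The sketch below is only an illustration: the text does not state which test was applied, so the choice of the Mann-Whitney U statistic, the normal approximation without tie or continuity correction, the function names, and the expression values are all assumptions. For real data, a maintained implementation such as `scipy.stats.mannwhitneyu` would normally be used.

```python
import math


def mann_whitney_u(a, b):
    """Count the (a_i, b_j) pairs with a_i > b_j; tied pairs count as 0.5."""
    u = 0.0
    for a_i in a:
        for b_j in b:
            if a_i > b_j:
                u += 1.0
            elif a_i == b_j:
                u += 0.5
    return u


def compare_groups(a, b):
    """Summarize how often values in group a exceed values in group b."""
    n1, n2 = len(a), len(b)
    u = mann_whitney_u(a, b)
    mean_u = n1 * n2 / 2.0
    # Normal approximation without tie or continuity correction (a simplification).
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    return {
        "u": u,
        "prob_superiority": u / (n1 * n2),  # estimate of P(a > b)
        "z": z,
        "p_value": math.erfc(abs(z) / math.sqrt(2.0)),  # two-sided
    }


# Hypothetical log-expression values of one predicted TDMD inducer.
cluster2_expr = [7.1, 6.8, 7.5, 6.9, 7.3]
cluster4_expr = [6.2, 6.5, 6.0, 7.0, 6.4]

print(compare_groups(cluster2_expr, cluster4_expr))
```

With these made-up values, 23 of the 25 possible pairs have the higher value in the Cluster 2 group, which is the kind of pattern that "predominantly higher" would correspond to.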
To obtain better insight into the functional roles of the two metabolic groups of Luminal breast tumors identified in this study, we also evaluated both their immune cell infiltrate composition and their sensitivity to several chemotherapeutic drugs. Our findings showed that the tumor microenvironment of Cluster 4 was characterized by a higher infiltration of anti-cancer effector cells, such as γδ T cells, T follicular helper cells, M2 macrophages, and natural killer cells, which are known to contribute to a good prognosis. Conversely, the Luminal Cluster 2 samples showed an evident infiltration of immunosuppressive cells (e.g., regulatory T cells). This result may, at least partially, explain the favorable survival outcome observed in Cluster 4. In addition, we showed that the Luminal tumors of Cluster 2 were more sensitive to three chemotherapeutic compounds, i.e., Entinostat, Olaparib, and BI-D1870. Entinostat is an oral inhibitor of class I histone deacetylases (HDAC1) that shows a potent antiproliferative effect in breast cancer. In fact, mounting preclinical evidence suggests that it may have a role in immunogenic modulation, inhibiting regulatory T cells and promoting tumor infiltration by lytic CD8+ T cells [53]. BI-D1870, instead, is a potent small-molecule inhibitor of p90 ribosomal S6 kinases (RSKs) and is widely used experimentally to revert the EMT phenotype in breast cancer cell lines, since it can powerfully inhibit their growth [54, 55]. Olaparib is an oral poly(ADP-ribose) polymerase (PARP) inhibitor that has promising antitumor activity in patients with aggressive forms of breast cancer. Indeed, it is the first treatment approved by the Food and Drug Administration (FDA) specifically for BRCA mutation carriers with HER2-negative metastatic breast cancer [56, 57]. Thus, identifying Luminal cancers characterized by both a low metabolic state and Cluster 2-like genetic features may also be valuable for clinical treatment, since our results showed that this group of tumors was more sensitive to these chemotherapeutic agents.
However, our study also has limitations. As in many cancer studies, the full set of omics data used in this work (CNV, methylation, mutational, and miRNA expression data) could not be retrieved for additional datasets, which prevented a thorough in-silico validation. In addition, one of the most widely used breast cancer datasets was generated using microarray technology, making it very difficult to reproduce the onco-signatures. The lack of external independent validation is therefore a limitation of our study. For this reason, future computational studies examining additional datasets, followed by biological validation to consolidate our findings, are desirable.