Background
The phosphatidylinositol 3′-kinase (PI3K) signaling pathway is one of the most frequently altered pathways in human cancers that impacts major hallmarks of malignancies [
1] and remarkably contributes to cancer initiation [
2,
3], progression, metastasis, metabolism, and cell survival [
4‐
8]. Aberrant PI3K signaling activity mostly accounts for the poor outcomes and tumor relapse seen in cancer patients [
7] [
9‐
11].
PIK3CA, a member of the PI3K family, encodes the p110α protein, which is the catalytic subunit of PI3K [
12]. Hotspot mutations of
PIK3CA were found in a broad scope of cancers, including melanoma, breast cancers, colorectal cancers, gastric cancers, liver cancers, etc. [
4,
13], and most of these mutations have been confirmed to promote an oncogenic gain-of-function effect. A considerable fraction (approximately 10–30%) of cancer patients carry abnormally activated
PIK3CA mutants [
8,
14]. These
PIK3CA mutants enable reduced dependence on growth factors, anchorage-independent growth, and enhanced resistance to apoptosis [
15‐
17].
Interestingly, we noted that
PIK3CA mutants cooccurred with other genetic alterations that promote enhanced PI3K signaling [
18,
19]; for example, mutations in
PIK3CA were found concomitantly with mutations in
EGFR, KRAS, or
ALK in the same tumors of lung cancer patients [
5]. However, considering the complex genetic alteration context in cancer and the complex traits leading to higher-order mutational interactions, paired mutations might lack sufficient accuracy and robustness for precise applications in diagnosis and prognosis. Thus, a complete investigation of the genetic factors in higher-order combinations could be key to improving prognostic accuracy.
Intriguingly, in economics, the benefit experienced when adding one extra unit is called the marginal benefit [20]. We found that some lower-order gene mutants (those involved in combinations of fewer than three elements), defined as marginal factors in this study, could substantially affect the overall survival (OS) probability when combined with other specific cooccurring mutation combinations, defined as seed combinations, in a higher-order interaction (a combination of three or more elements) (Supplementary Fig.
1). The effects of such higher-order combinations were categorized as higher-order genetic marginal effects. This study focused on the contributions of
PIK3CA mutants to higher-order genetic marginal effects on the hazard ratio (HR) and overall survival (OS) probability across three different cancer types. Therefore, we established a
PIK3CA-based subtype classification based on the frequencies of
PIK3CA pairing partners to decisively identify marginal factors affecting the HR of the original mutational combination, attempted to reveal the mechanistic differences between these subtypes, and identified various small-molecule drugs according to the unique
PIK3CA subtype signatures. To the best of our knowledge, this is the first study to systematically investigate the functional properties of
PIK3CA from a higher-order point of view. Thus, this study provides a better understanding of the mutually interacting relationships between specific higher-order combinatorial mutations and
PIK3CA mutants across a wide variety of cancer types and provides a framework to rationally identify and exploit factors interacting with
PIK3CA mutations to induce a marginal effect.
Methods
Obtaining the clinical and genetic data
Identifying the genetic combinations that could impact the survival outcomes of cancer patients
Somatic mutation (SNP and INDEL), survival, and clinical datasets from BRCA, COAD, and STAD patients were extracted from the TCGA database. To evaluate the HR of the genetic combinations containing PIK3CA mutants, we removed the “silent” mutations, selected the 35 most frequently mutated genes, and adopted a Cox proportional hazards regression model.
The original dataset samples were split: 60% of samples served as the exploration set, and 40% of samples served as the validation set. Then, the samples were further processed into the .maf format via the maftools package. Next, we took the 35 highest-ranked mutated genes from BRCA, COAD, and STAD patients as the research objects, enumerated the potential combinations containing one to six mutated genes, and calculated the HR and p-value through Cox proportional hazards modeling via the survGroup() function from the maftools package [21]; this analysis identified the impacts of the gene sets on patient OS time and status.
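The size of the search space described above can be sketched in a few lines. This is an illustrative Python sketch, not the authors' R code: the function names and the three-gene demo panel are hypothetical, and only the counts (35 genes, orders one to six) come from the text.

```python
from itertools import combinations
from math import comb


def count_gene_sets(n_genes: int = 35, max_order: int = 6) -> int:
    """Number of distinct gene sets of size 1..max_order drawn from n_genes."""
    return sum(comb(n_genes, k) for k in range(1, max_order + 1))


def enumerate_gene_sets(genes, max_order=6):
    """Lazily yield every combination of 1..max_order genes as a tuple."""
    for k in range(1, max_order + 1):
        yield from combinations(genes, k)


# Hypothetical three-gene panel, used only to show the shape of the output.
demo = list(enumerate_gene_sets(["PIK3CA", "LRP1B", "HMCN1"], max_order=2))
total = count_gene_sets()  # candidate sets for 35 genes, orders 1 to 6
```

With 35 genes and orders one to six, this yields 2,007,327 candidate gene sets per cancer type, each of which would need its own Cox model fit; a lazy generator avoids holding all of them in memory at once.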
Discovering genes that caused significant marginal effects on the hazard ratio
The combinatorial gene sets of different orders were crossed and merged according to the shared gene names, and the conditions used to acquire the key gene that was causing the marginal effect on HR are listed as follows:
1) The p-value of the lower-order combination is larger than 0.3, while the p-value of the higher-order combination is lower than 0.01.
2) The HR of the lower-order combination is larger than 1, the HR of the higher-order combination is more than two times the HR of the lower-order combination, and the p-value of the higher-order combination is lower than 0.05.
3) The HR of the lower-order combination is larger than 1, while the HR of the higher-order combination is lower than 1, and the p-value of the higher-order combination is lower than 0.05.
The mutational combinations that generated significant marginal effects were ranked according to the fold-change of the higher-order HR against the lower-order HR, and the results were uploaded to the Synapse repository:
https://www.synapse.org/#!Synapse:syn23530651.
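The three conditions and the ranking step can be sketched as a small filter. This is a minimal Python illustration under stated assumptions: the conditions are treated as alternatives (conditions 2 and 3 cannot both hold), the lower-order set is assumed to be a proper subset of the higher-order set, ranking is assumed to be descending by HR ratio, and the `CoxResult` type and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoxResult:
    genes: frozenset  # mutated genes defining the patient group
    hr: float         # hazard ratio from the Cox model
    p: float          # p-value of that hazard ratio


def is_marginal(lower: CoxResult, higher: CoxResult) -> bool:
    """True if adding genes to `lower` meets any of the three conditions."""
    if not lower.genes < higher.genes:  # assumption: proper subset required
        return False
    c1 = lower.p > 0.3 and higher.p < 0.01
    # Assumption: "two times larger" is read as more than 2x the lower HR.
    c2 = lower.hr > 1 and higher.hr > 2 * lower.hr and higher.p < 0.05
    c3 = lower.hr > 1 and higher.hr < 1 and higher.p < 0.05
    return c1 or c2 or c3


def rank_marginal(pairs):
    """Keep qualifying (lower, higher) pairs, sorted by HR fold-change."""
    hits = [(lo, hi) for lo, hi in pairs if is_marginal(lo, hi)]
    return sorted(hits, key=lambda t: t[1].hr / t[0].hr, reverse=True)


# Hypothetical values, for illustration only.
seed = CoxResult(frozenset({"LRP1B"}), hr=1.4, p=0.45)
tri = CoxResult(frozenset({"LRP1B", "PIK3CA", "HMCN1"}), hr=0.3, p=0.004)
other = CoxResult(frozenset({"LRP1B", "TTN"}), hr=1.5, p=0.2)
ranked = rank_marginal([(seed, tri), (seed, other)])
```

In this toy input only the seed-to-trimutation pair qualifies, because the second pair fails every p-value requirement.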
Plotting the Kaplan-Meier survival curve
The Kaplan-Meier survival curves were plotted using the surv_fit() function from the survminer package and the mafSurvival() function from the maftools package in the R 4.0.2 platform. These analyses were used to demonstrate the survival probability of patient cohorts with different mutational combinations. The p-value was calculated by the log-rank test, and the HR was calculated from a Cox proportional hazard model.
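For readers unfamiliar with the estimator behind these curves, a minimal product-limit (Kaplan-Meier) calculation is sketched below in Python. It is not the survminer or maftools implementation; the function name and the five-patient toy cohort are hypothetical, and the log-rank test and Cox model are omitted.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  : follow-up durations
    events : 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the conditional survival at t.
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed  # deaths and censored subjects leave the risk set
    return curve


# Hypothetical toy cohort: durations in months, 1 = death, 0 = censored.
curve = kaplan_meier([5, 8, 8, 12, 20], [1, 1, 0, 1, 0])
```

Subjects censored at a given time are still counted as at risk for deaths at that same time, which is the usual convention.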
Classifying the marginal effect-specific PIK3CA subtypes of BRCA, COAD, and STAD patients
Gene mutations that paired with PIK3CA mutations to induce a marginal effect, together with the constituents of the original higher-order combinations, were tallied within each cancer type. Genes within the same gene family were considered the same mutant type. Family membership among the 35 selected genes was assigned by shared gene-symbol prefix: genes with the same prefix were considered to belong to the same gene family. The PIK3CA mutation partners with the highest frequencies were chosen as the standards for candidate subtype classification.
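The tallying step can be sketched as follows. The text does not specify how a prefix is delimited, so this Python sketch assumes that the leading letters before the first digit of a gene symbol name the family; that rule, the function names, and the example combinations are all hypothetical.

```python
import re
from collections import Counter


def family_of(symbol: str) -> str:
    """Assumed rule: leading letters before the first digit name the family."""
    match = re.match(r"[A-Za-z]+", symbol)
    return match.group(0) if match else symbol


def partner_frequencies(combos, anchor="PIK3CA"):
    """Count, per family, how many anchor-containing combinations include it."""
    counts = Counter()
    for combo in combos:
        if anchor in combo:
            # A set ensures each family is counted once per combination.
            counts.update({family_of(g) for g in combo if g != anchor})
    return counts


# Hypothetical combinations, for illustration only.
combos = [
    ("PIK3CA", "MUC16", "TTN"),
    ("PIK3CA", "MUC4"),
    ("LRP1B", "HMCN1"),
]
freq = partner_frequencies(combos)
top_partner = freq.most_common(1)
```

Under this assumed rule, the most frequent partner family would serve as the candidate subtype label.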
RNA differential expression analysis
The transcriptomic expression data (TCGA Stomach Cancer (STAD): IlluminaHiSeq UNC) obtained from UCSC Xena were log2(x + 1)-transformed RSEM normalized counts. Because DESeqDataSetFromMatrix() from the DESeq2 package requires integer input, we restored integer RSEM normalized read counts by computing round(2^(log2(x + 1))) and converted them into a DESeq2 dataset. STAD patients with LRP1B mutation but without PIK3CA + HMCN1 comutation were chosen as the reference group and compared with STAD patients carrying the LRP1B + PIK3CA + HMCN1 trimutation. The results were then plotted with the geom_point() function of the ggplot2 package; the fold-change threshold was ±2, and multiple-testing correction was applied with an adjusted p-value cutoff of 0.0001.
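The back-transformation can be illustrated with a short Python sketch. Read literally, round(2^(log2(x + 1))) returns x + 1 rather than x; the text does not say whether the pseudocount was subtracted afterwards, so both variants are shown. The function names and toy counts are hypothetical.

```python
import math


def restore_as_described(y: float) -> int:
    """Integer count from a log2(x + 1) value, following the text literally."""
    return round(2 ** y)


def restore_without_pseudocount(y: float) -> int:
    """Variant that also removes the +1 pseudocount (an assumption)."""
    return round(2 ** y - 1)


# Hypothetical raw counts, log-transformed the way the Xena data are stored.
raw = [0, 7, 1023]
logged = [math.log2(x + 1) for x in raw]

literal = [restore_as_described(y) for y in logged]
exact = [restore_without_pseudocount(y) for y in logged]
```

Either variant produces the integer matrix that count-based tools expect; the two differ by a constant offset of one per entry.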
Integrating the KEGG pathway-based data
The DEGs between the LRP1B without PIK3CA + HMCN1 patient cohort and the LRP1B + PIK3CA + HMCN1 cohort were obtained with a fold-change threshold of ±2 and a p-value threshold of 0.01. Next, the gene symbols of the DEGs were converted to Entrez IDs, and the DEGs with their fold-change values were analyzed for KEGG pathway enrichment with the clusterProfiler package in R 4.0.2 via the compareCluster(fun = ‘enrichKEGG’) function.
Generating a connectivity map for selecting marginal factor-targeting compounds
To determine which small-molecule compounds might be effective against LRP1B mutation by inducing a marginal beneficial effect, we entered the 107 genes upregulated in the LRP1B single-mutation samples with poor prognosis, compared with the LRP1B + PIK3CA + HMCN1 trimutation samples, into the Broad Institute’s Connectivity Map [22, 23], a public online tool (https://clue.io; registration required), which enabled us to select the molecular compounds that can activate or inhibit the specific biological processes underlying each gene expression signature. The thresholds for signature strength, replicate correlation (75th percentile), transcriptional activity score, and connectivity score were 200, 0.2, 0.2, and −0.3, respectively. We then plotted these four metrics for each compound as scatter plots using ggplot2.
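The threshold step can be sketched as a simple filter. The text gives the four cutoff values but not their directions, so this Python sketch assumes lower bounds for signature strength, replicate correlation, and transcriptional activity score, and an upper bound for the connectivity score (a negative score indicating a compound that opposes the query signature). The `Signature` type, its field names, and the compound entries are hypothetical and do not mirror the clue.io export format.

```python
from dataclasses import dataclass


@dataclass
class Signature:
    compound: str
    strength: float        # signature strength
    replicate_q75: float   # replicate correlation, 75th percentile
    tas: float             # transcriptional activity score
    connectivity: float    # connectivity score against the query


def passes(sig: Signature,
           min_strength: float = 200,
           min_replicate: float = 0.2,
           min_tas: float = 0.2,
           max_connectivity: float = -0.3) -> bool:
    """Apply the four cutoffs; the comparison directions are assumptions."""
    return (sig.strength >= min_strength
            and sig.replicate_q75 >= min_replicate
            and sig.tas >= min_tas
            and sig.connectivity <= max_connectivity)


# Hypothetical entries, for illustration only.
candidates = [
    Signature("compound_A", 310.0, 0.41, 0.35, -0.52),
    Signature("compound_B", 150.0, 0.30, 0.25, -0.60),  # strength too low
    Signature("compound_C", 280.0, 0.25, 0.22, 0.10),   # score not negative
]
selected = [s.compound for s in candidates if passes(s)]
```

Keeping the cutoffs as keyword arguments makes it easy to rerun the filter if a different direction or value turns out to be the intended one.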
Discussion
Through the systematic study of PIK3CA mutations interacting within higher-order elements, we strived to discover the crucial factors that serve as the marginal factor responsible for a remarkable survival effect and cooperate with specific mutations that constitute a higher-order combination. Our results provide unique personalized diagnostic and therapeutic insights that enable researchers to leverage the beneficial/adverse effects of PIK3CA mutations within the context of specific genetic combinations that influence survival probability, and they reveal the oncogenic RNA expression pattern arising from the marginal effect, which was further exploited by using CMap analysis to find potentially efficacious therapeutic agents.
Notably,
PIK3CA mutations have been found to have a synergistic effect with mutational inactivation of PTEN in inducing drug resistance [
36,
37] or in inducing poor prognosis with
EGFR/KRAS comutations in non-small cell lung and colorectal cancer [
38,
39]. However, it has been found that the efficacy of PI3K-targeting agents might be inadequate to treat some patients with refractory resistance even when combined with the MEK inhibitor trametinib, and this lack of efficacy might be caused by differential compensation mechanisms that confer resistance to PI3K inhibitors in
PIK3CA-amplified head and neck cancer cells [
40]. The lack of efficacy might also result from the accumulation of somatic genetic alterations as the tumor advances. Genetic or proteomic interactions with hub nodes such as
PIK3CA might vastly increase, potentially contributing to cell resistance to these targeted therapies [
41]. Given this potential, comprehensively assessing and modulating the significant hazardous effects of the current mutational contexts in patients could be the key to resolving the efficacy issue. Since
PIK3CA has mostly been studied individually as a resistance driver gene and its potential genetic interactions with other genes or mutations at a higher-order level have mostly been ignored, we performed a large-scale systematic investigation to uncover the marginal effect of complicated higher-order interactions of
PIK3CA across multiple
PIK3CA-affected cancers.
Intriguingly, we found that
PIK3CA mutations in STAD mostly acted as the marginal factor that primarily facilitated the survival effect within specific combinations of gene mutations, and most of these effects were beneficial, whereas
PIK3CA mutations in BRCA and COAD appeared in both the marginal factor and seed mutation role and mostly induced adverse effects on survival. For example, the marginal factor dimutation of
MYCBP2 + TTN contributed to very poor prognostic outcomes in BRCA patients carrying
PIK3CA mutations (Supplementary Fig.
4A-4B). Determining why mutations in
PIK3CA could interact to produce distinct prognostic outcomes in STAD patients compared to BRCA and COAD patients is worthy of future research.
Moreover, our findings also showed that the RNA expression signature underlying the marginal effects can be exploited as an alternative target; we determined the PIK3CA-induced RNA expression pattern inducing the marginal effects on survival and analyzed this expression signature through CMap. The analysis suggested a novel strategy to select appropriate therapeutic targets and thus increase personalized therapy precision.
Notably, the order of occurrence of the higher-order interacting comutations could be useful for delineating the evolution of cancer when the pioneer mutation is known. To characterize the mutation occurrence order of the
LRP1B + HMCN1 + PIK3CA mutational combination, we followed the basic principle that the occurrence of mutations is relevant to the pathological stage [
42,
43]. We then tried to analyze how the relative mutation patterns evolved with advancing tumor pathological stage. Interestingly,
LRP1B, as the seed mutation, consistently exhibited the highest relative frequency (sunset red line) (Supplementary Fig.
5). The relative frequency of
LRP1B +
HMCN1 dimutation drastically decreased from pathological stage I to pathological stage II (yellow line), whereas the frequencies of
PIK3CA single mutation (light blue line) and the trimutation of
LRP1B +
HMCN1 +
PIK3CA greatly increased from pathological stage I to pathological stage II (dark blue line) (Supplementary Fig.
5). These results indicate that the
LRP1B and
HMCN1 mutation and comutation of
LRP1B +
HMCN1 might occur first at an early pathological stage, stage I, of STAD. With progression of tumor stage, the relative frequency of
PIK3CA mutation increased, and tumors carrying dimutation of
LRP1B +
HMCN1 might be much more likely to acquire
PIK3CA mutations in pathological stage II, which could explain why the relative frequency of dimutation of
LRP1B +
HMCN1 severely decreased (yellow line) as the relative frequency of trimutation of
LRP1B +
HMCN1 +
PIK3CA increased (dark blue line). Moreover, the relative frequencies of
HMCN1 +
PIK3CA as the marginal factor (blue line) and trimutation of
LRP1B +
HMCN1 +
PIK3CA as the marginal factor (dark blue line) decreased from pathological stage II to stage III and stage IV, but the relative frequencies of
LRP1B +
HMCN1 dimutation (yellow line) and
LRP1B +
PIK3CA dimutation (green line) showed an increasing trend, indicating that mutations of
HMCN1 and
PIK3CA might become mutually exclusive and might not be good for the prognosis of STAD patients at these stages.
Finally, we must clarify that the strategy of including the genes with the highest mutation rates was based on the assumption that mutations in such genes are shared by more patients, especially when the genes are combined. Combinations involving genes with relatively lower mutation rates may not be shared by enough patients for sufficiently valid subsequent testing. On the other hand, because enumerating and evaluating the mutational combinations formed from the candidate genes greatly increases the computational load, we strove to include as many candidate genes as the computational load allowed. Considering both limiting factors, we decided to include the 35 top-ranked mutated genes in this study; however, given the wide diversity of mutations, we believe that more biologically meaningful mutations will be discovered and studied in the future. Moreover, although the systematic analysis and identification of factors, including PIK3CA, that produce a significant marginal survival effect when interacting within higher-order combinations yielded essential results, more work remains to be done, especially to mechanistically extend our findings beyond RNA profiling data. Future analyses will require more specific research materials, such as appropriate cell and animal models, and more detailed functional analyses of potential PIK3CA-specific marginal effects on cellular behaviors, such as cellular communication, transformation, immunogenicity, and alteration of subcellular morphology.
Conclusions
This research addressed the importance of studying PIK3CA mutation-specific effects on survival in the context of multiple mutations at the higher-order level and the value of analyzing the PIK3CA mutation-induced marginal effect. Moreover, this study enabled the identification of vital factors relevant to PIK3CA that can produce remarkable differences in survival in BRCA, COAD, and STAD. Specifically, we focused on a trigenic mutational combination, PIK3CA + HMCN1 + LRP1B, that promoted a beneficial survival effect in STAD patients. We next elucidated its mechanisms at the RNA expression level and determined the pathways enriched among the DEGs, such as the metabolism of certain amino acids, protein digestion and absorption, and fat digestion and absorption. Finally, we compared the specific RNA expression signature arising from the marginal effect with the RNA expression signatures induced by specific small-molecule compounds via CMap analysis. In addition, we assessed the corresponding compounds to determine their utility for recreating the effects created by PIK3CA + HMCN1 mutation as the marginal factor for effective targeting of LRP1B mutational subtype tumors. However, the scale of the analysis of higher-order combinations could be increased by including more mutations, despite the massive computational capacity needed. In the future, the mechanistic underpinnings of the marginal effect should also be studied with integrated omics data from miRNA, proteome, and methylation studies.
In summary, the systematic analysis of PIK3CA mutation-specific marginal effects within higher-order combinations that impact cancer patient survival allowed us to decipher complex higher-order interactions and convert these higher-order genetic interactions underlying the survival effects into practical, useful targets and drugs. These results will benefit the diagnosis and treatment of specific cancer subtypes in the future.