Introduction
The evidence suggesting that molecular profiling can refine breast cancer prognosis is so far promising. From cDNA microarray analysis of locally advanced breast carcinomas, Perou and colleagues [1] identified five subgroups based on their distinct gene expression patterns. The subgroups were shown to differ with respect to outcome [2], and have also been identified in other datasets [3]. van't Veer and colleagues [4] analyzed node-negative breast cancer patients under the age of 55 years using DNA microarrays and identified a 'poor prognosis signature' that predicted a short interval to distant metastasis. A larger set of samples was studied by van de Vijver and colleagues [5] to confirm the predictive value of this signature in women under 53 years of age. Other datasets have been analyzed with similar findings of molecular subgroups with different clinical outcomes [6-10]. However, there are few published studies with a relatively large number of patients with long-term follow-up.
Several well-established clinical, histopathological and molecular factors are today used as prognostic and predictive markers of breast cancer. These include patient age, tumor size, lymph node status, presence of distant metastasis (TNM-stage; tumor, node, metastasis), histological type, tumor grade, and estrogen receptor (ER), progesterone receptor (PR) and ERBB2/HER-2 status. Improvements of prognostic criteria have been achieved by optimally combining available markers. The National Cancer Institute [11] and St Gallen Conference [12] provide adjuvant treatment guidelines based on these markers. Currently, TNM-staging [13], the Nottingham Prognostic Index [14] and Adjuvant Online [15] are the most commonly used integrated prognostic models.
TP53 mutation status is rarely obtained for routine analysis, despite accumulating evidence of its prognostic value. Mutations in the TP53 gene have been reported to be present in more than half of all cancer cases [16]; however, the frequency varies between types and subtypes of cancer. In breast cancer, the frequency of TP53 gene mutations is approximately 20% to 30%. Acquiring a TP53 mutation has been suggested to be an early event in breast cancer development and is related to poor prognosis and chemoresistance [17]. Allelic imbalance (AI) (or loss of heterozygosity (LOH)) at chromosome location 17p13, where the TP53 gene is located, has been reported in more than half of breast carcinomas [18]. Traditionally, AI is considered an additional event eliminating the TP53 tumor suppressor function.
In this study we address the question of whether gene expression profiles offer better prognostic information in patients with long-term follow-up. We performed univariate and multivariate analysis of seven standard markers and TP53 mutation status for the total group of breast cancer patients. We then analyzed a large subset of these tumors using cDNA microarrays and assigned the samples to five previously defined molecular expression groups. The strength of gene expression-based classification versus standard markers was evaluated by adding this variable to the Cox regression model used to analyze all samples. This is the first report that includes both gene expression groups and TP53 mutation status in a multivariate analysis.
Discussion
In the patient series analyzed here, both uni- and multivariate analyses show that TP53 mutation status was a very pronounced prognostic factor. Although some studies have reported similar findings, others have found a weaker prognostic power for TP53 mutation status [37], which may be due to the mutation screening approach used (as well as population differences). The most frequently used method for mutation screening of the TP53 gene has been IHC, which detects only mutations that induce protein accumulation and therefore misses frameshift, nonsense and splice mutations. In this study, several of the missense mutations with high levels of mRNA expression were also missed by IHC (Figure 4), showing the insufficiency of this technique for mutation screening. The TTGE/sequencing analysis detected 15% of the TP53 mutations outside exons 5 to 8, supporting the importance of analyzing the whole gene and not only exons 5 to 8, as many previous studies have done.
A key issue is whether TP53 mutation status is a prognostic marker or instead a marker of therapy response only (predictive marker). The results in Table 2 (Multivariate analysis) show the total effects of tumor size and TP53 status on survival, effects that may be direct and/or indirect via adjuvant treatment. When adjuvant therapy was included in the multivariate analysis, RR values similar to those in Table 2 were found (TP53, RR 5.1), indicating that the total effects are mainly a result of the direct effects, not indirect effects via treatment. Analysis of patients receiving surgery only (no adjuvant treatment) also gave a similar result (TP53, RR 4.3). Although several studies have suggested that TP53 mutation status is a predictive factor [38,39], large-scale randomized studies are needed to confirm this. TP53 mutation status may be both a predictive marker for some treatment regimes and a strong prognostic factor.
The strong correlation of TP53 mutations with the basal-like subtype is a biologically important finding. Whether it is the nature of ER-negative basal-like tumors that allows mutational events in the TP53 gene, or whether the basal-like gene expression profile is solely a consequence of a TP53 mutation, is unresolved and should stimulate further investigation into the origin of breast tumor cells. A related question to address in larger studies is whether the specific gene expression pattern we found associated with TP53 mutation status was a result of cellular events directly initiated by mutant TP53 or rather a result of the dominant cell type (basal-like progenitor or cancer stem cell) in these tumors. Similar questions apply to the ERBB2+ subtype, which also shows a strong correlation with TP53 mutations; in addition, the sequence and impact of the ERBB2 amplification versus the TP53 mutational event need investigation. Sørlie and colleagues [2] reported a high frequency of TP53 mutations also within the luminal B samples (highly proliferating luminal cases) in their cohort of patients with locally advanced breast cancer. In our set of patients with earlier stage tumors, a relatively low frequency of TP53 mutations was found within the highly proliferating luminals (2/15). We propose that TP53 mutations may be an early and causal event in basal-like tumors, whereas in luminal B (highly proliferating luminal) tumors they may be a consequence of genomic instability. The strong association found between AI and point mutations in the TP53 gene in the basal-like and ERBB2+ samples supports the concept of TP53 acting as a tumor suppressor gene in these tumors [40], while the high frequency of AI despite a low frequency of TP53 mutations in the highly proliferating luminal group suggests a different mechanism for TP53 in these tumors.
There is great interest in defining gene expression profiles of breast tumors to understand the development and progression of the disease and to create a novel, clinically useful diagnostic tool. Many reports are very promising, although the clinical and genetic heterogeneity of the disease makes it far from straightforward to predict recurrence and outcome in individuals based on a snapshot of the biological processes in the individual tumor. Our study aimed to investigate the potential of gene expression profiling as a prognostic marker in patients with long-term follow-up, and not to create yet another gene list associated with patient outcome. In microarray analysis, the very large number of variables (genes) and the relatively low number of cases and events increase the probability of accidental but apparently significant findings [41]. In this study we have chosen an unsupervised approach for the classification of samples. The results support the considerable potential of the information found in expression patterns, and the classification is shown to be a statistically highly significant predictor of survival.
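The risk of accidental findings noted above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not part of the study's analysis: the function name `count_chance_hits`, the number of genes, the group sizes and the use of a z-test on pure unit-variance noise are all hypothetical choices made for illustration.

```python
import random
from statistics import NormalDist, mean


def count_chance_hits(n_genes=2000, n_per_group=20, alpha=0.05, seed=0):
    """Count 'significant' genes when no gene truly differs between groups.

    Every expression value is drawn from the same N(0, 1) distribution,
    so any gene that passes the threshold is a false positive.
    """
    rng = random.Random(seed)
    # Two-sided critical value of the standard normal distribution.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Standard error of the difference in means (variance known to be 1).
    se = (2 / n_per_group) ** 0.5
    hits = 0
    for _ in range(n_genes):
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        z = (mean(group_a) - mean(group_b)) / se
        if abs(z) > z_crit:
            hits += 1
    return hits


hits = count_chance_hits()
print(f"{hits} of 2000 noise-only genes pass p < 0.05")
```

With an uncorrected threshold of 0.05, roughly 5% of the noise-only genes are expected to pass, which is why a supervised gene list derived from thousands of genes and few events needs independent validation or multiple-testing correction.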
The Kaplan-Meier plot (Figure 3c) illustrates a significant difference in survival between the different expression groups, as seen in previous studies [3]. Notice that the two groups with very poor prognosis had a diverse progression of disease. Breast cancer cases in both the basal-like and ERBB2+ groups had a very high mortality rate during the first two years, while the highly proliferating luminal cases developed the disease more slowly, showing highest mortality after five to eight years. We were not able to pinpoint any specific heterogeneity (clinical, histopathological or molecular markers) of the patients within the highly proliferating luminal cluster, the group showing non-proportional hazard, and suggest the curve reflects biological characteristics. Many patients with highly proliferating luminal cancer received Tamoxifen treatment for two years, and the poor outcome in this group compared to luminal A patients could be explained by the lack of a Tamoxifen effect. Alternatively, this anti-estrogen treatment may temporarily prolong patient survival in this group for the first years they receive the drug. The different progression observed in basal-like versus highly proliferating luminal patients may be consistent with the bimodal mortality rate reported by Demicheli and colleagues [42].
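For readers less familiar with how such survival curves are computed, the Kaplan-Meier product-limit estimate can be sketched as follows. This is a minimal illustration, not the software used in the study; the function name `kaplan_meier` and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  : follow-up time for each patient
    events : 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving time t.
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        # Deaths and censored patients both leave the risk set.
        n_at_risk -= removed
    return curve


# Hypothetical toy data (years of follow-up), not taken from the study.
toy_times = [1, 2, 2, 3, 5]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
for t, s in toy_curve:
    print(f"t = {t}: S(t) = {s:.2f}")
```

Computing one such curve per expression group and plotting them together gives the kind of comparison shown in the figure. Curves that cross or diverge late, as described for the highly proliferating luminal group, are a visual sign of non-proportional hazards.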
Different approaches have been used in an attempt to define clinically relevant groups based on gene expression patterns, but a consensus on how to do this has not yet been reached. In our study a classification similar to the one identified by Sørlie and colleagues [2] was obtained, supporting the existence of such subgroups in a broader spectrum of breast tumor stages. A few samples were, for various reasons, difficult to categorize. The lack of proliferation genes in the intrinsic gene list causes a less clear correlation with the luminal B centroid, but when proliferation genes from the total cluster (Figure 1h) were included, the characteristics of the highly proliferating luminals (luminal B-like) compared to the luminal A group were clearly shown. Although the majority of luminal samples were most highly correlated with the luminal A centroid, the group we named highly proliferating luminals is clearly different from the luminal A group in the scatter chart, having the second highest correlation with the luminal B centroid (Figure 2). We suggest that earlier stages of luminal B (here named highly proliferating luminals) may have less pronounced expression profiles than the advanced tumors where the centroids were defined (our data set versus Sørlie and colleagues [2]). The small cluster between the normal-like and the basal-like group shows highest correlation with the ERBB2 centroid, although this group demonstrates extremely low expression of both the ERBB2 gene (Figure 1g) and basal-like genes (Figure 1f). The samples seem more normal-like, because a normal breast tissue sample clustered within this group and because they express genes previously identified as characteristic of normal-like samples. This small cluster illustrates the difficulties in assigning individual samples to subgroups based on correlation with centroids alone. The correlation of each sample with each of the centroids showed a continuous pattern over the sample set, which illustrates how each sample carries elements from different profiles (Figure 2). In Figure 1g, the ERBB2+ group on the far right side shows high expression of an ERBB2-related gene cluster and is, therefore, included in this group, despite the fact that its members also show correlation with the luminal B centroid. It is a matter of choice which group to assign these samples to. The ERBB2+ group is defined by a molecular event (overexpression of ERBB2), whereas the luminal B group is recognized by highly proliferating ER-positive tumors. Three samples do not show increased ERBB2 expression, but they are included in the ERBB2+ group based on their clustering. Although these samples express a low level of ERBB2 on the RNA level, it has been observed that the protein level (fluorescence in situ hybridization analysis) does not always correspond and thus may be high.
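The difficulty of assigning a sample from centroid correlations alone can be made concrete with a short sketch. It assumes Pearson correlation as the similarity measure and uses invented five-gene centroids; the function names and all numeric values are hypothetical and are not the intrinsic gene list or the centroids used in the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long expression vectors."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rank_centroids(sample, centroids):
    """Return (subtype, correlation) pairs, highest correlation first."""
    scores = {name: pearson(sample, c) for name, c in centroids.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical centroids over five genes (illustrative values only).
toy_centroids = {
    "luminal A": [1.0, 0.8, -0.5, -1.0, -0.2],
    "luminal B": [0.8, 0.5, -0.3, -0.8, 1.0],
    "basal-like": [-1.0, -0.9, 1.2, 0.9, 0.8],
}
toy_sample = [0.9, 0.7, -0.4, -0.9, 0.3]

ranking = rank_centroids(toy_sample, toy_centroids)
for name, r in ranking:
    print(f"{name}: r = {r:.2f}")
```

In this toy case the sample correlates most strongly with the luminal A centroid and nearly as strongly with the luminal B centroid. Reporting the full ranking rather than only the top centroid makes such borderline cases visible, which is the situation described above for the highly proliferating luminal samples.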
Conclusion
The combination of gene expression groups and clinical/histopathological parameters in this study has added more details and levels of understanding to our current picture of breast carcinomas. The long follow-up of patients revealed that the highly proliferating luminal group had an even worse prognosis than the basal-like and the ERBB2+ groups. The relatively good outcome for the first five years for the highly proliferating luminal group may be explained by the natural history of these tumors or by use of Tamoxifen. The strong association found between the basal-like group and TP53 mutations suggests that such mutations may be causal in these tumors, while TP53 mutations may be a later event in the highly proliferating luminal carcinomas. The high frequency of TP53 AI in the highly proliferating luminal group supports a mechanism other than TP53 mutations causing genomic instability in these tumors, and should be further explored. The characteristic gene expression pattern found in tumors carrying a TP53 mutation also needs further investigation in larger sets of samples with various mutations included.
Both TP53 mutation status and gene expression subgroups demonstrated strong prognostic impact, and may add valuable new information that complements the established prognostic markers. TP53 may help distinguish high-risk tumors in need of treatment from among small, node-negative tumors, which do not currently receive adjuvant treatment (that is, they are undertreated); on the other hand, it may help avoid treatment of individuals in patient groups that today may be overtreated. The choice of treatment may, for example, be influenced by avoiding drugs dependent on TP53-mediated apoptosis or, in the future, by using drugs that target and reactivate TP53. Although gene expression-based subgroups showed strong prognostic power, a more robust classification method is needed for future application in clinical practice. Development of a new integrated prognostic model that includes TP53 and gene expression groups could be useful in choosing treatment.
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
AL collected updated clinical data, participated in revising the histology, carried out the gene expression analysis, organized the TP53 mutation analysis, participated in the design of the study, performed some statistical analysis and drafted the manuscript. HZ carried out parts of the gene expression analysis. ØB participated in the design of the study and carried out the statistical survival analysis. JMN carried out the histology analysis. IRKB and TI collected clinical and molecular data. RK collected the patient material. ALBD conceived of the study, participated in its design and coordination and helped draft the manuscript. SSJ participated in the design of the study, coordinated the gene expression analysis and helped draft the manuscript. All authors read and approved the final manuscript.