Main

Breast cancer (BC), the most common cancer and the second leading cause of cancer death in women, represents a heterogeneous group of tumours with varied genotypic and phenotypic features, behaviour and response to therapy. This, in addition to the number and complexity of available treatment options, has made it difficult to choose the most appropriate treatment. Clinical decision making in personalised BC management requires robust and accurate risk stratification based not only on outcome prediction but also on a biological basis (Clark, 1994). Methods have been developed to assist in predicting patient outcome and to support clinical decision making in BC management. Examples of such methods include the Nottingham Prognostic Index (NPI) (Galea et al, 1992; Balslev et al, 1994; D'Eredita et al, 2001), St Gallen consensus criteria (Goldhirsch et al, 2009), the National Comprehensive Cancer Network (NCCN) guidelines (Carlson et al, 2006) and Adjuvant! Online (Ravdin et al, 2001).

The current NPI is based on a combination of histopathological examination of tumour size, lymph-node (LN) stage and tumour grading assembled in a prognostic index formula (Haybittle et al, 1982) and can be used as a risk stratifier in unselected cohorts of operable early-stage primary BC patients. Prognosis worsens as the NPI numerical value increases and by using cutoff points patients may be stratified into good, moderate and poor prognostic groups (Ellis et al, 1987; Blamey et al, 2007). The NPI has been confirmed after long-term follow-up (Galea et al, 1992), validated independently in large multicentre studies (Brown et al, 1993; Balslev et al, 1994) and revised to stratify patients into five prognostic groups (Blamey et al, 2007). However, the NPI cannot reveal the full clinical/survival outcome heterogeneity currently observed in BC and would benefit from greater sophistication to support more accurate personalised management of BC patients. It is now recognised that the biological characteristics of BC are important for clinical management and incorporation into the NPI could significantly improve the delivery of personalised medicine in BC patients.

Current data imply that BC is a heterogeneous group of diseases with complex and distinctive underlying molecular pathogenesis (Beckmann et al, 1997; Ellis et al, 1999; Lishman and Lakhani, 1999). Further support for this hypothesis is provided by gene expression profiling (GEP) studies that have identified distinct molecular tumour groups with direct clinical relevance (Perou et al, 1999; Sorlie et al, 2001; van de Vijver et al, 2002; van't Veer et al, 2003; Darb-Esfahani et al, 2009; Parker et al, 2009; Nielsen et al, 2010). While this provides further compelling evidence that tumour biology is a key variable required for decision making in personalised BC management, the heterogeneity within these groups and its incorporation with the currently validated variables and prognostic indices add complexity. There is also evidence that individual clinicopathologic prognostic factors behave differently in the different molecular subclasses; for instance, tumour grade and size, which have a significant prognostic value in the luminal/oestrogen receptor (ER)-positive classes, show a limited prognostic power in HER2-positive (Foulkes et al, 2009, 2010) and basal-like tumours (Rakha et al, 2010).

Although available data support incorporation of GEP, particularly multigene assays, in specific clinical settings, the difficulty in the integration of the clinicopathologic variables with the molecular assays, the reproducibility and the cost limit the clinical utility of this technology. An alternative approach is to initially classify BC into distinct molecular classes using a panel of proteins with known relevance to BC utilising the robust commonplace technology, immunohistochemistry (IHC), applied to routine formalin-fixed paraffin-embedded tumour samples.

As previously reported (Abd El-Rehim et al, 2005; Soria et al, 2010; Green et al, 2013), seven core BC classes were identified by evaluation of the expression levels for a selective panel of 25 BC-related biomarkers determined using IHC and supervised classification approaches based on the naïve Bayes classification performance (Soria et al, 2010). To make this classification easily applicable in routine practice, the number of markers was further reduced, and 10 markers were found to be the minimum number required to retain the classification. This formed the basis of the development of a fuzzy rule induction algorithm (using the methodology previously described in Rasmani et al, 2009) to classify the breast tumours into one of the seven classes (Abd El-Rehim et al, 2005; Soria et al, 2010; Green et al, 2013). The core molecular classes identified include three luminal classes of tumour characterised by high luminal CK7/8 and hormone receptor (HR) expression. Luminal A and Luminal B tumours show high expression of CK7/8, ER, HER3 and HER4 but are separated by lower levels of PgR expression in Luminal B. Luminal N tumours show differential expression of HER3 and HER4. The two basal classes of tumour, characterised by high basal expression, are separated by p53 protein expression levels: high p53 (Basal p53 altered) or low p53 (Basal p53 normal). The two HER2+ classes are characterised by HER2 overexpression and are either positive or negative for the expression of ER. These distinct molecular classes of BC showed a significant association with patient outcome.

In this study, we examine the hypothesis that molecular features of BC are a key driver of tumour behaviour and that the influence of established prognostic variables will vary between classes. As a consequence, application of established clinicopathologic prognostic variables will require development of bespoke prognostic indices to improve prediction of both clinical outcome and relevant therapeutic options. To examine this hypothesis, a comprehensive panel of biomarkers with relevance to BC, described above, has been applied to a large and well-characterised series of BC, using IHC and different multivariate clustering techniques, to identify the key molecular classes (phase 1 of NPI+ classification). Subsequently, each class was further stratified using a set of well-defined prognostic clinicopathologic variables. These variables were combined in bespoke formulae to prognostically stratify different molecular classes (phase 2 of NPI+ classification). Thus, NPI+ is based on a two-tier evaluation: the initial assessment determines the biological class of the tumour and is subsequently combined with a second-level analysis of traditional clinicopathologic prognostic variables, resulting in tailored (bespoke) NPI-like formulae for each biological class.

Patients and methods

Patients

A series of 1073 patients from the Nottingham Tenovus Primary Breast Carcinoma Series, aged 70 years or less and presenting with primary operable (stages I, II and III) invasive BC between 1986 and 1998, was used. This is a well-characterised consecutive series of patients who were uniformly treated according to standard clinical protocols (Abd El-Rehim et al, 2005; Rakha et al, 2008). All tumours were <5 cm in diameter on clinical/pre-operative measurement and/or on operative histology (pT1 and pT2). Women aged over 70 years were not included because of the increased confounding factor of death from other causes and because primary treatment protocols for these patients often differed from those for younger women. Adjuvant systemic therapies were offered according to the NPI and HR status (Galea et al, 1992). The NPI was calculated using the following formula: NPI = histological grade (1–3) (Rakha et al, 2008) + LN stage (1–3; 1=negative, 2=1–3 nodes positive, 3=⩾4 nodes positive) + (tumour size in cm × 0.2). No systemic therapy was offered to patients in the good prognostic group (NPI ⩽3.4). Patients in the moderate I group (NPI 3.41–4.4) with HR-positive tumours were offered hormonal therapy. Patients in the moderate II (NPI 4.41–5.4) and poor (NPI >5.4) groups received hormone therapy for HR-positive tumours and cytotoxic therapy (classical cyclophosphamide, methotrexate and 5-fluorouracil (CMF)) for HR-negative tumours if the patient was fit enough to tolerate chemotherapy. Hormonal therapy was given to 420 patients (39.0%) and chemotherapy to 264 (24.5%). Data relating to survival were collated in a prospective manner for those patients presenting after 1989 only. Breast cancer-specific survival (BCSS) was defined as the interval between the operation and death from BC; death from BC was scored as an event, and patients who died from other causes or were still alive were censored at the time of last follow-up.
This study was approved by the Nottingham Research Ethics Committee 2 under the title ‘Development of a molecular genetic classification of breast cancer’.
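
The NPI formula and the treatment groups described above can be sketched in a few lines of Python. This is a minimal illustration only: the function names are hypothetical, scores are rounded to two decimal places before comparison (an assumption, to match the two-decimal cutoffs in the text), and the poor group is taken to be any score above 5.4.

```python
def ln_stage_from_nodes(positive_nodes: int) -> int:
    """Map the number of positive lymph nodes to LN stage (1-3) as in the text."""
    if positive_nodes < 0:
        raise ValueError("number of positive nodes cannot be negative")
    if positive_nodes == 0:
        return 1
    return 2 if positive_nodes <= 3 else 3


def npi(grade: int, ln_stage: int, size_cm: float) -> float:
    """NPI = histological grade (1-3) + LN stage (1-3) + 0.2 x tumour size in cm."""
    if grade not in (1, 2, 3) or ln_stage not in (1, 2, 3):
        raise ValueError("grade and LN stage must each be 1, 2 or 3")
    return grade + ln_stage + 0.2 * size_cm


def npi_group(score: float) -> str:
    """Assign the prognostic group used for treatment allocation in the text."""
    s = round(score, 2)  # assumption: compare at two-decimal precision
    if s <= 3.4:
        return "good"
    if s <= 4.4:
        return "moderate I"
    if s <= 5.4:
        return "moderate II"
    return "poor"


# Example: grade 2, two positive nodes, 1.5 cm tumour
score = npi(2, ln_stage_from_nodes(2), 1.5)
print(round(score, 2), npi_group(score))
```
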

Biomarker assay

Immunohistochemical reactivity for 10 proteins, with known relevance in BC including those used in routine clinical practice, was previously determined using standard immunocytochemical techniques on tumour samples prepared as tissue microarrays (TMAs). These markers were chosen from a comprehensive panel of 25 markers used in our previous study (Abd El-Rehim et al, 2005) as the minimum number of markers that can maintain class membership and identify the same molecular classes (Green et al, 2013). The biomarkers used for classification were ER, progesterone receptor (PgR), cytokeratin (CK) 5/6, CK7/8, epidermal growth factor receptor (EGFR; HER1), c-erbB2 (HER2), c-erbB3 (HER3), c-erbB4 (HER4), p53 and Mucin 1. Levels of immunohistochemical reactivity were determined by microscopic analysis using the modified Histochemical score (H-score), giving a semiquantitative assessment of both the intensity of staining and the percentage of positive cells (values between 0 and 300) (McCarty et al, 1985; Goulding et al, 1995). For HER2, the American Society of Clinical Oncology/College of American Pathologists Guideline Recommendations for HER2 Testing in Breast Cancer were used for assessment (Wolff et al, 2007). Equivocal (2+) cases were confirmed by chromogenic in situ hybridisation (CISH) as previously described (Garcia-Caballero et al, 2010).
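
To illustrate how a score in the 0 to 300 range arises from staining intensity and percentage of positive cells, the sketch below uses the commonly quoted form of the H-score: the percentage of cells at each intensity level is weighted by that level (1 = weak, 2 = moderate, 3 = strong) and summed. The three-level weighting and the function name are assumptions made for illustration; the exact modified protocol is the one described in the cited references.

```python
def h_score(pct_weak: float, pct_moderate: float, pct_strong: float) -> float:
    """Sum of (intensity level x percentage of cells at that level); range 0-300."""
    percentages = (pct_weak, pct_moderate, pct_strong)
    if min(percentages) < 0 or sum(percentages) > 100:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    return 1 * pct_weak + 2 * pct_moderate + 3 * pct_strong


# Example: 10% weak, 20% moderate, 30% strong staining (40% of cells negative)
print(h_score(10, 20, 30))
```
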

Identification of biological class

As previously reported (Soria et al, 2010), six core BC classes, with an additional unclassifiable class, were obtained using a consensus clustering approach between different clustering methods. Briefly, the four-step methodology for elucidating core, stable classes of data from a complex, multidimensional data set was as follows: (1) A variety of clustering algorithms were run on the data set including Hierarchical, K-means, Partitioning around medoids, Adaptive resonance theory and Fuzzy c-means. (2) Where appropriate, the optimal number of clusters was investigated by means of cluster validity indices. (3) Concordance between clusters, assessed both visually and statistically, was used to guide the formation of stable ‘core’ classes of data. (4) A variety of methods were utilised to characterise the elucidated core classes. Concordance among solutions was evaluated using Cohen’s kappa coefficient (κ). For inspection of the patient characteristics in each class, the distribution of each variable in the class was compared with its distribution in the total sample, using boxplots. A conventional multilayer perceptron artificial neural network model was utilised such that individual H-scores derived from the TMA analysis of the clinical samples were set as inputs and the class was set as the output using Boolean notation. This allowed the identification of markers that drive membership of a given class and that discriminate the class from the others.
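
The concordance measure mentioned above, Cohen's kappa, compares the observed agreement between two labellings with the agreement expected by chance. The following sketch computes it for two cluster assignments of the same patients. It assumes the cluster labels from the two algorithms have already been matched to a common naming (clustering algorithms number their clusters arbitrarily); the function name and toy labels are illustrative and not taken from the study.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two label sequences over the same items."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed proportion of items on which the two labellings agree
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from the marginal label frequencies
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both labellings use a single identical label
    return (observed - expected) / (1 - expected)


# Toy example: two algorithms assigning six patients to classes 1 and 2
kmeans_labels = [1, 1, 1, 2, 2, 2]
pam_labels = [1, 1, 2, 2, 2, 2]
print(round(cohen_kappa(kmeans_labels, pam_labels), 3))
```
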

Three luminal subgroups (Luminal-A (n=370), Luminal-N (n=146) and Luminal-B (n=123)), two basal classes (Basal p53 altered (n=126) and Basal p53 normal (n=87)) and a HER2-positive class (HER2+ (n=145)) were highlighted. In a subsequent study (Green et al, 2013), the HER2+ class was divided into two subgroups (HER2+/ER+ (n=60) and HER2+/ER− (n=85)).

Development of NPI formulae for each biological class

A Cox regression analysis was performed for the overall population for a selection of available and well-established histopathologic prognostic factors. The variables tested were coded numerically, as either categorical or continuous values depending on the variable. The variables included were number of positive nodes (N) (including nodal stage (St)), tumour size (Sz), tumour grade (including its components, namely tubule formation, nuclear pleomorphism and mitotic index (M); Rakha et al, 2008), lymphovascular invasion (LVI), ER status, PgR status and HER2 status. The NPI formulae were used to determine the prognostic effect in each biological class. The NPI+ score is determined using the β values generated by the Cox regression. These β values indicate the magnitude of each variable's influence on the hazard.
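
One way to read the use of β values described above is as the linear predictor of a Cox model: each retained variable is multiplied by its class-specific β and the products are summed. The sketch below shows that arithmetic only. The weighted-sum form is an assumption about how the score is assembled, and the β values shown are hypothetical placeholders chosen for easy arithmetic, not coefficients from this study.

```python
def npi_plus_score(betas: dict, patient: dict) -> float:
    """Weighted sum of a patient's variables using class-specific beta values."""
    missing = set(betas) - set(patient)
    if missing:
        raise KeyError(f"patient record lacks variables: {sorted(missing)}")
    return sum(beta * patient[name] for name, beta in betas.items())


# HYPOTHETICAL beta values for one biological class (illustration only)
example_betas = {"N": 0.5, "Sz": 0.25, "LVI": 1.0}

# Patient with 2 positive nodes, a 2.0 cm tumour and LVI present (coded 1)
example_patient = {"N": 2, "Sz": 2.0, "LVI": 1}

print(npi_plus_score(example_betas, example_patient))
```

In this reading, a separate dictionary of β values would be held for each of the seven biological classes, so the same patient variables can carry different weights depending on class.
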

Survival analysis

After identification of the relevant parameters and their influence upon the prognostic model within the context of each class, each patient was assigned an NPI+ value to allow stratification into subgroups of prognostic relevance. In this preliminary work, the groups were assigned according to the integer value of the NPI+ score. The resulting groups were then used to stratify the patients of each biological class in Kaplan–Meier survival curves.
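
The two steps described above, grouping by the integer value of the score and estimating a survival curve per group, can be sketched as follows. This is an illustration under stated assumptions: "integer value" is taken to mean the integer part of the score, the Kaplan–Meier estimator is the standard product-limit form, and the function names and toy data are hypothetical.

```python
import math
from collections import defaultdict


def group_by_integer_score(scores):
    """Map each integer part of a score to the indices of patients in that group."""
    groups = defaultdict(list)
    for index, score in enumerate(scores):
        groups[math.floor(score)].append(index)
    return dict(groups)


def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at each time with a death.

    times:  follow-up in months; events: 1 = death from BC, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy example: five patients in one group
print(group_by_integer_score([2.5, 3.1, 2.9]))
print(kaplan_meier([5, 10, 10, 15, 20], [1, 0, 1, 0, 1]))
```
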

Results

Biological class

In this study, using a consensus clustering approach and a panel of routinely applicable immunohistochemical markers with relevance to BC, seven core molecular BC classes were identified. These classes included 370 patients in class 1 (Luminal A); 146 in class 2 (Luminal N); 123 in class 3 (Luminal B); 126 in class 4 (Basal p53 altered); 87 in class 5 (Basal p53 normal); 60 in class 6 (HER2+/ER+) and 85 in class 7 (HER2+/ER−) (Table 1).

Table 1 Clinicopathological parameters of the seven breast cancer biological classes

Development of NPI formulae for each class

After successive removal of the least significant parameters (that is, those with P-values above 0.2) over several steps, the final factors with the most significant results, according to their β value in the Cox regression analysis, were identified. The Cox proportional hazards regression identified six clinicopathologic prognostic factors of importance within the population in predicting BCSS: N (nodal number), St (stage), Sz (size), M (mitosis), LVI and PgR. Once these factors were identified, the population was split into the biological classes, as determined above, and Cox regression analyses were performed independently for each class to obtain the most significant clinicopathologic prognostic factors and their β values in the context of the classes. Kaplan–Meier analysis showed that using formulae based on N, Sz, St, M, LVI and PgR for each biological BC class provides improved and highly significant patient outcome stratification compared with the traditional NPI (Figure 1A–G). These variables were combined into formulae that vary among the classes, giving bespoke NPI-like formulae for each of the seven biological classes and forming a new biomarker-based prognostic index (NPI+). This NPI+ was then used to predict outcome (BCSS) in the different molecular classes, and the NPI+ outcome prediction was compared with that achieved by the traditional NPI in each of the biological classes (Figure 1A–G). In addition to improved outcome prediction compared with the traditional NPI in each class, NPI+ provided more clinically relevant stratification, splitting each class into two or three groups compared with the six groups of the NPI.
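
The successive removal of the least significant parameters described above corresponds to a backward elimination loop. The sketch below shows the control flow only: the model fit is passed in as a callable that returns a P-value per variable, because fitting a Cox model requires a statistics package that is not assumed here. The function names, the stand-in fitting function and its P-values are all hypothetical.

```python
from typing import Callable, Dict, List


def backward_eliminate(
    variables: List[str],
    fit_p_values: Callable[[List[str]], Dict[str, float]],
    threshold: float = 0.2,
) -> List[str]:
    """Refit repeatedly, dropping the least significant variable above threshold."""
    kept = list(variables)
    while kept:
        p_values = fit_p_values(kept)  # refit the model on the current variables
        worst = max(kept, key=lambda name: p_values[name])
        if p_values[worst] <= threshold:
            break  # every remaining variable meets the threshold
        kept.remove(worst)
    return kept


# Stand-in for a Cox regression fit: returns fixed, HYPOTHETICAL P-values
def fake_fit(names: List[str]) -> Dict[str, float]:
    made_up = {"N": 0.001, "Sz": 0.03, "ER": 0.6, "HER2": 0.25}
    return {name: made_up[name] for name in names}


print(backward_eliminate(["N", "Sz", "ER", "HER2"], fake_fit))
```

In a real analysis the P-values would change at every refit, which is why the loop calls the fitting function again after each removal rather than filtering once.
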

Figure 1

Patient stratification with the classic NPI (left) compared with NPI+ (right) in each of the biological classes. (A) Class 1 Luminal A, (B) class 2 Luminal N, (C) class 3 Luminal B, (D) class 4 Basal p53 altered, (E) class 5 Basal p53 normal, (F) class 6 HER2+/ER+ and (G) class 7 HER2+/ER−. Abbreviations: GPG=good prognostic group; M1 and M2=moderate prognostic groups 1 and 2; PPG=poor prognostic group; VPPG=very poor prognostic group. Time is shown in months.

Prediction of adjuvant therapy benefit

Although the current study is not derived from a randomised clinical trial sample, the cases were stratified based on the adjuvant systemic therapy in an attempt to assess the potential value of using NPI+ to predict outcome in the different classes. The number of the patients receiving either endocrine therapy or chemotherapy in each of the NPI+ classes is summarised in Table 2. When the cohort was stratified according to systemic therapy, NPI+ was found to predict good vs adverse outcome for all of the biological classes in both hormone therapy-treated patients (Figure 2) and chemotherapy-treated patients (Figure 3) with the exception of chemotherapy benefit in class 5 (Basal p53 normal) in which few deaths were observed in the group as a whole (Figure 3C). This approach is superior to use of the traditional NPI (Figure 4), which provides overall patient stratification but lacks similar ability to predict adverse outcome effectively in specific molecular classes.

Table 2 The number of patients in NPI+ classes receiving adjuvant systemic therapy
Figure 2

Stratification using NPI+ of those patients who received adjuvant chemotherapy in HER2-positive (regardless of ER expression; left) and HER2-positive ER-negative (right) classes. Time is shown in months.

Figure 3

Stratification using NPI+ of those patients in the various classes who received adjuvant chemotherapy for groups (A) luminal (1+2+3), (B) basal (4+5), (C) the two basal groups (patients with Basal p53 normal tumours had a good survival overall and no additional stratification could be achieved), (D) HER2 (6+7), (E) HER2+/ER+ and (F) HER2+/ER−. There were too few luminal cases receiving chemotherapy to allow development of NPI+ formulae for each of the luminal groups. It can be seen that NPI+ identifies patients with favourable vs poor outcome in all assessable classes apart from the Basal p53 normal class. Time is shown in months.

Figure 4

Survival for each of the classic NPI groups in the whole patient set. Time is shown in months.

Discussion

Improved tailoring of treatment for BC requires integration of clinical, pathologic and cancer biological information to ensure that all known variables that could potentially influence patient outcome and response to therapeutic treatments are considered. Consequently, there has been increasing interest in the clinical utility of multigene assays, such as the Oncotype DX (Paik et al, 2004) and the MammaPrint test (van't Veer et al, 2002), and their integration into BC management strategies in certain clinical settings. Although the concept of a molecular taxonomy of BC using global GEP has attracted the attention of the scientific community, its incorporation into routine clinical decision making has not proved successful for a number of reasons. These include cost, reproducibility, validation and lack of suitability for routine clinical settings.

Previously, we (Abd El-Rehim et al, 2005; Green et al, 2013) and others (Callagy et al, 2003; Ambrogi et al, 2006) have used IHC and TMA technology to develop a proposal for a modern molecular classification of human BC comparable to that produced by gene expression microarrays. By way of contrast, our methodology is expected to provide not only a simple and cost-effective approach but also a robust, feasible and reproducible method for BC risk stratification. In this study, we hypothesised that the combination of molecular taxonomy using a panel of IHC biomarkers with traditional prognostic clinicopathologic variables can produce a ‘state of the art’ approach to risk stratification in a reproducible and balanced way. This approach is driven by our improved understanding of BC biology and its impact on tumour behaviour and response to therapy, in addition to the expanding field of systemic and targeted therapy and the consequent difficulty in predicting outcome in these complex circumstances.

Initially, the combined protein expression profiles of 25 well-characterised biologically relevant biomarkers were assessed. These included proteins involved in different cellular functions and disease pathways including cell proliferation, adhesion, signal transduction and structural proteins. Using a consensus of clustering methodology and modelling techniques, we have developed a clinically based classification of BC based on 10 biomarkers (Green et al, 2013). This has confirmed seven classes, comparable to those identified with gene expression analysis. These key biological phenotypes of BC can be identified using standard, widely available IHC technology and are associated with significantly different patient outcomes. Also of importance is the observation that 93% of BC cases clearly exhibit core class membership criteria, while only 7% remain unclassified. In addition, we believe that this classification system reflects the complex molecular portrait of BC better than could be achieved using the three-marker panel (ER, PgR and HER2) assessed in routine practice. In the second phase, we used the existing clinicopathologic variables to stratify each biological class into clinically distinct subgroups using bespoke NPI-like formulae, known as the NPI+. The parameters used for the NPI+ for each of the seven core molecular classes are not only different for each class but also incorporate additional well-validated variables such as LVI (Rakha et al, 2012) and PgR (Prat et al, 2013) that were not considered in the generation of the traditional NPI. The use of such formulae not only overcomes problems associated with the variable prognostic power of each individual clinicopathologic factor in the different molecular classes but also provides a way of incorporating biological and clinical variables in a scientifically and clinically relevant way (Dunkler et al, 2007).

The aim of the study is not revolution but evolution; the aim is not to replace current proven and established methods but to build on and improve the current prognostic methods by combining the well-established powerful clinicopathologic variables with novel biomarker information. In developing our prognostic toolkit, we aimed to provide an assay compatible with routinely processed formalin-fixed paraffin-embedded tissue, offering a level of sensitivity and predictive capability better and more sophisticated than present classification systems. Our results demonstrate that NPI+ not only provides prognostic information with consideration of biological features but also performs better than the traditional NPI and can subsequently help guide treatment decisions in a personalised manner (Table 3). We believe NPI+ combines the breadth of Adjuvant! Online (Olivotto et al, 2005) with greater direct clinical validation, while also having the depth of clinical and biological relevance of the current commercial solutions. The NPI+ will enable a more sophisticated personalised treatment tool for BC patients by: (1) improving prognostic analysis, (2) predicting the risk of disease recurrence, (3) providing health economic savings through appropriate targeting of treatment and (4) using routine clinical samples and robust laboratory methods that integrate easily into current international clinical practice.

Table 3 Comparison of the traditional NPI with NPI+

This study, however, has limitations. Therapy decisions for the patients in the current cohort were based on similar prognostic markers at the time of first diagnosis, so adjuvant therapy may have had a confounding effect. There may be an underrepresentation of cases without chemotherapy in the group of patients with a low initial NPI (good prognostic group); likewise, the number of patients without chemotherapy among those with a high initial NPI (poor prognostic group) is rather low. This is a recognised limitation of the assessment of prognostic markers and risk stratifiers in the current era, in which depriving patients of adjuvant treatment cannot be ethically justified. Initial results of this study indicate a prognostic value when patients were stratified based on the systemic therapy. Currently, additional cohorts are being tested to provide sufficient numbers in the different treatment subgroups. Although phenotypic classification into core luminal, basal and HER2 classes is possible using smaller panels of three to five antibodies (Carey et al, 2006; Cheang et al, 2008), such limited panels cannot further subclassify these core groups. Our study demonstrates that a larger panel of 10 biomarkers achieves a higher level of stratification that may have direct and important clinical relevance.

In conclusion, this study provides proof-of-principle evidence for the development of a novel prognostic index (NPI+) that combines both established clinicopathologic and biological features of BC. Validation in different national and international tumour series is currently underway. Furthermore, its clinical utility and impact on health economics will typically be assessed in a prospective randomised clinical trial.