Main

The majority of patients with early breast cancer are recommended adjuvant systemic therapy after primary surgery to reduce the risk of breast cancer recurrence and to increase the likelihood of cure. There is demonstrable benefit in the addition of adjuvant chemotherapy to endocrine therapy in women with oestrogen receptor (ER)-positive early breast cancer (Early Breast Cancer Trialists' Collaborative Group, 2005). However, patient selection is important because some women may be successfully treated with endocrine therapy alone, and spared the potentially serious toxicities of chemotherapy and their effect on quality of life. The proportion of patients in this group has increased markedly in recent years owing to the earlier diagnosis of breast cancer and the better prognosis that is associated with early detection. Endocrine therapy has side effects, but is comparatively well tolerated. The difficulty lies in accurately identifying which women with early ER-positive, HER2-negative breast cancer can be spared chemotherapy. Combinations of prognostic and predictive clinicopathologic factors have been developed into tools that allocate patients into risk categories with the goal of improved stratification.

The Nottingham Prognostic Index (NPI) is one of the earliest tools to be developed (first presented in 1982, Haybittle et al, 1982; Lee and Ellis, 2008). It was designed through retrospective multivariate regression analysis in a study of 387 women with primary, operable breast cancer and has been independently and prospectively validated (Balslev et al, 1994; D'Eredita et al, 2001; Blamey et al, 2007). It uses tumour size, grade and nodal burden to classify patients into groups as good, moderate and poor prognosis, with NPI <3.4, 3.4–5.4 and >5.4, respectively. The original 15 year overall survival estimates for each prognostic group of 80%, 42% and 13%, respectively, were based on data from the 1980s (Galea et al, 1992). Updated survival estimates from the 1990s have been published together with additional divisions of the prognosis groups (Table 1) (Blamey et al, 2007).

Table 1 NPI 10-year breast cancer-specific survival (1990–1999; Blamey et al, 2007)

Adjuvant! Online (AoL) is a popular prognostic and predictive tool freely available on the Internet (http://www.adjuvantonline.com). It uses the same factors as the NPI as well as ER status, age and comorbidity to project the likelihood of mortality and disease recurrence at 10 years, and the magnitude of benefit to be gained by adjuvant therapy for individual patients (Ravdin et al, 2001). The prognostic information is predominantly based on data from the Surveillance, Epidemiology and End-Results registry (http://www.seer.cancer.gov), and the projections of efficacy of adjuvant therapy are based on data from the Early Breast Cancer Trialists’ Collaborative Group overviews of randomised clinical trials (Early Breast Cancer Trialists’ Collaborative Group, 1998a, 1998b, 2005). It has been independently validated using a large, population-based database (Olivotto et al, 2005). Its ease of use and the personalised information it generates, quantitatively estimating prognosis with and without adjuvant therapy, are reasons for its popularity.

An important recent focus for translational researchers has been the discovery of molecular markers that better predict individual residual risk after endocrine therapy in patients with ER-positive disease. A number of assays incorporating multiple molecular markers have been developed chiefly for this purpose. The Oncotype Dx or 21-gene recurrence score (RS) is a multi-gene assay that tests for overexpression of 21 genes with reverse-transcriptase PCR (Paik et al, 2004). An algorithm is used to calculate the RS and determine risk category (low: <18, intermediate: 18–<31 or high: ≥31). The RS has been validated in ER-positive breast cancer, in node-negative and node-positive populations, and in tamoxifen-treated, AI-treated and chemotherapy-treated populations (Habel et al, 2006; Paik et al, 2006; Goldstein et al, 2008; Albain et al, 2010; Dowsett et al, 2010). Prospective validation of the predicted benefit from chemotherapy in the higher RS patients is taking place in the TAILORx trial (Zujewski and Kamin, 2008). The almost complete independence of the molecular risk information in the RS from the clinicopathologic information encompassed in AoL means that more accurate risk estimates can be made by integrating the two types of data (Goldstein et al, 2008; Tang et al, 2011).

The immunohistochemical (IHC) 4+C score is a prognostic tool based on quantitative values of four standard laboratory assays (ER, PR, HER2 and Ki67), and the clinicopathologic parameters of tumour grade, size, nodal burden, patient age, and treatment with AI or tamoxifen (hence IHC4+clinical (IHC4+C) score) (Cuzick et al, 2011). The IHC4+C score was developed in a retrospective analysis from TransATAC after recognition that these IHC assays independently hold prognostic power in endocrine-treated patients (Dowsett et al, 2008; Dowsett et al, 2011b). IHC4+C gives a prediction of the residual risk of distant recurrence at 9 years in postmenopausal women with node-negative, hormone receptor-positive disease treated with 5 years of adjuvant endocrine therapy (Cuzick et al, 2011). In the TransATAC data set, the score was found to contain a degree of prognostic information comparable to that of the RS, and it was validated in an independent data set. Further assessment to establish broader applicability was advised (Cuzick et al, 2011).

Like many other institutions, we use AoL to assist in decision-making concerning adjuvant systemic therapy. For women with ER-positive breast cancer with good or intermediate prognosis, we calculate the predicted benefit of chemotherapy in addition to endocrine therapy and use this in making treatment recommendations. The objective of this study was to compare the prognostic information gained from IHC4+C with that from the AoL and NPI, in a group of postmenopausal women with early breast cancer receiving treatment at our institution, to determine whether assessment of IHC4+C assists in the selection of patients who may be safely treated with adjuvant hormone therapy alone and spared the side-effects of chemotherapy.

Patients and Methods

Patient population

We prospectively recorded post-operative clinicopathologic data for women with early breast cancer consecutively presenting to our single institution during 1 year (February 2010 to February 2011). Patients were included if they had hormone receptor-positive, HER2-negative early breast cancer that was either node-negative or pathological stage N1 (1–3 axillary lymph nodes containing macrometastases (deposit >2 mm)), and had undergone surgery to remove the breast tumour, with tumour-free surgical margins. Patients receiving neoadjuvant therapy were excluded. Axillary lymph nodes containing only micrometastases were classified as not involved. HER2-negative was defined as being HER2 0 or 1+ on IHC staining, or IHC2+ and either chromogenic or fluorescence in-situ-hybridisation (CISH or FISH) negative (Wolff et al, 2007). Only postmenopausal women were included, because the IHC4+C score has not yet been fully validated in premenopausal women. Postmenopausal was defined as: women aged ≥60 years; women aged ≥50 years and amenorrhoeic for ≥12 months in the absence of hysterectomy, intrauterine contraceptive device (IUCD) or endocrine therapy; women who had undergone bilateral oophorectomy; or women aged 50–59 whose menopausal status was indeterminate (for example, because of hysterectomy or IUCD) and in whom there was biochemical evidence of menopause (elevated FSH and suppressed oestradiol). Patient age was recorded at the time when the decision on the patient’s adjuvant management was made. Women over the age of 75 were excluded. Women who had incomplete axillary staging or no axillary staging surgery performed were excluded, as were women with bilateral breast cancer, because the prognostic estimates of both IHC4+C and AoL were anticipated to be unreliable in these settings.

Study design

The study was a retrospective comparison of the AoL, NPI and IHC4+C score, to determine whether the IHC4+C contributed to decision-making concerning adjuvant therapies. Clinicopathologic data was collected prospectively over the 1-year period, after which the IHC4+C score, AoL and the NPI were retrospectively compared. The study was approved by our institution’s clinical audit committee.

Study endpoints

The primary endpoint was the proportion of patients reallocated from AoL-defined intermediate risk of distant recurrence at 10 years, to either high or low risk, by application of the IHC4+C score. The first secondary endpoint was the proportion of patients reallocated from NPI-defined moderate risk to either high or low risk, by application of the IHC4+C score. Other secondary endpoints were correlation between AoL and IHC4+C, correlation between the NPI and IHC4+C, and correlation between the NPI and AoL.

The immunohistochemical (IHC)4+C

We incorporated standard IHC tests on formalin-fixed paraffin-embedded breast tumour tissue in combination with the clinicopathologic factors of tumour grade, size, nodal burden, patient age and AI treatment in calculating the IHC4+C score for each patient. We used the same laboratory methods that were described by the TransATAC research group in deriving the IHC4, other than for Ki67, where the MIB1 antibody was used instead of SP6 after establishing their close similarity (Zabaglo et al, 2010; Cuzick et al, 2011). The ER was quantified by the H-score, which is defined as the percentage of cells showing weak IHC staining, added to two times the percentage of cells staining moderately, added to three times the percentage of cells staining intensely. The H-score was then divided by 30 to arrive at a variable between 0 and 10 (ER10). An H-score of more than 1 is positive. PgR was quantified by the percentage of cells staining positive, and this was divided by 10 to obtain a variable between 0 and 10 (PgR10). HER2 assessment was given as either a positive (IHC3+, or IHC2+ and CISH or FISH positive) or negative result (IHC0 or 1+, or IHC2+ and CISH or FISH negative). Ki67 was quantified as the percentage of positively staining cancer cells, using the MIB1 antibody (Zabaglo et al, 2010).
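As an illustration only, the ER and PgR variables described above can be sketched in Python as follows. The function names (`h_score`, `er10`, `pgr10`) and the input checks are assumptions introduced for this sketch; they are not part of the published method.

```python
def h_score(pct_weak, pct_moderate, pct_strong):
    """H-score: 1 x % weakly + 2 x % moderately + 3 x % intensely staining cells.

    Inputs are percentages of tumour cells (0-100); the result lies in 0-300.
    The range checks below are an assumption added for safety in this sketch.
    """
    parts = (pct_weak, pct_moderate, pct_strong)
    if any(p < 0 for p in parts) or sum(parts) > 100:
        raise ValueError("staining percentages must be >= 0 and sum to <= 100")
    return pct_weak + 2 * pct_moderate + 3 * pct_strong


def er10(er_h_score):
    """Rescale the ER H-score (0-300) to the 0-10 variable ER10."""
    return er_h_score / 30


def pgr10(pct_positive):
    """Rescale the percentage of PgR-positive cells (0-100) to the 0-10 variable PgR10."""
    return pct_positive / 10
```

For example, a tumour with 10% weak, 20% moderate and 70% intense ER staining has an H-score of 260, and a tumour with 50% PgR-positive cells has a PgR10 of 5.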

The IHC4 score was calculated by the algorithm:

As described in the published methods for assessing IHC4 in the independent validation cohort (Cuzick et al, 2011), we also multiplied Ki67 by 0.4 to scale for the difference in scoring technique. This is because Ki67 scores are on average 2.5 times higher with manual reading than with the image analysis method from which this algorithm was derived (1/2.5 = 0.4).

The clinical score was calculated by the published algorithm:

with Nj, Tj, Grj and Agej representing categories of nodal status, tumour size, grade and patient age, respectively, and Ana representing treatment with anastrozole as opposed to tamoxifen. The IHC4+C score was the sum of the IHC4 and the clinical score.

Adjuvant! Online

We prospectively calculated AoL predictions of mortality and recurrence for each patient and entered this into our database. We collected information regarding patient comorbidity, but owing to the large differences in breast cancer-specific mortality predicted by AoL when conditioned for competing mortality risk and also because neither the IHC4+C score nor NPI corrects for competing mortality risk, we decided a priori to use an assumption of perfect health for all patients. We used ‘adjuvant AI for 5 years’ in assessing the predicted effectiveness of adjuvant endocrine therapy. We took the prediction of death from breast cancer at 10 years and subtracted the additional benefit conferred by 5 years adjuvant endocrine therapy to calculate the residual risk of breast cancer mortality at 10 years despite adjuvant endocrine therapy.

Nottingham Prognostic Index

The NPI was calculated for each patient postoperatively, and was based on operative pathological findings. The NPI score is 0.2 × tumour size (cm) + grade + nodal status (pN0=1, pN1–3=2, pN≥4=3). Patients with an NPI <3.4 are considered to be at low risk, 3.4–5.4 at intermediate risk and >5.4 at high risk.
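The NPI calculation above can be sketched as follows. This is a minimal illustration, not the software used in the study: the function names are hypothetical, and rounding the score to two decimal places before applying the cutoffs is an assumption made here to avoid floating-point artefacts at the 3.4 and 5.4 boundaries.

```python
def npi_nodal_score(positive_nodes):
    """Nodal component of the NPI: 0 nodes -> 1, 1-3 nodes -> 2, 4 or more -> 3."""
    if positive_nodes < 0:
        raise ValueError("number of positive nodes cannot be negative")
    if positive_nodes == 0:
        return 1
    if positive_nodes <= 3:
        return 2
    return 3


def npi_score(tumour_size_cm, grade, positive_nodes):
    """NPI = 0.2 x tumour size (cm) + grade (1-3) + nodal score (1-3)."""
    if grade not in (1, 2, 3):
        raise ValueError("grade must be 1, 2 or 3")
    return 0.2 * tumour_size_cm + grade + npi_nodal_score(positive_nodes)


def npi_risk_group(score):
    """Risk group using the cutoffs in the text: <3.4 low, 3.4-5.4 intermediate, >5.4 high."""
    score = round(score, 2)  # assumption: guard against floating-point noise at the cutoffs
    if score < 3.4:
        return "low"
    if score <= 5.4:
        return "intermediate"
    return "high"
```

With these cutoffs, a grade 2 tumour with 1–3 involved nodes would need to exceed 7 cm to reach the high-risk group (0.2 × size + 2 + 2 > 5.4), which is the arithmetic behind the small number of NPI high-risk patients in a population without ≥4 involved nodes.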

Statistical analysis

The estimate of 9-year residual distant recurrence risk after adjuvant endocrine therapy generated by the IHC4+C score was corrected to 10 years (assuming a constant recurrence event rate). The 10-year breast cancer-specific mortality after endocrine therapy that was generated from AoL was multiplied by 1.25, to arrive at a figure representing 10-year residual distant breast cancer recurrence risk after adjuvant endocrine therapy. The multiplier of 1.25 was derived through comparison of distant recurrence and breast cancer-specific survival outcomes in patients treated with 5 years of adjuvant tamoxifen or anastrozole monotherapy in the ATAC study, where the number of distant recurrence events at 10 years was 25% greater than the number of breast cancer-specific mortality events at 10 years (Cuzick et al, 2010). The distant recurrence risk at 10 years after adjuvant endocrine therapy was chosen as the common measure so that the two tools could be directly compared. It also correlates with RS, which meant that cutoffs between risk groups similar to those used for RS could be applied to our results. We used cutoffs of 10 and 20% risk of distant recurrence at 10 years because these correspond to the risks of distant recurrence at 10 years at RS values of 18 and 30, which bound the RS intermediate-risk category and have been used in previous comparisons (Goldstein et al, 2008; Dowsett et al, 2010; Tang et al, 2011).
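The conversions described above can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure used: the text does not specify how the constant event rate was applied, so a constant hazard is assumed here, and the handling of values exactly equal to 10% or 20% (placed in the intermediate group) is also an assumption. All function names are hypothetical, and risks are expressed as percentages.

```python
AOL_MORTALITY_TO_RECURRENCE = 1.25  # multiplier derived from ATAC, as stated in the text
LOW_CUTOFF = 10.0   # % risk of distant recurrence at 10 years
HIGH_CUTOFF = 20.0  # % risk of distant recurrence at 10 years


def aol_distant_recurrence_risk(bc_mortality_10y, endocrine_benefit):
    """AoL 10-year residual distant recurrence risk (%).

    Residual breast cancer mortality is the AoL mortality prediction minus the
    predicted benefit of 5 years of endocrine therapy, then scaled by 1.25.
    """
    residual_mortality = bc_mortality_10y - endocrine_benefit
    return AOL_MORTALITY_TO_RECURRENCE * residual_mortality


def nine_to_ten_year_risk(risk_9y):
    """Extend a 9-year risk (%) to 10 years, ASSUMING a constant hazard."""
    recurrence_free_9y = 1.0 - risk_9y / 100.0
    recurrence_free_10y = recurrence_free_9y ** (10.0 / 9.0)
    return 100.0 * (1.0 - recurrence_free_10y)


def risk_category(risk_10y):
    """Low (<10%), intermediate (10-20%) or high (>20%); boundary handling is an assumption."""
    if risk_10y < LOW_CUTOFF:
        return "low"
    if risk_10y <= HIGH_CUTOFF:
        return "intermediate"
    return "high"
```

For example, an AoL breast cancer mortality prediction of 12% with a predicted endocrine therapy benefit of 4% gives a residual mortality of 8% and a residual distant recurrence risk of 10%, which falls at the lower edge of the intermediate group under the boundary convention assumed here.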

The relationship between AoL and IHC4+C was assessed on a continuous basis graphically and by Spearman’s Rank correlation. The Stuart–Maxwell test of marginal homogeneity was used to test for significant differences between the prognostic scores using Stata 11.2 for Windows. All patients meeting eligibility criteria were included in these analyses. The planned sample size was 100 patients, which was estimated to be a representative number and attainable within our single institution. It also meant that an odds ratio of approximately three could be detected, assuming there were 40% discordant pairs.
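For readers unfamiliar with the rank correlation used here, the following is a minimal pure-Python sketch of Spearman’s coefficient, computed as the Pearson correlation of average ranks so that tied values are handled. It is illustrative only; the study analyses were run in the statistical package named above, and the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation between two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

Applied to paired per-patient risk estimates from two tools, a value near 1 indicates that the tools rank patients in a similar order even when their absolute risk estimates differ.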

Results

Patient and disease characteristics

Two hundred and four patients were considered potentially eligible for the study on the basis of having hormone receptor-positive, HER2-negative early breast cancer and ≤3 axillary lymph nodes containing metastases. In all, 84 patients were excluded due to clinical eligibility factors (5 had bilateral breast cancer, 2 were male, 15 were aged >75 years, 59 were premenopausal, 3 had unstaged axillary lymph nodes). One hundred and twenty patients met eligibility criteria for the study. Of these, 19 patients could not be included because of difficulty in obtaining tissue from a different centre or IHC4+C processing difficulty, leaving a total of 101 evaluable patients. The patient demographic and clinicopathologic characteristics are summarised in Table 2. The median patient age was 63 years, and the median tumour size was 20 mm.

Table 2 Demographic and clinicopathologic characteristics of the study population

Comparison between AoL score and IHC4+C score

Figure 1 shows the agreement between AoL and IHC4+C estimates of risk of distant recurrence at 10 years. The Spearman’s Rank correlation was 0.84. Overall AoL rated patients at higher risk than IHC4+C. Agreement between the categories created for the IHC4+C score and the AoL score was 68.3% (95% CI: 58.3–77.2)(Table 3). Within the AoL-defined intermediate-risk group, 58% of patients were reclassified by application of the IHC4+C. Risk was lower by IHC4+C, with 15/26 patients reclassified as low risk, 11/26 remaining classified as intermediate risk and no patient reclassified as high risk. IHC4+C also frequently restratified patients from an AoL-defined high-risk group to a low-risk group. Forty-seven percent of the patients in the AoL-defined high-risk group were downgraded; 13/32 to intermediate-risk group and 3/32 to low-risk group by the use of IHC4+C. In contrast, the risk stratification was found to be highly consistent between the low-risk groups, with 41/43 patients stratified by AoL as low-risk being also stratified as low-risk by IHC4+C. Only two patients were reclassified from low to intermediate-risk by the use of IHC4+C, and none were reclassified to high-risk.

Figure 1 Spearman’s Rank correlation between AoL and IHC4+C estimates of risk of distant recurrence at 10 years.

Table 3 Comparison of IHC4+C score with Adjuvant! Online in assessing residual risk of distant recurrence at 10 years after adjuvant endocrine therapy

Comparison between the NPI and IHC4+C score

Agreement between the IHC4+C and NPI score risk groups was 60.4% (95% CI: 50.2–70.0; Table 4). The IHC4+C restratified the majority of the patients (37/59) in the NPI-defined intermediate-risk group to either a high- or low-risk group. About 40% of these patients (24/59) were moved from an intermediate to a low-risk group, 13/59 were moved to high-risk and 22/59 remained in the intermediate-risk group. Only 3/38 patients in the NPI low-risk group were reclassified by IHC4+C as intermediate-risk and none as high-risk. Only four patients were classified as high-risk by NPI, as none of our patients had ≥4 positive nodes and few had grade 3 tumours, and in these circumstances only a very large tumour gives an NPI of >5.4 (high risk).

Table 4 Comparison of the IHC4+C score with the NPI in assessing risk of breast cancer distant recurrence

Correlation between AoL and NPI score

There was moderate agreement between the AoL and NPI at 65.3% (95% CI: 55.2–74.5; Table 5). The predominant area of disagreement was the NPI-defined intermediate-risk group, with 28/55 patients classified as intermediate-risk by NPI being reallocated to a high-risk group by AoL. Only four patients were classified as high-risk by both NPI and AoL. The greatest agreement was seen in the low-risk group, with 37 patients classified low-risk by both NPI and AoL.

Table 5 Comparison of Adjuvant! Online with the NPI in assessing risk of distant recurrence

Discussion

We found in an oncology clinical practice setting that IHC4+C provides additional stratification regarding the residual risk of distant recurrence in ER+ primary breast cancer patients to receive adjuvant endocrine therapy, supplementary to that provided by the AoL and NPI intermediate-risk groups. The IHC4+C score downgraded more than half of the patients in the AoL-defined intermediate-risk group to a low-risk group. Nearly two-thirds in the NPI-defined intermediate-risk group were reallocated into either a low- or high-risk group, with risk stratification most often lowered. The risk category changes we observed indicate that the use of IHC4+C in the clinical setting would often lead to a change in adjuvant treatment decisions with an overall increase in the number of patients who may safely be spared unnecessary adverse effects of chemotherapy. This extra information based on biological phenotype would be valuable for decision-making concerning adjuvant chemotherapy in a substantial proportion of patients.

The correlation between two different tools has been evaluated previously to compare prognostic and predictive efficacy, and to determine whether one tool adds independent information to the other. Our results are similar to those seen when other risk-projecting tools have been compared with one another. For instance, comparisons of Oncotype Dx with AoL have demonstrated that these prognostic tools each provide discrete prognostic information, with relatively poor congruency (Goldstein et al, 2008; Dowsett et al, 2010; Tang et al, 2011). In spite of this, experimentation with combining these two prognostic tools into one integrated tool has not yet been shown to be useful, although more information was gained in one study by the integration of common clinicopathologic factors with Oncotype Dx (Tang et al, 2011). The fact that the information contributed by clinicopathologic factors was complementary to that obtained by molecular analysis alone is relevant to the present study, as the IHC4+C also uses IHC to provide surrogate information about the individual molecular profile of the tumour, and combines it with common clinicopathologic factors. The prognostic information provided by IHC4+C on top of that from clinicopathologic features has previously been shown to be at least as great as the additional information provided by Oncotype Dx (Cuzick et al, 2011).

Our method of deriving predicted 10-year breast cancer-specific survival from AoL is different from that used by two other groups of authors (Dowsett et al, 2010; Tang et al, 2011). They used the AoL estimate of breast cancer-specific mortality without competing risks at 10 years follow-up (shown as ‘10-year risk’), which is not corrected for comorbidity. The mortality projections made by AoL are otherwise markedly affected by comorbidity (Ozanne et al, 2009). However, we accounted for the discrepancy by using ‘perfect health’ as the comorbidity status of all patients. Therefore, the only varying difference between the ‘10-year risk’ figure and our estimate of breast cancer-specific mortality at 10 years is a small adjustment that AoL makes according to each patient’s age. We observed that the 10-year breast cancer-specific survival estimates we calculated were very similar to the ‘10-year risk’. Our method of adjusting the AoL breast cancer-specific survival estimate into a distant recurrence risk estimate, using data from the ATAC study to calculate a simple multiplication factor, was a pragmatic approach to enabling direct comparison of the prognostic tools. We acknowledge that there may be minor variance between true AoL risk estimate and the AoL risk estimate we generated as a consequence of using this method, but would anticipate this to be relatively trivial.

A major advantage of IHC4+C is its cost-effectiveness, being considerably less expensive than gene expression profiling tools such as Oncotype Dx. Another major advantage of IHC4+C over Oncotype Dx is that it uses existing laboratory assays, and in principle it could therefore be performed at the majority of oncology clinical centres internationally. On the other hand, there are quality assurance issues with qualitative assessment of ER, PR, HER2 and Ki67% IHC, with the potential for interlaboratory variation in values. Although international groups are making efforts to standardise these assays (Wolff et al, 2007; Hammond et al, 2010), there remains justifiable concern. Ki67% in particular has caused apprehension due to variable methods of assessment and heterogeneity in results between different laboratories. However, following a focused working group meeting in which investigators with expertise in assessment of Ki67% made recommendations aimed at standardising Ki67 methodology and reducing interlaboratory variability, Ki67% assay results are likely to become better harmonised between laboratories (Dowsett et al, 2011a). The IHC4+C score has been validated using a separate data set with IHC assays that were evaluated in a different laboratory. However, for the IHC4+C score to be validated comprehensively to ensure its reliability and reproducibility, it would need to be tested more widely, with assays conducted in several different laboratories (Cuzick et al, 2011). Our study adds to the literature with further evidence of Ki67 being reliably assayed with the results being utilised successfully in clinical practice.

Our results show that the IHC4+C score does not appear to be additionally useful in patients classified by AoL or the NPI to be of low-risk, as there was high correlation between the two scores in these categories, and no patients were moved from AoL or NPI-defined low-risk to a high-risk group by IHC4+C. IHC4+C holds predictive information as well as prognostic, as by estimating residual risk after endocrine therapy it is partially predicting endocrine therapy benefit. However, even those patients who fail to gain substantial benefit from endocrine therapy are likely to remain at relatively low risk of recurrence, such that adjuvant chemotherapy risk would probably outweigh benefit. These results indicate that patients assessed as low-risk by AoL or NPI are unlikely to benefit from the supplementary prognostic information provided by IHC4+C and should be recommended adjuvant endocrine therapy alone.

Judging by our findings, it is possible that IHC4+C has clinical utility in the AoL high-risk group. IHC4+C downgraded just less than half of the patients from a high-risk stratification by AoL to intermediate-risk or even, in a few, to a low-risk stratification. IHC4 correlates reasonably closely with Oncotype Dx (72%) (Cuzick et al, 2011), and for this reason it is likely to also be predictive of the magnitude of benefit from chemotherapy in the same way that Oncotype Dx is demonstrated to be (Paik et al, 2006; Albain et al, 2010). The additional information regarding risk stratification provided by IHC4+C for patients at AoL-defined high-risk, but intermediate or low-risk by IHC4+C, may be beneficial by predicting less chemosensitivity and/or greater benefit from endocrine therapy, which could contribute to decisions on adjuvant therapy.

It appears that AoL may overestimate risk compared with the IHC4+C and NPI. A similar phenomenon was noticed in a comparison of AoL with Oncotype Dx (Dowsett et al, 2010). A reason why AoL may predict a higher level of risk compared with IHC4+C is that it is based on clinical data that was mainly collected in the era preceding breast screening, and may disproportionately convey risk statistics associated with breast cancers that are biologically more aggressive, whereas IHC4+C is based on outcomes from a low-risk breast cancer population.

There was a low frequency of patients stratified as high-risk by NPI compared with AoL and IHC4+C. This is likely to be because our study population was confined to patients with ≤3 involved lymph nodes and hormone receptor-positive disease (which is less often grade 3 than triple-negative breast cancer). In this context, tumour size must be large for the NPI risk category to be high. IHC4+C and AoL are influenced by a greater number of variables, resulting in a wider spread of risk group allocation in this low- and intermediate-risk hormone receptor-positive population.

In conclusion, we found that IHC4+C assisted with risk stratification in the challenging group of patients with early breast cancer classified as being of intermediate-risk by the AoL and NPI. More than half of the patients classified as being of intermediate-risk by the AoL and NPI were reclassified by IHC4+C. This study is important for providing evidence that IHC4+C may be useful in guiding decision-making regarding adjuvant chemotherapy in an oncology clinical practice setting.