Background
Axillary lymph node (LN) status is the most important prognostic variable in the management of patients with primary estrogen receptor positive (ER+) breast cancer, which accounts for the majority of diagnosed cases. Node positive breast cancer patients have been shown to have a worse prognosis than those with node negative disease. These observations have led, in part, to the development of the Tumour Node Metastasis (TNM) staging system, which combines tumour size, nodal involvement (including the absolute number of involved nodes), and the presence or absence of systemic metastases into incremental stages [1, 2]. Each stage of disease has specific survival characteristics and is thought to represent the natural progression of a tumour, from its origins in the breast to its metastasis through the lymphatic system to regional lymph nodes and ultimately through the circulatory system to distant sites. Clinicians use the TNM staging system to guide the management of breast cancer patients. Most breast cancer patients with involved axillary lymph nodes, in the absence of significant co-morbidities, are currently offered adjuvant systemic chemotherapy [3, 4].
However, the biological significance of nodal metastases is poorly understood. It is hypothesised that involvement of axillary lymph nodes is an indicator of tumour chronology, such that the longer a tumour has been growing in the breast, the more likely it is to metastasize to regional axillary nodes. Furthermore, it is thought that breast cancers first metastasize to these nodes and then secondarily to other sites [5, 6]. In support of this hypothesis, there is an established correlation between larger tumour size and lymph node involvement; indeed, more timely intervention and resection of smaller primary tumours is associated with a reduced incidence of spread to regional lymph nodes [7]. More importantly, the absence of lymph node involvement is significantly associated with a better prognosis.
An alternative hypothesis suggests that some metastatic tumours avoid the lymphatic system and instead spread primarily through the circulatory system [8, 9]. The evidence for this theory stems from the knowledge that 30 % of patients who are lymph node negative (LN-) at diagnosis will eventually succumb to metastatic breast disease, even after optimal treatment [10]. Conversely, there is a subset of patients who present with lymph node positive (LN+) disease and never develop distant recurrence, even in the absence of adjuvant treatment [9, 11]. It is likely that the biology of a primary tumour at diagnosis contributes to whether it remains at the primary site, spreads to regional lymph nodes, or metastasizes to distant sites via lymph node spread or through the vascular circulation. It is increasingly recognised that clinicopathological factors alone are limited in their ability to predict who will develop recurrent cancer or respond to treatment. To this end, a number of genomic signatures have been developed which have been shown to be both prognostic (predict risk of distant recurrence) and predictive (predict response to chemotherapy) [12, 13]. It is thought that these signatures detect biological differences in primary tumours indicative of whether a tumour is likely to metastasize.
Here, we explore the relationship of stage and tumour biology to outcome in ER+ breast cancer, in the context of prognostic gene signatures, namely Oncotype DX and Prosigna [14–17]. Specifically, we compare Oncotype DX, developed exclusively on and for ER+/LN- patients [17], and Prosigna, developed on all clinical subtypes of breast cancer including those with and without lymph node involvement [18], for their capacity to predict outcome in patients with ER+/LN- and ER+/LN+ tumours. Furthermore, we examine the biological pathways represented in patient tumours with and without LN involvement that have good survival versus those that have developed systemic metastases. Finally, using this knowledge, a novel prognostic gene signature, called ‘Ellen’, was developed in silico for both LN+ and LN- ER+ breast cancer.
Discussion
Lymph node status is the most important prognostic variable for determining outcome in patients with ER+ breast cancer. However, it is unknown whether lymph node involvement is simply an indication of tumour progression over time or whether a primary tumour’s ability to metastasize is pre-determined by tumour biology. Gene signatures are an attractive option to predict outcome and several have been validated for use on ER+ breast cancer patients. Oncotype DX is a prognostic (and predictive) gene signature developed and validated using ER+ LN- tumours exclusively, whereas the development of the Prosigna gene signature included LN+ tumour samples. We wanted to examine the performance of Oncotype DX and Prosigna on LN+ patients and hypothesized that if lymph node involvement is merely a function of tumour progression, then a signature developed using LN- patient samples (Oncotype DX) should similarly be able to predict outcome for LN+ patients.
The Oncotype DX signature was developed on a qRT-PCR platform using weighted averages of 16 genes (excluding housekeeping genes) known to be associated with outcome in ER+ LN- breast cancer [17]. This 21 gene signature (the 16 cancer-related genes plus 5 reference genes) has been validated and FDA approved for its ability to predict outcome in an independent cohort of ER+ LN- breast cancer patients [34, 35]. We simulated the Oncotype DX algorithm in silico using Affymetrix gene expression data and tested the prognostic ability of the simulated algorithm on ER+ tumours from LN+ and LN- patients. As expected, the simulated Oncotype DX algorithm was able to significantly predict outcome for ER+ LN- patients, confirming its prognostic capacity in this group of patients and supporting the validity of our in silico approach to assess Oncotype DX performance. Furthermore, the in silico approach we utilized has been used by others to compare gene expression data from different platforms, including qRT-PCR and expression microarrays, and to simulate gene signatures such as Oncotype DX and Prosigna [24, 29–31, 33–35].
In our in silico study, Oncotype DX was unable to significantly predict risk of recurrence for ER+ LN+ patients (Fig. 1 and Table 3), suggesting that a signature such as Oncotype DX, developed and validated on ER+ LN- patients, is not optimal for predicting outcome in ER+ LN+ patients. We cannot exclude the possibility that there is a subset of LN+ patients for whom Oncotype DX might be an appropriate prognostic assay, but further exploration in this area is needed. Accordingly, ongoing clinical trials, including SWOG S1007 (RxPONDER), aim to validate the prognostic utility of Oncotype DX for ER+ breast cancer patients with limited LN+ disease; the results from these studies are eagerly awaited [36, 37].
Prosigna was approved as a prognostic assay for distant metastasis-free survival for patients with ER+ disease with 0–3 positive lymph nodes. The 50 disease-associated genes comprising the Prosigna assay were derived from the intrinsic molecular subtype signatures discovered in 2000 [18, 38]; both LN- and LN+ breast cancer samples were used to develop and validate the Prosigna assay ([39]; TransATAC and ABCSG8 clinical trials). The simulated Prosigna signature described here was able to significantly predict outcome for ER+ LN- and LN+ patients separately. This suggests that including LN+ patient samples in signature development improves signature performance when applied to LN+ patient tumour samples.
The Ellen signature, which was developed using both LN- and LN+ patients, predicted outcome in the LN- and LN+ cohorts with greater statistical significance than either the Oncotype DX or Prosigna gene signatures. It is possible that the increased significance, concordance, and hazard ratios derived from the Ellen signature are related to it being both trained and validated using Affymetrix data, and we recognize that our results need to be validated using an independent cohort of patients. Alternatively, the increased significance of Ellen could reflect the importance to outcome in ER+ breast cancer of the biological processes represented by the signature genes. As detailed in Table 5, the Ellen, Oncotype DX, and Prosigna signatures each represent common biological processes including gene expression, proliferation, immune response, cell migration, cell cycle, and post-translational modification (PTM) and trafficking. However, genes related to angiogenesis and epigenetics are unique to Ellen. Both of these processes have been demonstrated to be important for outcome in ER+ breast cancer [6, 14, 33, 40–43]. Additional multivariable studies are being conducted, using an independent cohort of patients, to assess the relationship between these biological features and other clinical variables, including tumour size, grade, and histological subtype, to validate the prognostic potential of Ellen.
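For readers less familiar with the concordance statistic used to compare signatures, the sketch below computes Harrell's concordance index (C) from follow-up times, event indicators and risk scores. It is an illustrative pure-Python version on made-up data, not the code used to produce the results reported here; the function and variable names are assumptions made for this example.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: among comparable patient pairs, the fraction in which the
    patient with the earlier event also has the higher risk score."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an event strictly
            # before patient j's observed (event or censoring) time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up data: years to distant metastasis or censoring, event flag, risk score.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]  # 1 = distant metastasis observed, 0 = censored
risk_scores = [0.9, 0.3, 0.5, 0.1]
c_index = concordance_index(times, events, risk_scores)
```

A value of 0.5 corresponds to a score that ranks patients no better than chance, and 1.0 to a score that orders every comparable pair correctly. Here four of the five comparable pairs are ordered correctly, giving C = 0.8.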
Given that the three signatures examined performed with various levels of accuracy in LN+ and LN- patient populations, we were interested in exploring the biological processes that might be related to outcome in ER+ LN+ and LN- tumours separately, using GSEA. Patients with good outcome (irrespective of their original LN status) had tumours with expression profiles enriched for immune-related genes (Tables 6 and 7). This was particularly striking for LN+ tumours, where 6 of the 10 gene sets associated with good outcome were immune related. This enrichment of immune-related gene sets may be indicative of immune cell infiltration in some tumours and suggests that a subset of ER+ breast cancer patients have a robust anti-tumour immune response and that this in turn may be associated with improved survival [39, 44, 45].
We examined the ontology of genes comprising the Ellen signature to determine whether their functions overlap with those identified using the GSEA and found that 11 % of the Ellen genes are related to immune response. This further supports an important role for immune response in ER+ tumours and the utility of the signature. For example, we found that CXCL12 and JAK1 are both more highly expressed in low risk tumours. It has been reported that increased expression of CXCL12 is a strong positive prognostic factor that correlates with disease free and overall survival in both ER+ and ER- tumours [46, 47]. JAK1 is a protein tyrosine kinase involved in the response to interferons; recently the closely related family member JAK2 was found to be associated with improved outcome in breast cancer [48]. In addition, the expression of HLA-DPA1, which is normally expressed on antigen presenting cells, may indicate the presence of immune infiltrate [49]. Overall, the expression of these immune-related genes in low risk tumours suggests that immune response is an important factor in the progression of breast cancer.
Patients with poor outcome showed enrichment for different gene sets depending on whether their tumour was LN+ or LN- at diagnosis. For example, poor outcome LN- patient tumours were enriched for proliferation, growth factor signalling, and epigenetic modification gene sets, which are also represented by individual genes comprising the Ellen signature (Table 6). Proliferation in ER+ breast cancer is a poor prognostic factor and correlates with the Luminal B subtype [39]. Epigenetic modification is thought to have some role in tumour progression, as global hypermethylation of the tumour genome has been associated with poor outcome [50–52]. In addition, several studies report that HDAC inhibitors may be useful as adjuvant therapeutics in this high risk group [53, 54]. In contrast, patients with LN+ disease and poor outcome had tumours enriched for EMT and migration gene sets, suggesting a migratory phenotype [9, 55].
Taken together, the different biological processes highlighted for LN- and LN+ groups may explain why gene signatures developed for one group would not necessarily be predictive of outcome in the other.
Conclusion
In summary, by comparing Oncotype DX and Prosigna with a novel gene signature, we have shown that it is important to include patients with both LN+ and LN- status when developing prognostic gene signatures. Furthermore, we have identified candidate biological processes that indicate how tumour biology may be related to outcome. This is particularly evident for LN+ tumours with good outcome, where there is enrichment in immune response gene expression, and for LN- tumours with poor outcome, where there is an enrichment for genes involved in epigenetic modification. We developed and characterized Ellen, a gene signature that is designed to be predictive of outcome for all patients with ER+ breast cancer without distant spread, using an unbiased gene selection process. The genes represented in this signature are similar to those whose pathways were found to be enriched using GSEA, further suggesting that Ellen would be suitable for use in a variety of biologically unique ER+ breast tumours. Work is currently underway to validate the performance of Ellen using an alternate platform and with additional independent cohorts. Further, the clinical information available for the training and validation cohorts was limited, so it is difficult to know whether there are other confounding variables. Ultimately, this study shows that gene expression of primary tumours can be informative about metastatic potential and that it differs between LN- and LN+ patients. In addition, Ellen, once validated, would be able to provide prognostic information for patients with tumours accompanied by small lymph node metastases, such as isolated tumour cells or micrometastases, those with incomplete lymph node dissections (i.e. sentinel node only), or those who have no lymph node information.
Abbreviations
BC, breast cancer; C, concordance; CI, confidence interval; CoxPH, Cox proportional hazards; DMFS, distant metastasis free survival; EMT, epithelial mesenchymal transition; ER, estrogen receptor; ES, enrichment score; GEO, gene expression omnibus; GO, gene ontology; GSEA, gene set enrichment analysis; HR, hazard ratio; LN, lymph node; PAM, prediction analysis of microarrays; PTM, post translational modification; qRT-PCR, quantitative real time polymerase chain reaction; RMA, robust multichip algorithm; ROR, risk of recurrence; RS, recurrence score; TNM, tumour node metastasis