Background
Despite considerable progress over past decades, breast cancer remains the most frequent major cancer and the second most common cause of cancer-death in women, with approximately half of new cases being estrogen-receptor positive and lymph-node negative. Recurrence at a distant site is a key driver of mortality from breast cancer.
Hormone receptor status provides critically important classification of outcome and clinical benefit from adjuvant endocrine therapies. However, future progress against breast cancer will depend, in part, on expanding knowledge regarding novel constellations of genes involved in the risk of micrometastatic spread and subsequent progression. It is optimal that, where possible, the prognostic effects of these genes are discovered in a context not confounded by the continuously changing field of systemic therapy. The use of untreated patients for discovery and validation permits unequivocal identification of prognostic genes not confounded with response genes, thereby permitting pathway directed therapies to be considered and allowing identification of those patients who might avoid the morbidity of adjuvant systemic therapy without significant risk of metastasis. For this reason, we studied formalin-fixed, paraffin-embedded (FFPE) tissue to identify a prognostic gene-expression signature in women with operable, invasive breast cancer that was estrogen receptor positive and lymph node negative and who received no systemic therapy following surgical resection of the primary tumor.
Recently, several expression signatures have been described for predicting distant metastatic risk for breast cancer, and assays derived from these signatures have shown a potential to improve prognostic accuracy, treatment choice, and disease outcomes in women diagnosed with early-stage breast cancer [1-12]. These signatures vary in the number of genes used, the types of tissues required (fresh frozen vs. paraffin-embedded), the technologies employed, and the platforms used. For example, MammaPrint [1-3], a DNA microarray assay that uses frozen tissue, is based on a 70-gene prognostic signature. The 21-gene Oncotype DX test [4-6] (a signature of 16 cancer-related genes, including the ER, PR and HER2 genes, plus 5 normalization genes) and the 2-gene ratio (HOXB13/IL17BR) test [7,8] are RT-PCR assays that use fixed tissues. Using a single data set, Fan et al [13] compared the predictions derived from 5 different gene signatures, including the 3 noted above, and found substantial agreement in outcome classification among 4 of the 5 signatures, with the 2-gene ratio test being the exception.
Given the large number of genes purported to be prognostic for breast cancer, we selected a subset of these genes to analyze in patients with N-, ER+ tumors. Our study is notable for a) the absence of systemic treatment, b) the broad and representative breast cancer population tested, and c) the long periods of follow-up. By combining genes primarily from the 70-gene signature, the 21-gene panel and those reported by Dai et al [14] and West et al [15], we sought to find markers that could be further validated in untreated and treated study populations. In this report, we present a 14-gene signature that is associated with risk of distant metastasis. The signature was derived from untreated women, has been validated in an independent sample set of untreated women and was additionally evaluated in a sample set of tamoxifen-treated women. The gene-expression profile can be carried out with routinely fixed tissue on existing real-time PCR instruments for widespread testing and may permit more effective selection of conventional therapeutic anti-cancer agents, alone or in combination, for clinical trials.
Methods
Patients
Training set of untreated patients
The training set was derived from a cohort of 393 subjects accrued from 1975 to 1986 at the California Pacific Medical Center (CPMC) with a diagnosis of lymph node-negative T1 and T2 breast cancer. The primary study investigated the prognostic utility of tumor grade and Ki-67 labeling index in N- breast cancer. The patient population was largely untreated by systemic adjuvant therapy. All patients were followed for a minimum of 8 years or until death with a median follow-up time of 14.8 years. Tumors were classified according to WHO guidelines [16] and histological grade established using the modified Bloom and Richardson method [17]. The ER status of patient samples was determined by ESR1 ligand-binding methods. The use of patient material and data for this study has been approved by the institutional medical ethics committee.
We profiled 315 patients from whom sufficient amounts of amplifiable mRNA were extracted from formalin-fixed, paraffin-embedded tissues. Since ER status was missing for a large number of subjects (114 samples or 36%), we chose to reanalyze and standardize the ER results using mRNA measurements. Of the 315 patients, 106 subjects were excluded either because they received systemic therapy (N = 12) or because they were found to be ER-negative (N = 90) or both (N = 4), and 67 subjects met the inclusion criteria but were missing relapse clinical information, leaving 142 patients for further analysis.
Validation set of untreated patients
A retrospective search of the Breast Tissue and Data Bank at Guy's Hospital was made to identify an analogous cohort of patients diagnosed with primary breast cancer who had definitive local therapy (breast conservation therapy or mastectomy) but no additional adjuvant systemic treatment. The study group was restricted to women diagnosed between 1975 and 2001, with a clinical tumor size of 3 cm or less, pathologically uninvolved axillary lymph nodes, ER+ tumor and with more than 5 years follow-up or recurrence or death prior to 5 years. Tumors were classified according to WHO guidelines [16] and histological grade established using the modified Bloom and Richardson method [17]. ER status on this group of patients had been determined using the standard IHC assay but we chose to reanalyze and standardize the ER results using mRNA measurements to be consistent with the training set (see Additional file 1 for concordance of ER status by IHC and RT-PCR). The standard Dako HercepTest method was used for scoring and defining positive/negative status. HER2 IHC testing was carried out on a Biogenex i6000 autostainer using Dako A0485 HER2 antibody (diluted 1:1000) with detection by Envision/HRP kit (Dako K5007). Cases with a HER2 score of 2+ or greater were considered positive, and those with a score of 1+ or 0 negative.
A total of 415 patients were identified who also had sufficient FFPE tissue available for RNA extraction. From this group there was sufficient quantity and quality of mRNA to profile tumors from 303 patients. A further 24 cases were excluded from the study: 4 patients had bilateral breast cancer prior to distant metastasis, 6 had a missing gene expression value, 9 tumors proved to be ER-negative upon re-assessment using the mRNA expression assay, 3 were node positive and 2 were male patients. Thus, in total 279 patients were included in the analyses. The median follow-up time of the 279 patients was 15.6 years. The use of patient material and data for this study has been approved by Guy's Research Ethics Committee (04/Q0704/137).
Tamoxifen-treated patient study
We used a cohort of 45 N-, T1, ER+ patients from the University of Muenster, Germany, who underwent surgery between 1990 and 1999 and received tamoxifen therapy. The median follow-up time of the 45 patients was 5.8 years. The use of patient material and data for this study has been approved by the institutional medical ethics committee.
Endpoints
We chose time from surgery to distant metastasis, also referred to as distant metastasis-free survival (DMFS), as the primary endpoint. Subjects were considered to have an event at the time of diagnosis of distant metastases or were censored at the earliest occurring date of contra-lateral recurrence, death without recurrence or last follow-up. The definition of DMFS endpoint, its events and censoring rules were aligned with those adopted by the National Surgical Adjuvant Breast and Bowel Project (NSABP) for the prognostic molecular marker studies [4]. We also analyzed the endpoint of overall survival (OS), which was defined as time from surgery to death from any cause.
Sample processing
Five 10 μm sections of each paraffin block were used for RNA extraction. A macrodissection on the samples was performed to isolate RNA from the cancer cell areas which had been marked by a pathologist on a guide slide. Total RNA was extracted from the FFPE tissue sections using a modified commercially available isolation kit (Zymo Research, Orange, CA). Briefly, the FFPE section slides were deparaffinized in xylene, washed consecutively with 100%, 90%, and 70% ethanol, air dried at room temperature and the tissues transferred to a tube. Following digestion with proteinase K for 18 to 24 hours at 55°C, the samples were spun down and the supernatants transferred to new tubes. A mixture of 100% ethanol and extraction buffer was added to the supernatant and loaded onto Zymo-Spin II Columns. The columns were treated with a series of washes that include a DNase treatment step. Total RNA was eluted with TE buffer that was heated to 65°C.
Determination of amplifiable RNA
The quality of the RNA extracted from formalin fixed tissues varies and depends on a variety of factors, including the fixation process used, age of the samples, and storage conditions. Additionally, the use of formaldehyde can cause extensive cross linking of tissue components. In many cases, only a small fraction of the recovered RNA can be amplified by RT-PCR. To determine the amount of amplifiable RNA, we quantified the expression level of an endogenous gene, NUP214, in each sample by comparing it to a serially diluted Universal Human Reference RNA standard (Stratagene, La Jolla, CA). Approximately 0.5 ng of amplifiable RNA was used to profile each gene.
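The standard-curve step described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the dilution amounts, the Ct values, and the function names `fit_standard_curve` and `amplifiable_rna_ng` are all hypothetical, and the only thing taken from the text is the idea of reading a sample's NUP214 signal against a serially diluted reference RNA.

```python
import math


def fit_standard_curve(quantities_ng, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities_ng]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplifiable_rna_ng(ct_sample, slope, intercept):
    """Invert the fitted curve to estimate amplifiable RNA (ng) from a Ct."""
    return 10 ** ((ct_sample - intercept) / slope)


# Hypothetical tenfold dilution series of the reference RNA (ng)
# and hypothetical NUP214 Ct values measured for each dilution.
standard_ng = [100.0, 10.0, 1.0, 0.1]
standard_ct = [22.0, 25.3, 28.6, 31.9]

slope, intercept = fit_standard_curve(standard_ng, standard_ct)
sample_ng = amplifiable_rna_ng(29.5, slope, intercept)  # hypothetical sample Ct
print(f"slope={slope:.2f}, estimated amplifiable RNA={sample_ng:.2f} ng")
```

A later Ct gives a smaller estimate, which is why heavily cross-linked samples with little amplifiable template need more input material to reach the roughly 0.5 ng used per gene.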
Gene selection
We selected 197 candidate genes (see Additional file 2) from the published literature that include the prognosis genes reported by van't Veer et al [1] and Dai et al [14], the gene signature for response in tamoxifen treated women reported by Paik et al [8], and the ER status genes reported by West et al [15]. In addition, three endogenous "housekeeping" genes (NUP214, PPIG, and SLU7) were included and used to normalize expression levels of the other genes (see Additional file 3).
Message enrichment
The amount of PCR-amplifiable RNA from the training set was insufficient to profile all 197 genes. Consequently, RNAs from the training set were enriched by pre-amplification with the MessageAmpII aRNA amplification kit (Ambion, Austin, TX) whereas RNAs in the validation sets were used directly. To assess the effect of the enrichment, we profiled and compared the 14 genes using 50 paired enriched and unenriched samples from the training set. Good correlation (R² = 0.9931) was observed between the metastasis scores generated with the enriched vs. unenriched samples (see Additional file 4).
Gene expression profiling
A single-step RT-PCR with SYBR® Green was used for gene expression profiling essentially as previously described [18]. The assays were performed on the Prism 7900 Real-Time PCR system using the following thermocycling parameters: 50°C for 2 minutes; 95°C for 1 minute; 60°C for 30 minutes; 95°C for 15 seconds and 60°C for 30 seconds for 42 cycles. A Universal Human Reference RNA control was amplified with the appropriate candidate gene for each run. All assays were performed in duplicate. PCR primers were designed to amplify all known splice-variants, and the size of the PCR product was designed to be shorter than 150 bp to accommodate degraded RNA in archived FFPE samples.
The relative changes in gene expression were calculated by the ΔΔCt method [19]. The expression of each of the genes was first normalized to three endogenous control (HSK) genes then further normalized to a calibrator, a reference RNA pool (Universal Human Reference RNA, Stratagene, La Jolla, CA). The ΔΔCt values of 197 genes which gave acceptable expression levels were used for statistical analyses. In the validation set, we profiled only the 3 normalization genes, the 14 cancer-related genes that were selected in the training set for the prognostic signature, and the ESR1 gene for ER status determination.
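The two normalization steps can be sketched as below. The text does not say how the three housekeeping genes were combined, so averaging their Ct values is an assumption here, as are the function names and all Ct values. The fold-change conversion 2^(-ΔΔCt) is the usual companion of the ΔΔCt method and is shown only for orientation.

```python
from statistics import mean

# The three endogenous control genes named in the text.
HOUSEKEEPING = ("NUP214", "PPIG", "SLU7")


def delta_ct(ct_by_gene, gene):
    """Ct of the target gene minus the mean Ct of the housekeeping genes.

    Averaging the three controls is an assumption of this sketch.
    """
    return ct_by_gene[gene] - mean(ct_by_gene[h] for h in HOUSEKEEPING)


def delta_delta_ct(sample_ct, calibrator_ct, gene):
    """Sample delta-Ct minus the calibrator (reference RNA) delta-Ct."""
    return delta_ct(sample_ct, gene) - delta_ct(calibrator_ct, gene)


def fold_change(ddct):
    """Relative expression versus the calibrator, assuming ideal doubling per cycle."""
    return 2.0 ** (-ddct)


# Hypothetical Ct values for one tumor sample and the reference RNA calibrator.
sample = {"NUP214": 27.0, "PPIG": 26.0, "SLU7": 28.0, "CCNB1": 29.0}
calibrator = {"NUP214": 25.0, "PPIG": 24.0, "SLU7": 26.0, "CCNB1": 28.0}

ddct = delta_delta_ct(sample, calibrator, "CCNB1")
print(ddct, fold_change(ddct))
```

Under this convention a lower ΔΔCt means higher expression relative to the calibrator.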
Estrogen receptor status by expression analysis
After developing an ER mRNA cutoff for estrogen receptor status on separate samples (Iverson et al, J Mol Diagn, in press) and demonstrating a high concordance with IHC determination in the training and validation sample sets, we chose to use ER expression as the criterion for ER status for consistency between sample sets (see Additional file 1).
Ki-67 Labeling Index (LI) in training set
The MIB-1 monoclonal antibody to Ki-67 (AMAC, Inc, Westbrook, ME) was used at 1:200 dilution in PBS. Following standard preparation of slides, staining was visualized using biotinylated anti-mouse (Vector Laboratories) and Streptavidin-horseradish peroxidase (Zymed Laboratories); DAB was used as the chromogen, and hematoxylin as the counterstain. The invasive cancer on a slide was reviewed for immunoreactivity. The slide was first scanned using a 10× objective, and regions with high labeling were chosen for counting at high power (40×). The Ki-67 LI was calculated as the fraction of positively stained nuclei in at least 1000 invasive cancer cells from multiple high power fields. Tumors above the median labeling index were categorized as high Ki-67; those below the median labeling index, as low Ki-67.
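As a small worked illustration of the labeling index and the median split, consider the sketch below. The counts and tumor identifiers are hypothetical, and assigning a tumor that falls exactly on the median to the low group is an assumption, since the text only specifies "above" and "below".

```python
from statistics import median


def labeling_index(positive_nuclei, total_nuclei):
    """Fraction of positively stained nuclei among counted invasive cancer cells."""
    if total_nuclei < 1000:
        raise ValueError("count at least 1000 invasive cancer cells")
    return positive_nuclei / total_nuclei


def classify_by_median(index_by_tumor):
    """Label each tumor high or low Ki-67 relative to the cohort median.

    Ties with the median are assigned to "low" (an assumption of this sketch).
    """
    cutoff = median(index_by_tumor.values())
    return {
        tumor: ("high" if li > cutoff else "low")
        for tumor, li in index_by_tumor.items()
    }


# Hypothetical (positive, total) nuclei counts for four tumors.
counts = {"T1": (50, 1000), "T2": (300, 1200), "T3": (120, 1000), "T4": (400, 1000)}
indices = {tumor: labeling_index(p, n) for tumor, (p, n) in counts.items()}
groups = classify_by_median(indices)
print(groups)
```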
Statistical analyses
For gene selection in the training set, the expression levels of each gene were standardized to have mean zero and variance of one, and missing values were imputed using a nearest neighbor algorithm [20]. We used the semi-supervised principal component (SPC) method [21,22] available in the PAM software package [23], fitting a Cox regression model [24] for time to distant metastasis to each gene. The genes were ranked by their univariate Cox scores. The first principal component of the genes that reached a certain threshold of the univariate Cox score was computed and applied in a Cox model with the principal component as a single variable. Internal cross-validation was used to determine the gene-selection threshold that optimized the Cox score of the principal component of the selected genes' expression. With this procedure, 37 genes were selected from the training set. The number of genes included in the prognosticator was further reduced by a regression of the supervised principal component on the expression values of the 37 genes (see Additional file 5) while imposing a constraint on the size of the regression coefficients. This procedure, known as the Lasso [25], resulted in a linear combination of the expression values of 14 genes (the remaining gene coefficients were effectively shrunk to zero) that provided a good approximation of the supervised principal component. For simplification, since the regression coefficients of the 14 genes were of similar magnitude (see Additional file 6), a summary score was calculated as the sum of the 14 ΔΔCt measurements for each subject, and since lower values of the score were associated with higher probability of metastasis, the final metastasis score (MS) for each subject is defined as the negative of the summary score. The creation of the MS is described in Additional file 7.
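The final scoring rule is simple enough to state in a few lines. The sketch below follows the definition above (MS is the negative of the sum of the 14 ΔΔCt values) and the risk-group cut-point of -23.5 reported in the statistical methods. The gene labels and ΔΔCt values are placeholders, not the actual signature genes or data.

```python
N_SIGNATURE_GENES = 14
MS_CUTOFF = -23.5  # high risk if MS >= -23.5, low risk otherwise


def metastasis_score(ddct_by_gene):
    """Negative of the equally weighted sum of the 14 signature ddCt values."""
    if len(ddct_by_gene) != N_SIGNATURE_GENES:
        raise ValueError(f"expected {N_SIGNATURE_GENES} ddCt values")
    return -sum(ddct_by_gene.values())


def risk_group(ms, cutoff=MS_CUTOFF):
    """Dichotomize the continuous score at the training-set median."""
    return "high" if ms >= cutoff else "low"


# Placeholder gene labels and hypothetical ddCt values.
patient_a = {f"gene_{i:02d}": 1.5 for i in range(1, 15)}
patient_b = {f"gene_{i:02d}": 2.0 for i in range(1, 15)}

ms_a = metastasis_score(patient_a)
ms_b = metastasis_score(patient_b)
print(ms_a, risk_group(ms_a))
print(ms_b, risk_group(ms_b))
```

Because lower ΔΔCt corresponds to higher expression, negating the sum makes a larger MS correspond to higher expression of the signature genes and higher risk.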
Differences in patient characteristics were assessed with the Wilcoxon rank-sum or Kruskal-Wallis test for continuous or ordinal measures and with the chi-square test for discrete measures. Cox proportional hazards models were used to estimate hazard ratios, and Wald tests of the coefficients from these models were used to assess statistical significance of the variables. The MS was modeled both as a continuous variable and as discrete groups of high (≥ -23.5) and low (< -23.5) risk, the cut-point of which was determined as the median MS in the training set. Other covariates in the multivariable Cox models were age at surgery (years), tumor size (cm) and histologic grade of tumor. HER2 status had also been ascertained on the validation set using the FDA approved scoring method. These data were used as a covariate for an additional multivariable model in the validation set. Estimates of distant metastasis free and overall survival for the high and low MS groups were calculated with the method of Kaplan and Meier [26] and confidence intervals for point estimates of survival were calculated using the complementary log-log transformation [27]. The probability of distant metastasis in 5 years and 10 years for individual patients was calculated from the survivor function as estimated by an accelerated failure time model including the continuous MS as the independent variable and assuming the event times have a Weibull distribution [28]. Statistical analysis was performed with PAM [23], SAS software version 9.1 [29] and R software version 2.4.1 [30].
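To illustrate how an individual risk estimate follows from a Weibull accelerated failure time model, the sketch below uses the common log-linear parameterization log T = intercept + slope × MS + scale × W, with W following the standard extreme-value distribution, so that S(t) = exp(-exp((log t - intercept - slope × MS) / scale)). The coefficient values are hypothetical and are not the fitted estimates from this study.

```python
import math


def weibull_aft_survival(t_years, ms, intercept, slope, scale):
    """Survivor function S(t | MS) under a Weibull AFT model."""
    z = (math.log(t_years) - (intercept + slope * ms)) / scale
    return math.exp(-math.exp(z))


def metastasis_probability(t_years, ms, intercept, slope, scale):
    """Probability of distant metastasis within t_years, i.e. 1 - S(t)."""
    return 1.0 - weibull_aft_survival(t_years, ms, intercept, slope, scale)


# Hypothetical coefficients; a negative slope means a higher MS shortens
# the expected time to distant metastasis.
params = {"intercept": 2.0, "slope": -0.1, "scale": 1.0}

for ms in (-28.0, -21.0):
    p5 = metastasis_probability(5.0, ms, **params)
    p10 = metastasis_probability(10.0, ms, **params)
    print(f"MS={ms}: 5-year risk={p5:.3f}, 10-year risk={p10:.3f}")
```

Treating the MS as continuous in this way yields a patient-specific probability at any time horizon, in addition to the high and low grouping.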
Time dependent receiver operating characteristic (ROC) curves and area under the curve (AUC) to predict distant metastases within 5 years and 10 years and death within 10 years were estimated using the method described by Heagerty [31] with nearest neighbor estimation of the bivariate distribution of time and continuous MS [32]. Sensitivity and specificity were estimated for each time of interest using the cut-point defining high and low MS risk groups. Approximate 95% confidence intervals for the various diagnostic summary measures were calculated based on the standard errors estimated from 500 bootstrap samples [33]. Ten year risk estimates for relapse and mortality based on the Adjuvant! Online calculator [34] were obtained for the patients in the untreated validation set and subsequently used to plot the 10-year time dependent ROC curves for visual comparison with the ROC curves based upon the MS.
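The bootstrap interval described above can be sketched in simplified form. This example deliberately ignores censoring and treats the outcome as a plain binary event, so it is not the time-dependent estimator of Heagerty used in the study; it only shows how a standard error from 500 resamples yields an approximate 95% interval (estimate ± 1.96 × SE). The scores, outcomes, and function names are hypothetical.

```python
import random
from statistics import stdev

MS_CUTOFF = -23.5  # high risk if MS >= cutoff


def sensitivity_specificity(scores, events, cutoff=MS_CUTOFF):
    """Sensitivity and specificity of the high-risk call for a binary outcome."""
    tp = fn = tn = fp = 0
    for score, event in zip(scores, events):
        high = score >= cutoff
        if event:
            tp += high
            fn += not high
        else:
            fp += high
            tn += not high
    return tp / (tp + fn), tn / (tn + fp)


def bootstrap_intervals(scores, events, n_boot=500, seed=0):
    """Return (estimate, lower, upper) for sensitivity and for specificity."""
    rng = random.Random(seed)
    n = len(scores)
    estimates = sensitivity_specificity(scores, events)
    replicates = []
    while len(replicates) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        resampled_events = [events[i] for i in idx]
        # Skip resamples lacking one outcome class (a choice made for this sketch).
        if all(resampled_events) or not any(resampled_events):
            continue
        resampled_scores = [scores[i] for i in idx]
        replicates.append(sensitivity_specificity(resampled_scores, resampled_events))
    intervals = []
    for k, estimate in enumerate(estimates):
        se = stdev([rep[k] for rep in replicates])
        lower = max(0.0, estimate - 1.96 * se)
        upper = min(1.0, estimate + 1.96 * se)
        intervals.append((estimate, lower, upper))
    return intervals


# Hypothetical metastasis scores and binary outcomes (1 = distant metastasis).
scores = [-20, -21, -22, -25, -26, -19, -27, -30, -24, -23]
events = [1, 1, 0, 0, 0, 1, 0, 0, 1, 0]

sens_ci, spec_ci = bootstrap_intervals(scores, events)
print("sensitivity (estimate, lower, upper):", sens_ci)
print("specificity (estimate, lower, upper):", spec_ci)
```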
Discussion
Even though several breast cancer prognostic signatures have been published, the study described here is notable for several reasons. The use of untreated patients for the training and test sample sets permits unequivocal identification of prognostic genes that are not confounded with response genes, thereby providing insight into pathway directed therapies and opportunities for basic research. The prognostic signature does not contain ER, ER-responsive genes or HER2 and therefore addresses the concern that expression signatures should provide information independent of these valuable and routinely tested IHC markers. In addition, we have shown that the signature provides information beyond that of the commonly used Ki-67 proliferation marker. This signature is expected to be generalizable given the consistent results observed in the geographically diverse sample sets. Our results further suggest that the prognostic score from untreated patients retains its prognostic value in tamoxifen-treated patients. The relatively small number of genes in the described signature will facilitate follow-up functional studies in support of their mechanistic role in distant metastasis. Finally, the relatively small number of genes in this prognostic signature, which does not depend on a complex algorithm, coupled with the widespread use of fixed tissue and familiarity of RT-PCR, should facilitate the broader transfer of these types of analyses to multiple testing laboratories as well as facilitate submission of in vitro diagnostic products to regulatory agencies.
We selected genes from 3 previously reported prognostic gene signatures plus ER-related genes and analyzed the expression of 197 genes in a training set of non-systemically treated, N-, T1/T2 (≤ 3 cm), ER+, breast cancer patients. A subset of 14 genes, found to be prognostic for breast cancer, was used to generate a metastasis score (MS) to quantify risk for individuals at different timeframes as well as to dichotomize samples into high and low risk groups. Following initial selection and analysis within the training set, we validated the expression signature on an independent sample set using exactly the dichotomizing cutoff defined in the training set. Performance characteristics of the signature in the training and validation sets were similar. Univariate and multivariate hazard ratios to predict DMFS were 4.34 and 3.16 in the training set and 4.71 and 4.02 in the validation set, respectively. In multivariate analysis, only the metastasis score remained significant. The 14-gene prognostic signature also predicts overall survival, with univariate and multivariate hazard ratios of 2.48 and 2.00 in the training set and 2.26 and 1.97 in the validation set, respectively. When the predictive accuracy was compared with that of the commonly used Adjuvant! Online calculator, the areas under the ROC curves were slightly higher for the 14-gene signature classification than for the Adjuvant! classification, indicating that the MS may provide additional diagnostic value.
We were curious whether the signature developed in patients without systemic therapy would be predictive in tamoxifen-treated patients. In a study of a small number of tamoxifen-treated women, the signature predicted two risk groups using the same single cutpoint as for untreated patients, but the results only trended toward significance, likely because of the small sample size. Since tamoxifen treatment only reduces distant recurrence by approximately 30%, larger data sets will be required to discern the prognostic nature of the signature in women who do and do not respond to tamoxifen.
Several investigators [35,36] have queried whether molecular expression scores provide information distinct from that of the single or composite pathological prognostic tests already routinely performed. As an example, proliferation status determined by Ki-67 LI has been reported in numerous individual studies as well as a meta-analysis to be a prognostic factor for recurrence-free and disease-specific survival [37-44]. We tested Ki-67 LI because of the strength of reports in the literature and its availability in the training set. Ki-67 labeling index was predictive for recurrent disease; however, after adjustment for the metastatic expression signature this often used marker lost significance. As with two previous reports (Potemski et al [42] and Tan et al [43]), we did not find a strong correlation between the Ki-67 LI full range of staining and the mRNA levels of this gene.
The 14 upregulated genes represent a unique signature and do not fully overlap with any of the original 3 signatures from which the genes were selected. Three proliferation genes (BUB1, CCNB1 and MYBL2) highlighted in Whitfield et al [45] appear in the 14 gene signature described here but only MYBL2 overlaps with the p53 status signature recently reported by Miller et al [46]. Even though the TP53 genes have not appeared in lists of proliferation genes, network analysis of the genes of the proliferation signature described here is suggestive of network involvement (see Additional file 11). The signature lacks the ER and PgR genes. The absence of these hormonal receptors is not unexpected given that these genes have been reported to be weakly prognostic in untreated patients. The majority of the genes in the signature are involved in processes associated with tumor growth such as DNA replication (BUB1, CCNB1, CENPA, ORC6L, RFC4, TK1), cell cycle control (BUB1, CCNB1, MYBL2, ORC6L, PKMYT1, RACGAP1), cellular assembly and organization (BUB1, CCNB1, CENPA, DIAPH3), and ubiquitination (UBE2S). Many of the genes in the signature have been implicated in cancers. The known and inferred role of these genes in cell proliferation is consistent with their contribution to the disease process. While the 14-gene tumor expression profile reported here has practical importance in classifying distant metastasis as an outcome in patients with operable, invasive breast cancer, the identification of prognostically relevant gene pathways has ramifications for targeted therapy in the future, with applications to conventional cytotoxic drugs and novel experimental therapies [47-49].
The sample population and the experimental approaches we employed vary in some aspects from previously reported studies. First, the signature was developed and validated on FFPE samples from non-systemically treated breast cancer patients to capture solely prognostic information without confounding by genes that may play a role in recurrence and/or response to treatment. In contrast, Oncotype DX [4] was trained in tamoxifen-treated patient samples, which may have contributed to the identification of ER and PgR as important markers. As discussed by Hayes [50], ER and ER-related genes are known to be positive predictors of endocrine therapy but only weakly prognostic. Second, our study population has a broad distribution of age covering both pre- and postmenopausal women that is representative of a typical breast cancer patient population. In comparison, the MammaPrint signature [1,2] was developed using samples from primarily younger women and the Oncotype DX signature [4] was developed using clinical trial samples. Third, the number and equal weighting of each of the genes of the signature permits more focused follow-up mechanistic studies. Fourth, the long duration of follow-up in the validation set allows quantification of risk over different time frames as well as categorizing risk into different groups. This is important as individuals differ substantially in their risk tolerance and time horizon concern. Fifth, the signature was developed on FFPE samples and expression analysis was performed using RT-PCR. This sample type enables analysis of archived sections that have extended outcome data as well as present day specimens that are routinely processed in a similar manner. Gene signatures developed on frozen tissues (for example, MammaPrint and wound response signatures [12]) would require a change in present sample collection and storage. Finally, clinical data reported by Esteva [51] suggest that a multigene expression profile assay, trained on tamoxifen treated samples, may not necessarily classify the risk of recurrent disease in patients with N(-) breast cancer who do not receive adjuvant tamoxifen or chemotherapy. The 14-gene prognostic signature reported here was developed on untreated patient samples, and as suggested by one of the referees, one potential implication of the current study is that the 14-gene expression signature may identify a low-risk patient group with hormone receptor-positive breast cancer, whose predicted absolute survival benefit from systemic adjuvant therapy is so low that a woman, armed with this prognostic information, may favor the avoidance of the occasionally troublesome side effects of endocrine therapy.
The reported study has limitations. In order to identify a cohort of non-systemically treated patients, it was necessary to assemble samples from patients before tamoxifen became a routine treatment option. As a result, the samples in this cohort may not represent ER+ breast cancer patients today. In this study, we used a retrospective population-based cohort study design. While a cohort study is expected to have fewer hidden confounders and biases than a case-control study, we cannot exclude the presence of masked bias. Further, population-based cohorts have less uniformity than patients from the controlled setting of clinical trials. On the other hand, such studies are likely to be more representative of a community setting in which the molecular prognostic assay would be applied [52].
Competing interests
JJS, AW, CR, SK, KL and SB are employees of Celera Corp; SA and LK-M are employees of LabCorp.
Authors' contributions
AT, FW, JS, BB, JG, SA and LE conceived the study. AW, KL, SB, CR and JS participated in the study design and data analysis. CG, RS, KC, PC, HB and JB provided clinical samples and background. KL, TH, KR and CR contributed to the statistical analysis. AW and SK supervised the sample preparation and RNA enrichment; HS, BM, and LK-M performed the RNA extraction. HD provided cancer gene list prior to publication. All authors have read and approved the final manuscript.