Introduction
Cancer mortality can be attributed mostly to metastatic disease, with an estimated 90% of deaths associated with solid tumors resulting from the pathophysiological impact of secondary disease. Despite many advances in both basic science and applied clinical research over recent years, advanced disseminated disease remains an incurable condition. Further investigations into the myriad of factors associated with metastatic disease are therefore warranted to identify critical molecular nodes and targets in this complex process that will enable development and deployment of new or improved clinical tools for combating the effects of advanced disseminated disease.
One of these factors for breast cancer is inherited metastatic susceptibility. Recently, using a mouse model system, it was demonstrated that germline polymorphisms have significant effects on the ability of a transgene-induced mouse mammary tumor to metastasize [1-3]. Subsequently, using small pilot clinical cohorts, significant associations with markers of poor outcome were observed, consistent with the presence of metastasis susceptibility in the human populations [4, 5]. Descriptive epidemiology studies further support this hypothesis, demonstrating familial clustering of outcome in a variety of different cancer types [6-11].
The current study builds on and extends the previous studies of the first two identified metastasis efficiency modifier genes, SIPA1 [3] and RRP1B [4]. Using a much larger cohort, significant associations between polymorphisms in these genes and advanced disease were identified, replicating earlier studies. Unexpectedly, however, these associations were restricted to subgroups of patients after stratification by estrogen receptor (ER) and lymph node (LN) status. The results suggest that, at least for inherited metastatic susceptibility in breast cancer, these subpopulations could be biologically distinct, with different pathways leading to metastatic disease.
Materials and methods
Patient population
The protocol to study biological markers associated with disease outcome was approved by the medical ethics committee of the Erasmus Medical Center Rotterdam, The Netherlands (MEC 02.953). This retrospective study used coded primary tumor tissue, in accordance with the Code of Conduct of the Federation of Medical Scientific Societies in the Netherlands [12] and, as much as possible, was reported in line with the REMARK guidelines [13]. The single nucleotide polymorphisms (SNPs) were determined in 1863 tumor tissues. ER levels were missing for nine patients and progesterone receptor (PR) levels for 104 patients. Data for one of the two SNPs were not available for 25 tumors. The final study includes breast tumor tissue specimens of 1725 female Dutch patients with primary operable breast cancer (990 patients underwent a mastectomy, 735 patients underwent breast-conserving lumpectomy) who entered the clinic in Rotterdam between 1979 and 2002 and for whom ER and PR levels as well as both SNPs, rs2448490 and rs9306160, were known. Radiotherapy was given to 1162 patients as part of primary treatment. LN- patients did not receive adjuvant therapy as part of the primary treatment. Of the LN+ patients, 24% (187 of 766) received systemic adjuvant therapy. The median follow-up of patients still alive was 90 months (range, 4 to 231 months). The clinical questions addressed in the present study include the associations of the various SNP frequencies with patient and tumor characteristics, and with prognosis in primary breast cancer.
Tumor ER and PR levels were determined in cytosolic extracts by routine ligand binding assay or by enzyme immunoassay [14]. The cut point to classify primary breast tumors as ER and/or PR positive was 10 fmol/mg cytosolic protein. None of the patients had received neo-adjuvant therapy. Details on patient and tumor characteristics are presented in Table 1.
Table 1
Genotype distributions by patient and tumor characteristics
Total | 1863 | 748 | (40) | 807 | (43) | 293 | (16) | 616 | (33) | 904 | (49) | 333 | (18) |
Age (years) | | | | | | | | | | | | | |
≤40 | 239 | 98 | (41) | 98 | (41) | 43 | (18) | 82 | (34) | 123 | (51) | 34 | (14) |
41-55 | 736 | 301 | (41) | 323 | (44) | 103 | (14) | 235 | (32) | 359 | (49) | 139 | (19) |
56-70 | 597 | 228 | (39) | 265 | (45) | 98 | (17) | 193 | (33) | 285 | (48) | 114 | (19) |
>70 | 291 | 121 | (42) | 121 | (42) | 49 | (17) | 106 | (37) | 137 | (47) | 46 | (16) |
| | P = 0.661 | | | | | | P = 0.481 | | | | | |
Menopausal status | | | | | | | | | | | | | |
Premenopausal | 823 | 334 | (41) | 364 | (45) | 117 | (14) | 268 | (33) | 419 | (51) | 135 | (16) |
Postmenopausal | 1040 | 414 | (40) | 443 | (43) | 176 | (17) | 348 | (34) | 485 | (47) | 198 | (19) |
| | P = 0.288 | | | | | | P = 0.165 | | | | | |
Lymph nodes involved | | | | | | | | | | | | | |
0 | 1095 | 441 | (41) | 459 | (42) | 183 | (17) | 358 | (33) | 535 | (49) | 192 | (18) |
1-3 | 350 | 134 | (38) | 165 | (47) | 50 | (14) | 107 | (31) | 180 | (51) | 63 | (18) |
>3 | 418 | 173 | (42) | 183 | (44) | 60 | (14) | 151 | (36) | 189 | (45) | 78 | (19) |
| | P = 0.438 | | | | | | P = 0.459 | | | | | |
Tumor size | | | | | | | | | | | | | |
pT1 | 686 | 280 | (41) | 289 | (43) | 109 | (16) | 209 | (31) | 340 | (50) | 131 | (19) |
pT2 | 977 | 390 | (40) | 435 | (45) | 148 | (15) | 333 | (34) | 461 | (47) | 180 | (18) |
pT3/4 | 200 | 78 | (40) | 83 | (42) | 36 | (18) | 74 | (37) | 103 | (52) | 22 | (11) |
| | P = 0.789 | | | | | | P = 0.049 | | | | | |
Grade | | | | | | | | | | | | | |
Poor | 1007 | 403 | (40) | 437 | (44) | 162 | (16) | 326 | (33) | 508 | (51) | 168 | (17) |
Good/moderate | 282 | 113 | (41) | 120 | (43) | 44 | (16) | 101 | (36) | 122 | (44) | 56 | (20) |
Unknown | 574 | 232 | (41) | 250 | (44) | 87 | (15) | 189 | (33) | 274 | (48) | 109 | (19) |
| | P = 0.994 | | | | | | P = 0.280 | | | | | |
ER status | | | | | | | | | | | | | |
Positive | 1367 | 529 | (39) | 606 | (45) | 219 | (16) | 445 | (33) | 666 | (49) | 249 | (18) |
Negative | 487 | 217 | (45) | 194 | (40) | 74 | (15) | 169 | (35) | 233 | (48) | 82 | (17) |
| | P = 0.087 | | | | | | P = 0.625 | | | | | |
PR status | | | | | | | | | | | | | |
Positive | 1149 | 442 | (39) | 521 | (46) | 174 | (15) | 369 | (32) | 563 | (49) | 212 | (19) |
Negative | 601 | 267 | (45) | 232 | (39) | 99 | (17) | 207 | (35) | 290 | (49) | 99 | (17) |
| | P = 0.018 | | | | | | P = 0.459 | | | | | |
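The P values in Table 1 come from Pearson's chi-squared test on genotype counts. As an illustration, the sketch below applies the test to the PR status rows, using the first block of three genotype counts. The column headers did not survive in the table as shown, so the sketch does not say which SNP or genotypes these counts belong to. It is a plain Python re-computation, not the authors' STATA code. The closed-form P value used here holds only for 2 degrees of freedom, which is the case for a 2 x 3 table.

```python
import math

def pearson_chi2(table):
    """Pearson chi-squared statistic and degrees of freedom for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

def chi2_sf_2dof(stat):
    """Upper-tail probability of a chi-squared variable with exactly 2 dof."""
    return math.exp(-stat / 2)

# PR-positive and PR-negative rows, first genotype block of Table 1.
pr_table = [
    [442, 521, 174],
    [267, 232, 99],
]

stat, dof = pearson_chi2(pr_table)
p_value = chi2_sf_2dof(stat)
print(dof, round(stat, 2), round(p_value, 3))
```

The rounded P value agrees with the P = 0.018 reported for PR status in the table.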
DNA isolation and whole genome amplification
Genomic DNA was isolated from two to ten 30 μm cryostat sections (5 to 20 mg) with the NucleoSpin® Tissue kit (Macherey-Nagel; Bioké, Leiden, The Netherlands) according to the protocol provided by the manufacturer. The quantity and quality of the isolated DNA were established by ultraviolet spectroscopy, by examination of the product size after agarose gel electrophoresis, and by the ability of the sample to be linearly amplified by real-time PCR in a serial dilution with sets of primers located in an intron of the hydroxymethylbilane synthase gene on chromosome 11 and the thymidine kinase gene on chromosome 17. Samples that did not show a DNA band of at least 20 kb, or that could not be amplified from 5 to 25 ng DNA by both real-time PCR assays, were excluded. Prior to SNP genotyping, 10 ng aliquots of genomic DNA were amplified with the GenomiPhi V2 DNA amplification kit (GE Healthcare, Piscataway, NJ, USA) according to the protocol provided by the manufacturer, typically yielding 4 μg amplifiable genomic DNA with the 20 kb band still visible on gel.
SNP selection and genotyping
SIPA1 and RRP1B polymorphisms were characterized using allele-specific PCR. PCR primers were designed using Vector NTI 9.0 software (Invitrogen, Carlsbad, CA, USA) according to parameters described elsewhere [15] or purchased from Applied Biosystems (Foster City, CA, USA). Each probe was labeled with a reporter dye (either VIC® (a proprietary fluorescent dye produced by Applied Biosystems) or FAM (5-(&6)-carboxyfluorescein)) specific for wildtype and variant alleles of each SNP.
The RRP1B SNP rs9306160 was previously described [4]. Briefly, it encodes a Pro436Leu missense mutation in the RRP1B protein, and tags an approximately 200 kb haplotype block encompassing both RRP1B and the adjacent HSF2BP gene. The primers and probes for RRP1B were as follows, 5'-3': forward, TGGACGTGGCCTCTGCAC; reverse, CACCACCTGCAGCCTGAAA; VIC-labeled, AGGGCTTTCGGCCCAG; FAM-labeled, AGGGCTTTCAGCCCAGAG. The SIPA1 SNP rs2448490 was genotyped using the ABI assay C__15797548_10.
Reaction mixtures consisted of 300 nM of each oligonucleotide primer, 100 nM fluorogenic probes, 8 ng template DNA, and 2× TaqMan Universal PCR Master Mix (Applied Biosystems, Foster City, CA, USA) in a total volume of 10 μl. The amplification reactions were performed in an MJ Research DNA Engine thermocycler (Bio-Rad, Hercules, CA, USA) with two initial hold steps (50°C for 2 minutes, followed by 95°C for 10 minutes) and 40 cycles of a two-step PCR (92°C for 15 seconds, 60°C for 1 minute). The fluorescence intensity of each sample was measured post-PCR in an ABI Prism 7900 HT sequence detection system (Applied Biosystems, Foster City, CA, USA), and genotypes were determined by the fluorescence ratio of the nucleotide-specific fluorogenic probes. The genotyping success rate for rs2448490 was 99.2% (2491 of 2511 samples, controls and duplicates). The success rate for rs9306160 was 99.3%. The concordance rate for rs2448490 was 98.3% (404 of 411 duplicates) and 98.1% (404 of 411) for rs9306160.
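The quality-control rates quoted for rs2448490 are simple proportions. As a minimal sketch, the helper below (a hypothetical name, not part of any genotyping software) converts the reported counts into percentages rounded to one decimal place.

```python
def percent(successes, total):
    """Return successes/total as a percentage rounded to one decimal place."""
    return round(100 * successes / total, 1)

# Counts reported in the text for rs2448490.
call_rate = percent(2491, 2511)     # genotyping success rate
concordance = percent(404, 411)     # agreement between duplicate samples

print(call_rate, concordance)
```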
Statistical analysis
Pearson's chi-squared statistic was used to study the relation of the variant SNP alleles with patient and tumor characteristics. The hazard ratios (HRs) for SNPs and traditional prognostic factors were determined with Cox proportional hazards models for both univariate (disease-free survival (DFS), metastasis-free survival (MFS), and overall survival (OS)) and multivariate regression analyses (with backward elimination) in 1725 patients. The assumption of proportional hazards was checked using Schoenfeld residuals. We stratified for ER because the assumption of proportionality was violated for ER. MFS was considered the major endpoint for the prognostic study. The endpoint for DFS was defined as any recurrence of the disease (958 events) including secondary breast cancer in the contralateral breast. Metastasis was defined as any distant recurrence (772 events) not including secondary breast cancer or local or regional recurrences. For OS, death from any cause was considered an event (n = 684). The HRs are represented with their 95% confidence intervals (CI).
Survival curves were generated using the Kaplan-Meier method, and differences between the survival curves were tested with a log-rank test or, when appropriate, the log-rank test for trend. Computations were performed with the STATA statistical package, release 10.0 (STATA Corp, College Station, TX, USA). All P values were two-sided, and P < 0.05 was considered statistically significant.
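To make the survival methods concrete, the following sketch implements the Kaplan-Meier estimator and a two-group log-rank test in plain Python. The analyses in this study were run in STATA; this is not that code. The follow-up times and event flags below are invented toy values, not patient data. The sketch covers only the two-group log-rank test, not the test for trend across three genotype groups.

```python
import math

def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e)
        leaving = sum(1 for tt, _ in data if tt == t)  # events plus censored
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
        i += leaving
    return curve

def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test: returns (chi-squared statistic, P value)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)   # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)   # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    stat = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Toy data (months of follow-up; 1 = metastasis observed, 0 = censored).
times_a, events_a = [5, 8, 12, 20, 24], [1, 1, 0, 1, 0]
times_b, events_b = [10, 15, 22, 30, 36], [0, 1, 0, 1, 0]

curve_a = kaplan_meier(times_a, events_a)
stat, p_value = logrank(times_a, events_a, times_b, events_b)
print(curve_a)
print(stat, p_value)
```

For group A the estimate steps down to 0.8, 0.6 and 0.3 at months 5, 8 and 20. Censored patients leave the risk set without lowering the curve.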
Discussion
Significant advances have been made in the understanding of breast cancer in the past decade. It is now understood that there are at least four molecular subtypes: luminal A, luminal B, basal and human epidermal growth factor receptor 2 (HER2)-positive tumors [18]. Furthermore, a variety of studies have demonstrated that gene expression profiles can discriminate between patients of differing outcome. As a result, a number of different commercial assays are currently available [19] to aid patients and clinicians in their decisions for therapeutic intervention, two of which are currently in prospective clinical trials [20, 21]. Despite the importance of these findings, the origins of the gene expression signatures are unclear. Based on the prevailing model, it was presumed that gene expression signatures would result from an accumulation of somatic mutations during the evolution of the tumor. However, the ability to discriminate patient outcome based on bulk tumor expression data was considered inconsistent with that hypothesis, because under the progression model only a small fraction of the tumor would be predicted to express the appropriate signature [22]. These observations have led to renewed discussion of the molecular mechanisms of breast cancer metastasis [23, 24].
Studies in our laboratory have suggested that one of the previously unknown factors contributing to breast cancer metastasis is genetic background. Using an animal model system, we demonstrated that the genetic background had a significant impact on the ability of a tumor to form pulmonary metastases [2]. Subsequently, systems genetics approaches have identified a number of polymorphic metastasis efficiency genes [3, 4, 25, 26]. These results therefore suggest that the prognostic gene expression signatures currently in clinical trials may be in part due to inherited polymorphism rather than somatic mutation, and may be a surrogate for inherited metastasis susceptibility segregating in the human population. This interpretation is strengthened by the recent demonstration that prognostic gene expression signatures are already present in the normal tissues of animals of high- or low-metastatic genotypes [27]. Taken together, these data support the hypothesis that genotype-based assays may be a valuable complement or supplement to clinical and gene expression-based prognostic tools.
This study therefore builds on the preliminary epidemiology studies of two of our previously described metastasis efficiency genes, SIPA1 [5] and RRP1B, which encodes a chromatin-associated protein of unknown function [4, 28]. Initial investigations of SIPA1 did not reveal associations with MFS [5, 16]. However, recent analysis of the HapMap database [17] indicated that the SNPs investigated in these previous studies did not completely haplotype-tag this locus. Therefore an additional SNP was investigated in this study to improve coverage. In contrast, evidence for an association between a polymorphism in RRP1B and MFS had been previously observed in two small pilot cohorts [4]. This study therefore sought to replicate these results in a larger cohort, as well as to investigate whether there was a genetic, in addition to the physical, interaction between RRP1B and SIPA1 [4].
The results of these studies suggest a number of important points. First, as predicted by the mouse genetic and pilot epidemiology studies, genetic background is likely to be an important factor for human breast cancer progression because significant associations were observed for both genes. Although the results are consistent with these associations resulting from inherited predisposition, an important caveat of this study is that it is also formally possible that these results stem from copy number variation in the tumor DNA used, which is the only material available from this unique cohort. We believe, however, that this is unlikely for the following reasons. First, the results are consistent with the previous studies, which were performed in constitutional DNA from normal lymphocytes. For RRP1B at least, this is unlikely to be a false-positive result because the same association has now been observed in three independent patient populations. Second, the allele frequencies of the SNPs do not vary between tumor types and subgroups, as might be expected if there was a preferential copy number change in a subset of tumors. Thus, although at this time we cannot formally rule out a contribution of somatic evolution, we favor the hypothesis that these effects are due to inherited factors. Future replication in an independent cohort based on constitutional DNA will be needed to resolve this possibility.
The second major point, as suggested by the physical interaction of the gene products, is that the combination of the SIPA1 and RRP1B SNPs is an independent predictor of MFS when compared with standard clinical parameters, capable of discriminating high-risk, intermediate-risk and low-risk individuals. This combination SNP assay may therefore provide a valuable addition to current methods. It would have a number of advantages over current gene expression-based assays. Because it is based on constitutional DNA, it can be performed on routinely collected peripheral blood rather than tumor tissue, which requires more invasive procedures. In addition, because DNA is more stable than RNA, there are fewer constraints on collection, handling and processing procedures. Furthermore, genotyping methods are relatively inexpensive, robust and rapid, and thus would likely be significantly less expensive than expression array-based methods.
In addition to the potential clinical benefit, this study has important implications for our understanding of the mechanisms of metastatic progression. The fact that these polymorphisms are predictive of MFS in LN-, ER+ patients, but not in other subgroups, suggests that, at least for inherited metastatic susceptibility, there must be at least two pathways for metastatic progression. The lack of association in the LN+ samples indicates that these individuals are not simply diagnosed at a later time along a linear progression pathway. Instead, it suggests that those tumors that spread through the vasculature and those that seed the lymphatics likely use distinct molecular pathways during dissemination. This interpretation is consistent with previous observations in the literature. Analysis of breast cancer subtypes as defined by gene expression profiles [18] demonstrated preferential sites of relapse [29], suggesting different mechanisms of colonization. In addition, women with triple-negative breast cancers (ER-, PR-, HER2-) are less likely to experience a local recurrence before developing a distant recurrence [30]. Similarly, BRCA1 carriers have been shown to be less likely to have positive axillary LNs at diagnosis than patients with non-hereditary breast cancers [31]. To our knowledge, however, this is the first example of the ability of common allelic variants to discriminate patient outcome in specific clinical tumor types.
Although these results are consistent with constitutional predisposition to metastatic disease and suggest that the inherited susceptibilities for tumors that disseminate to the LNs differ from those for tumors that metastasize directly to distant organs, the mechanisms used are currently unknown. It is possible that the allelic variants of these two genes might significantly alter the likelihood of tumors activating different pathways; for example, angiogenesis versus lymphangiogenesis, which would be expected to help direct tumor cells away from or toward sentinel LNs. At present, however, neither of these genes has been directly implicated in these pathways. SIPA1 is a RAPGAP signaling molecule [32] and RRP1B is a chromatin-associated protein of unknown function [28]. Both molecules have been previously implicated in the expression of extracellular matrix genes, which in and of themselves have been associated with metastatic progression. As these variants are present in constitutional DNA, the effect of the different alleles on metastatic disease could be due to modulation of tumor cells, the microenvironments the cells encounter or a combination of both. At present it is not clear which of these possibilities is most applicable. Because of these multiple possibilities and the complexity of each component, the biological mechanism by which these molecules operate is likely to be complex, and unraveling it will require significant additional effort.
Competing interests
JA Foekens received research support from Veridex LLC. Patent applications have been filed by the NCI for RRP1B genotyping.
Authors' contributions
SH performed the genotyping and interpreted the data. ML performed the statistical analysis. AS prepared the samples. JF and KH designed the experiments and interpreted the data. SH, ML, JF and KH wrote the manuscript. All authors read and approved the final manuscript.