Introduction
It is known that the risk of breast cancer is related to lifetime exposure to estrogen [1, 2]. Estrogen stimulates cell proliferation and increases the frequency of spontaneous mutations, leading to a malignant phenotype [3]. Breast cells respond to estrogen via estrogen receptors (ERs) through a defined biochemical process: upon ligand binding, ERs undergo a conformational change that facilitates receptor dimerization, DNA binding, recruitment of ER cofactors, and modulation of target gene expression [4-6].
Endocrine therapy provides strong evidence that attenuation of ER (ESR1) activity can reduce breast cancer risk [7], and women with ER-positive tumors are the most likely to benefit from these treatments [7, 8]. Genetic studies of ESR1, however, have yielded contradictory results. Only recently has a small but significant association between polymorphisms within ESR1 and the risk of breast cancer been demonstrated, through a very large genetic association study [9-11]. Two plausible explanations for the inconsistent results are that the earlier studies had small sample sizes and thus limited statistical power, or that risk was not evaluated after stratifying breast cancer patients by tumor ER status. However, there is at least one further possibility: ER cofactors can either enhance the transcriptional activity of ER as co-activators or inhibit it as co-repressors. Genetic variants within ER cofactors have not been systematically investigated in terms of association with breast cancer risk, although some coding variants within individual genes, such as NCOA3 and CCND1, have been investigated [12-15].
Given the modification of ER activity by its cofactors through their physical and functional interactions [16], the cofactor proteins that bind to ER may be as important as the receptor itself in mediating transcriptional response to estrogen exposure [17]. We therefore hypothesized that genetic variation within ER cofactor genes may alter cellular response to estrogen exposure and consequently, alone or by interacting with genetic variations within ESR1, modify breast cancer risk in an ER status-dependent fashion. To assess this hypothesis, we investigated the association of common genetic variation, using a tagging SNP approach, within 60 cofactor genes in two large case-control samples of breast cancer from Sweden and Finland, and investigated their interaction with genetic variation within ESR1 in terms of influencing the risk of hormone-driven breast cancer.
Materials and methods
Study population
The Swedish sample was from a population-based case-control study that has been described in detail previously [18]. Briefly, the cases were 1,322 Swedish-born women diagnosed with incident primary invasive breast cancer between October 1993 and March 1995 who contributed blood samples. All cases were postmenopausal, between 50 and 74 years of age at diagnosis, and identified through the six regional cancer registries in Sweden. The controls (n = 1,524), who had no previous breast cancer, were randomly selected from the Swedish Registry of Total Population and were frequency-matched for age with the cases. Questionnaires were used to collect risk factor information.
The Finnish sample was from a hospital-based case-control study in which the cases consisted of two series of unselected breast cancer patients and additional familial patients diagnosed at the Helsinki University Central Hospital. The first series comprised 884 patients collected in 1997/1998 and 2000, covering 79% of all newly diagnosed breast cancer cases during those periods [19, 20]. The second series, consisting of 986 newly diagnosed breast cancer patients, was collected during 2001 to 2004 and covered 87% of all such patients during that period [21]. An additional 538 familial breast cancer cases were also collected at the same hospital, as previously described [22, 23]. Women with a prior diagnosis of breast cancer in situ were excluded, leaving 2,215 invasive breast cancer cases for analysis. Healthy female population controls (n = 1,287) were collected from the same geographical regions of Finland as the cases.
Information on reproductive and hormonal risk factors was available for the Swedish sample and showed expected association patterns with breast cancer [24-26]. Such information was not available for the Finnish controls.
Hormone receptor status information was retrieved from medical records of all participating cases and was available for both the Swedish and Finnish cases.
Approval for the study was obtained from the Institutional Review Boards in Sweden, Finland and the National University of Singapore. All subjects provided written informed consent.
DNA isolation
DNA was extracted from 4 ml whole blood using the QIAamp DNA Blood Maxi Kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions.
Candidate gene and tagging SNP selection
In the present study, the keywords 'ER cofactor', 'ER coactivator' and 'ER corepressor' were used in a literature search to identify ER cofactor genes. Boolean searching ('AND', 'OR') was used to narrow or broaden the search in PubMed. Using this method, 60 ER cofactor genes were identified as candidate genes. Tagging SNPs within the 60 candidate genes were selected based on the HapMap CEU data (Rel #22/phase II Apr07, on NCBI B36 assembly, dbSNP b126) [27]. In brief, for each gene, all common SNPs with a minor allele frequency >0.05 within the gene and its 5 kb surrounding region were first identified from the HapMap database [28]. Tagging SNPs were then selected in Haploview version 4.1 [29] using a pair-wise SNP tagging approach with r² > 0.8 as the selection criterion. A total of 806 tagging SNPs were selected within the 60 ER cofactor genes.
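The pair-wise r² criterion can be illustrated with a minimal sketch. It is not Haploview's implementation: the function names, the haplotype-frequency inputs and the lookup-table layout are assumptions made for illustration only. It uses the standard definition r² = D² / (pA(1 - pA)pB(1 - pB)), where D is the difference between the observed haplotype frequency and the product of the two allele frequencies.

```python
def r_squared(p_ab, p_a, p_b):
    """Pairwise LD r^2 from the frequency of the A-B haplotype (p_ab)
    and the frequencies of allele A (p_a) and allele B (p_b)."""
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def is_tagged(snp, tag_snps, pairwise_r2, threshold=0.8):
    """Return True if `snp` is itself a tag or has r^2 > threshold with a tag.

    `pairwise_r2` is a hypothetical lookup keyed by frozenset({snp1, snp2});
    pairs missing from the lookup are treated as uncorrelated.
    """
    return any(
        tag == snp or pairwise_r2.get(frozenset((snp, tag)), 0.0) > threshold
        for tag in tag_snps
    )
```

Under this sketch, a common SNP counts as covered when `is_tagged` returns True for the selected tag set, which is the sense in which coverage of common variation is reported.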
Genotyping
Illumina's GoldenGate assay was used for genotyping SNPs, following the manufacturer's instructions (Illumina, San Diego, CA, USA). In brief, all 806 tagging SNPs were subjected to genotyping assay design, of which 790 SNPs were successfully designed and subjected to genotyping analysis. DNA samples were randomly assigned to the plates carrying positive and negative controls, and all genotyping results were generated and checked by laboratory staff unaware of the case-control status. SNPs with a call rate <96% (81 SNPs in the Swedish sample and 42 SNPs in the Finnish sample) or a minor allele frequency <1% (18 SNPs in the Swedish sample and 40 SNPs in the Finnish sample) were excluded from further analysis. Deviation of genotype frequencies from those expected under Hardy-Weinberg equilibrium was assessed in the control subjects. SNPs with a Hardy-Weinberg equilibrium P < 7.4 × 10⁻⁵ (0.05/675) were excluded (6 SNPs in the Swedish sample and 15 SNPs in the Finnish sample). In total, 685 SNPs from the Swedish sample and 693 SNPs from the Finnish sample were used for statistical analysis, and the 675 SNPs shared between the Swedish and Finnish samples were used for analysis in the combined sample.
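The three exclusion criteria can be sketched as a simple filter. The thresholds are those stated above. The use of a 1-degree-of-freedom chi-square test for Hardy-Weinberg equilibrium is an assumption (the text does not state which test was applied), and the function and argument names are illustrative.

```python
import math

CALL_RATE_MIN = 0.96     # SNPs with call rate < 96% are excluded
MAF_MIN = 0.01           # SNPs with minor allele frequency < 1% are excluded
HWE_P_MIN = 0.05 / 675   # about 7.4e-5; SNPs with HWE P below this are excluded


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P value of a 1-df chi-square test comparing observed genotype counts
    with those expected under Hardy-Weinberg equilibrium (assumed test)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(call_rate, maf, control_genotype_counts):
    """Apply the call-rate, MAF and HWE (controls only) filters to one SNP."""
    return (
        call_rate >= CALL_RATE_MIN
        and maf >= MAF_MIN
        and hwe_chi2_pvalue(*control_genotype_counts) >= HWE_P_MIN
    )
```

The Hardy-Weinberg threshold corresponds to a nominal 0.05 level divided by the 675 SNPs shared between the two samples.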
Genotyping was duplicated in 2% of samples (in both Swedish and Finnish samples) and there was concordance in >99% of the duplicated samples, suggesting high genotyping accuracy. With r² > 0.8, the average coverage of common variation (minor allele frequency >5%) within the 60 candidate genes was 91%. Of these genes, 51 had coverage over 80% (Additional file 1, Table S1).
Reverse transcriptase-quantitative PCR analysis
MCF-7 cells were cultured in DMEM (Invitrogen, Carlsbad, CA, USA) medium with 10% FBS (Invitrogen). Prior to hormone treatment, cells were maintained in phenol-red free DMEM F-12 containing 5% charcoal stripped serum for 72 hours for hormone depletion. Cells were treated with 10 nM 17β-estradiol (Sigma-Aldrich, St. Louis, MO, USA) for a period of 0 or 3 hours. Cells were then harvested, and total RNA extraction and reverse transcriptase-quantitative PCR analysis were carried out as described previously [30]. Dimethylsulfoxide (Sigma-Aldrich, St. Louis, MO, USA)/vehicle-treated cells were used as controls for the same time course. Real-time PCR analysis was performed in the ABI Prism 7700 sequence detection system using SYBR Green from ABI (Applied Biosystems, Foster City, CA, USA).
Primers were designed using the online Primer3 program [31]. All experiments were repeated at least twice. Two sets of primers were used for identifying different isoforms of
PPARGC1B. The oligonucleotide sequences were as follows: PPARGC1B_1 isoform (NM_001172699.1) forward 5'-GAAGAGGAAGAAGGGGAGGA-3' and reverse 5'-CTCTGGTAGGGGCAGTGGT-3'; and PPARGC1B_2 isoform (NM_133263.3) forward 5'-CCTGAAGATGACGTGGGTCT-3' and reverse 5'-CCTTCCTTCTGGGTGTCAGA-3'. β-Actin specific primers (forward 5'-TCCCTGGAGAAGAGCTACGA-3' and reverse 5'-AGGAAGGAAGGCTGGAAGAG-3') were used as an internal control to normalize the amounts of reverse transcribed product used in the PCR reaction. Threshold cycle (Ct) values obtained for
PPARGC1B isoforms were normalized to β-actin Ct values. The normalized Ct (ΔCt) values were then used to calculate the difference (ΔΔCt) between estradiol-treated and dimethylsulfoxide-treated samples. The fold change of PPARGC1B was calculated as 2^(-ΔΔCt).
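The normalization described above can be written out as a short sketch. The function and argument names are illustrative, and the Ct values in the usage note are invented to show the arithmetic; they are not data from this study.

```python
def fold_change(ct_target_treated, ct_ref_treated,
                ct_target_vehicle, ct_ref_vehicle):
    """Relative expression by the 2^(-ddCt) method.

    ct_target_*: Ct of the gene of interest (here, a PPARGC1B isoform)
    ct_ref_*:    Ct of the internal control (here, beta-actin)
    """
    delta_ct_treated = ct_target_treated - ct_ref_treated   # dCt, estradiol-treated
    delta_ct_vehicle = ct_target_vehicle - ct_ref_vehicle   # dCt, vehicle-treated
    delta_delta_ct = delta_ct_treated - delta_ct_vehicle    # ddCt
    return 2.0 ** (-delta_delta_ct)
```

For example, if the target amplifies one cycle earlier after treatment while the reference is unchanged, `fold_change(24.0, 16.0, 25.0, 16.0)` gives ΔΔCt = -1, that is, a two-fold induction.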
Statistical analysis
To measure the magnitude of association between SNPs and breast cancer risk, per-allele odds ratios (ORs) (assuming a log-additive model) and 95% confidence intervals were estimated using logistic regression. As the controls were younger than cases in the Finnish samples, age at diagnosis/enrollment (as a continuous variable) was included in the regression models in the Finnish analysis for OR adjustment. The Cochran-Armitage trend test was used to calculate P values in the Swedish and Finnish sample sets, separately in subtypes, and in cases overall. Inverse variance weighting was used in a meta-analysis for two independent datasets. The individual OR was obtained from age-unadjusted analysis in the Swedish sample and age-adjusted analysis in the Finnish sample. To evaluate differences in ORs between studies, a test of homogeneity was carried out for each individual SNP analysis (data not shown).
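As an illustration of the trend test, the following is a minimal sketch of the Cochran-Armitage statistic for a 2 × 3 table of genotype counts. The additive scores (0, 1, 2) and the normal approximation for the two-sided P value are assumptions consistent with a log-additive model; the function name and input layout are illustrative.

```python
import math


def cochran_armitage_trend(case_counts, control_counts, scores=(0, 1, 2)):
    """Cochran-Armitage trend test for genotype counts ordered by allele dose.

    case_counts, control_counts: counts of genotypes carrying 0, 1 and 2
    copies of the tested allele. Returns (z, two-sided P value).
    """
    r_total = sum(case_counts)       # number of cases
    s_total = sum(control_counts)    # number of controls
    n_total = r_total + s_total
    col = [r + s for r, s in zip(case_counts, control_counts)]

    # Trend statistic: score-weighted difference between cases and controls
    t = sum(w * (r * s_total - s * r_total)
            for w, r, s in zip(scores, case_counts, control_counts))

    # Variance of the statistic under the null hypothesis of no association
    term1 = sum(w * w * c * (n_total - c) for w, c in zip(scores, col))
    term2 = sum(scores[i] * scores[j] * col[i] * col[j]
                for i in range(len(col)) for j in range(i + 1, len(col)))
    var = (r_total * s_total / n_total) * (term1 - 2 * term2)

    z = t / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
    return z, p
```

A positive z indicates that the allele counted by the scores is more frequent in cases than in controls.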
To determine the model of inheritance, associations between SNPs within the PPARGC1B gene and ER-positive breast cancer risk were estimated by assuming dominant, recessive and additive models in the two sample sets. We then combined these analyses by meta-analysis using an inverse variance weighting approach. As before, the individual ORs were obtained from the age-unadjusted analysis in the Swedish sample and the age-adjusted analysis in the Finnish sample.
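Inverse variance weighting can be sketched as a fixed-effect combination of study-specific log ORs. The fixed-effect form, the 1.96 multiplier for the 95% confidence interval and all names are assumptions made for illustration; the inputs would be the log OR and its standard error from each study's logistic regression.

```python
import math


def inverse_variance_meta(log_ors, standard_errors):
    """Fixed-effect meta-analysis of log odds ratios.

    Each study is weighted by 1 / SE^2. Returns the pooled OR, its 95%
    confidence interval and a two-sided P value (normal approximation).
    """
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_log_or = sum(w * b for w, b in zip(weights, log_ors)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    z = pooled_log_or / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    ci = (math.exp(pooled_log_or - 1.96 * pooled_se),
          math.exp(pooled_log_or + 1.96 * pooled_se))
    return math.exp(pooled_log_or), ci, p_value
```

Because the weights depend only on the standard errors, the larger or more precisely estimated study contributes more to the pooled estimate.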
Forward stepwise logistic regression was used to explore whether the associations at the six SNPs were independent of each other. The selection criterion was P < 0.2. The analysis was performed for ER-positive breast cancer risk in the two sample sets separately as well as in the combined ER-positive dataset. To account for different minor allele frequencies in the two populations, a binary indicator variable for study, as well as age, was included in the regression models in the combined analysis.
Pair-wise interaction analysis was performed under a dominant mode of inheritance using logistic regression and likelihood ratio tests. To maximize the statistical power, we pooled sample sets from the Swedish and Finnish data. Age and study were included in the model as covariables. The full model included an interaction term between the two interacting variables for the risk of breast cancer. In this multivariate logistic regression analysis, each coefficient provided an estimate of the log OR whilst adjusting for all other variables included in the model. Likelihood ratio tests, comparing models with and without the interaction term, were used to generate P values.
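The final comparison step can be sketched as follows, assuming the two logistic regression models have already been fitted elsewhere and their maximized log-likelihoods are available. With both SNPs coded under a dominant model, the interaction is a single product term, so the test has 1 degree of freedom. The function names and the 0/1/2 genotype coding are illustrative assumptions.

```python
import math


def dominant(genotype):
    """Dominant coding: 1 if at least one copy of the allele is carried, else 0."""
    return 1 if genotype >= 1 else 0


def interaction_term(genotype_1, genotype_2):
    """Product term entered in the full model for a pair of SNPs."""
    return dominant(genotype_1) * dominant(genotype_2)


def lrt_pvalue_1df(loglik_reduced, loglik_full):
    """Likelihood ratio test between nested models differing by one parameter.

    loglik_reduced: log-likelihood of the model without the interaction term
    loglik_full:    log-likelihood of the model with the interaction term
    Returns (test statistic, P value from a chi-square with 1 df).
    """
    statistic = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    return statistic, math.erfc(math.sqrt(statistic / 2.0))
```

For instance, a log-likelihood improvement of 1.92 corresponds to a statistic of 3.84 and a P value of approximately 0.05.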
All analyses were performed using STATA version 8.0 (StataCorp, College Station, TX, USA). Linkage disequilibrium (LD) calculation was performed in Haploview version 4.1 [29]. All statistical tests were two-sided.
Discussion
To our knowledge, this is the first comprehensive association analysis of common variation within ER cofactor genes in breast cancer where 36 ER co-activators and 24 ER co-repressors were investigated. The utilization of two independent case-control samples of northern European origin allowed us to identify an association based not only on the overall significance in the large combined sample, but also on the consistency of the SNP association between the two individual samples. We found significant associations between PPARGC1B polymorphisms and risk for ER-positive breast cancer, and, importantly, we revealed a synergistic effect between the genetic polymorphisms within PPARGC1B and ESR1.
Genetic association studies of ER cofactor genes have so far been limited. Burwinkel and colleagues reported a significant association of coding variants Q586H and T960T of NCOA3 with familial breast cancer risk, and further suggested that the contribution of the rare alleles to a protective effect against breast cancer may be concentrated in familial breast cancer patients [12]. Whilst two studies have reported an association of the variant Pro241Pro in CCND1 with breast cancer risk [37, 38], other studies have reported negative results for this variant [14, 39, 40]. In particular, Wirtenberger and colleagues investigated the coding variant Ala203Pro of PPARGC1B and found it to be associated with familial breast cancer susceptibility [41]. In our study, we did not observe a significant association between polymorphisms in NCOA3 or CCND1 and breast cancer risk. The Ala203Pro (rs7732671) variant of PPARGC1B, however, is 10 kb away from and not correlated with PPARGC1B SNP rs741581 (r² < 0.05 in HapMap CEU data), and thus would not have been detected by our tagging SNP approach. Nevertheless, both Wirtenberger and colleagues' study and our study support the association of genetic variation of PPARGC1B with particular subtypes of breast cancer.
Importantly, the association of PPARGC1B as well as its synergistic interaction with ESR1 was only observed in breast cancer patients with ER-positive tumors, as would be expected according to the biochemical mechanism of interaction. There is growing evidence that the impact of genetic risk factors on breast cancer varies by hormone receptor status. For example, recent studies by the Breast Cancer Association Consortium have led to the discovery of novel breast cancer susceptibility loci in FGFR2, TNRC9, 8q24, 2q35, and 5p12 that showed stronger association with ER-positive disease than with ER-negative disease [42-45], with fibroblast growth factor receptor also being a direct target of ER. These data suggest that the risk of ER-positive tumors, which epidemiologic studies have shown to be driven by reproductive factors, also has a genomic basis rooted in the constituents of the ER gene regulatory network [46, 47]. In our study, although the two ER-positive datasets were smaller than the two overall datasets, the number of overlapping SNPs between the Swedish and Finnish studies was larger than that observed in the overall breast cancer analysis. Recently, we also demonstrated that genetic variation of the estrogen metabolism pathway - particularly the genes involved in the production of estrogen through androgen conversion - also influences the risk for the development of estrogen-sensitive breast cancer [48]. As in the present study, the effect sizes of the metabolism gene polymorphisms are relatively small but, taken together with PPARGC1B and fibroblast growth factor receptor, they show that the estrogen receptor signaling axis, which engages both upstream and downstream components, may have, in the composite, a significant role in the genesis of the most common form of breast cancer.
The genetic interaction between PPARGC1B and ESR1 is biologically plausible. The PPARGC1B protein PGC-1β is a bona fide ER co-activator [34] that physically interacts with ERα and plays a role in amplifying ER signaling, which provides a convincing biological mechanism for the observed genetic interaction between the two genes. Furthermore, our series of transcriptional regulation analyses in the MCF-7 ER-positive breast cancer cell line has demonstrated that PPARGC1B expression can be induced by estrogen treatment, and this transcriptional response of PPARGC1B is probably mediated by five functional ER binding sites around PPARGC1B that are all engaged in interlocking chromatin loops highly indicative of an ER regulated gene [35]. PPARGC1B may thus be involved in a feed-forward control mechanism with ERα such that ER induction (for example, by estradiol treatment) heightens the expression of PPARGC1B, a co-activator of ER, which in turn increases ER action at the DNA binding site. The feed-forward looping mechanism would therefore further augment the protein interaction between PPARGC1B and ESR1. This putative amplification effect, if confirmed, is another mechanistic model for epistatic interactions between genetic loci and may be one reason for the strength of the PPARGC1B signal in the association study as compared with the other ER cofactors studied.
There are some limitations to our study. Coverage of common variation is insufficient (< 80%) for some genes (Additional file 1, Table S1), so some associations may have been missed. In addition, our tagging SNP selection covers only 5 kb of sequence surrounding the candidate genes, which may have left some associations of regulatory SNPs undetected, such as the one reported within ESR1 [11]. The number of overlapping SNPs between the two datasets is small for both ER-positive and overall breast cancer analyses. This limited overlap could be due to ethnic heterogeneity between the two population samples and to their moderate sample sizes: ethnic heterogeneity may partially explain the low overlap, and the current sample size is not large enough to capture the moderate effects of associated SNPs. Some of the top SNPs in each individual sample set are therefore probably false positives, which contributes to the small overlap of significant SNPs between the two datasets. The limited sample size of ER-negative patients could also explain the nonsignificant results in the ER-negative analysis, since we observed that some associations in the ER-negative analysis are in the same direction as in the ER-positive analysis. ER cofactors are known to work as a multicomponent protein complex, but due to the sample size limitation we are unable to detect interactions among three or more genes simultaneously. It is also worth noting that the contribution of genetic variants to cancer risk depends on both their prevalence and penetrance, and thus the relative importance of individual SNPs may vary from population to population. Further confirmation of our findings in other populations is therefore warranted.
Conclusions
Our study has revealed an association of genetic variation within PPARGC1B with the risk of ER-positive breast cancer. Consistent with the known interaction of PPARGC1B and ER at the molecular level, where PPARGC1B modulates ER activity and thus ER signaling, our study revealed a synergistic effect between genetic variation within the PPARGC1B and ESR1 genes. PPARGC1B has been shown to alter responses to the selective ER modulator tamoxifen [33]. Kressler and colleagues also demonstrated that PPARGC1B indirectly co-activates tamoxifen-bound ERα, which cooperates with NCOA1 to enable tamoxifen agonism in kidney and osteosarcoma cell lines. Lastly, the synergism demonstrated in the present study also suggests that disrupting the interaction between an ER co-activator - such as PPARGC1B - and ERα, or blocking their mutual activation, may represent a sensitive and leveraged strategy for cancer prevention [7]. Our study therefore provides new biological insight into the genetic basis of the more common ER-positive breast cancer and highlights that biochemically and genomically informed candidate gene studies can enhance the discovery of interactive disease susceptibility genes.
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
YQL, SW, KE, HN, KSC, PH, ETL and JJL initiated and designed the study. CB, KC and HN provided the study material and patient information. YQL, CB, TH, KA and SW collected and organized the data. YQL, YL, GLL, THC, DKV, JJL, KH and HD performed data analysis and interpreted results. YQL, SW, YL, DKV, KH, PH, ETL and JJL drafted the manuscript. CB, HD, KH, KE, HN, PH, ETL and JJL performed critical review and revised the manuscript. All authors read and approved the manuscript.