Introduction
The BRCA1 and BRCA2 tumor suppressor genes have been established as important high penetrance familial breast cancer susceptibility alleles [1]. Rare mutations of other tumor suppressor genes involved in direct protein-protein interaction with BRCA1/2 including TP53, PTEN, CHEK2, ATM, NBS1, RAD50, BRIP1, and PALB2 were also discovered in breast cancer families, altogether accounting for up to 50% of familial breast cancers [2, 3]. On the other hand, rare germline alterations of potential disease genes have not been investigated for the most common non-familial (sporadic) form of breast cancer, which accounts for the majority (70 to 80%) of all breast cancers in the population.
Tumor suppressor genes known to be somatically inactivated in breast cancers are particularly attractive candidates. SMAD3 and SMAD4 are the key signaling proteins of the transforming growth factor-β (TGF-β) pathway and have been implicated as tumor suppressors in the pathogenesis of breast and other cancer types [4, 5]. The binding of TGF-β to the TGF-β type I and type II receptors results in the activation of SMAD2/3 and hetero-complex formation with SMAD4 [6]; following nuclear translocation, the complex mediates the regulation of genes involved in the suppression of epithelial cell growth. SMAD3 and SMAD4 possess two evolutionarily conserved domains termed Mad-homology 1 (MH1) and 2 (MH2). The N-terminal MH1 domain is a DNA-binding domain recognizing CAGA motifs. The C-terminal MH2 domain is highly conserved and is one of the most versatile protein-interacting domains in signal transduction. It is involved in the interaction with TGFBR1, formation of SMAD homomeric or heteromeric complexes, and transcriptional activation (reviewed in [7]).
The loss of SMAD3 expression and function is involved in susceptibility to gastric cancers, colorectal cancers and acute T-cell lymphoblastic leukemia [
8‐
10]. Several lines of evidence suggest that
SMAD3 may be involved in breast cancer susceptibility. The
SMAD3 locus on chromosome 15q21 has been shown to undergo allelic imbalance [
11]. In addition, SMAD3, like many breast cancer susceptibility genes, is in direct protein-protein interaction with BRCA1 as it counteracts BRCA1-mediated DNA repair [
12] and its MH2 domain has recently been shown to associate with BRCA1 during oxidative stress response [
13]. While inactivating mutations in SMAD3 were previously believed to be absent in all cancer types [14], a putative inactivating missense mutation (R373H) was found in the colorectal cancer cell line SNU-769A [15], and c.1009+1G > A and c.1178C > T (P393L) were identified in a screen of 38 primary colorectal cancers [16]; all of these localize to the MH2 domain.
SMAD4/DPC4 is a tumor suppressor gene that is mutated or deleted in half of all human pancreatic carcinomas [17], and loss of heterozygosity (LOH) has been shown to be important for the progression of gastric [18], cervical [19] and colorectal [20] cancers. At least 15% of breast tumors exhibit LOH at the 18q21 locus on which SMAD4 is situated [21], and breakpoints in this region are associated with minimum copy number [22], suggesting a tumor suppressor role. In addition to pancreatic cancer, SMAD4 is somatically inactivated in colon and biliary cancers [23] and gastric cancer [24], and homozygous deletions of SMAD4 have been detected in a small percentage of invasive ductal carcinomas [25, 26]. In the germline, inactivating SMAD4 mutations are associated with approximately 20% of Juvenile Polyposis Syndrome (JPS) cases [27, 28]. Collectively, mutation analyses in many cancers have highlighted the MH2 domain of SMAD4 as a mutational hotspot [29].
Presently, it is not known whether SMAD3 and SMAD4 germline alterations are involved in breast cancer predisposition. Here, we aimed to explore the mutation spectrum of SMAD3 and SMAD4 by screening the highly conserved MH2 domain in germline DNA from familial and non-familial breast cancer cases as well as age-, gender- and ethnicity-matched healthy population controls.
Discussion
BRCA1 and BRCA2 are the most prominent breast cancer susceptibility genes. However, there remains a need to identify additional susceptibility genes, as it has become increasingly evident that BRCA1/BRCA2 mutations cannot explain all cases of familial breast cancer. Two candidate genes of potential interest in the clinical genetics of breast cancer are SMAD3 and SMAD4, which encode the key signal transduction proteins of the Transforming Growth Factor-β (TGF-β) pathway. The loci on which they reside are frequently lost in breast cancer, but whether germline variants play a role in predisposition to breast cancer has not been studied.
For the discovery of the variants we applied the dHPLC methodology, complemented by direct sequencing, which has been reported to have over 95% sensitivity and accuracy in detecting genetic variations [50]. We targeted the analysis to the functionally critical MH2 domain because it has been shown to be a mutational hotspot in SMAD4 [29], the region where the putative SMAD3 mutations were identified [15, 16], and the region that interacts with BRCA1 [12]. Thus we reasoned that a comprehensive screen of the exons encoding the MH2 domain and the surrounding intronic regions represents the most effective design to detect novel SMAD3 and SMAD4 mutations.
Based on current understanding, mutations in SMAD3 are absent in almost all cancer types while mutations of SMAD4 are frequent in pancreatic and colorectal cancers but rare in breast cancer. However, it has been difficult to ascertain whether SMAD3 and SMAD4 mutations in breast cancer are truly rare or whether this impression reflects the comparatively small sample sizes screened, as noted in COSMIC. Furthermore, whether inactivating germline mutations play a role in breast cancer susceptibility has not yet been investigated.
Our analysis did not detect coding variants in the MH2 domain of SMAD3. In SMAD4 we identified two novel coding variants, c.1350G > A (p.Gln450Gln) (P9) and c.1701A > G (p.Ile525Val) (C24), in a breast cancer case and a population control, respectively, in addition to the previously known c.1214T > C (Phe362Phe) (rs1801250). As it has been suggested that SMAD3 and SMAD4 mutations are rare in breast cancer [14, 26], we quantitatively assessed whether this is also the case in the germline. The identified variants were normalized relative to the base pairs screened and individuals assessed (θ), and our case-control results were compared to three large studies that have established an approximate frequency, based on mutation analysis of germline DNA of healthy individuals, representative of the natural rate of mutation.
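This section does not spell out the normalization formula. As a minimal sketch, assuming θ here is Watterson's per-base-pair estimator (the number of variant sites divided by the sequence length and the harmonic term for the number of chromosomes sampled) and is reported on a 10^-4 scale, it could be computed as follows. The function name, the scaling factor, the no-recombination variance term, and the example inputs are all illustrative assumptions, not values or methods taken from this study.

```python
from math import sqrt


def watterson_theta(n_variants, n_individuals, bp_screened, scale=1e4):
    """Return (theta, sd) per base pair, each multiplied by `scale`.

    Assumes diploid individuals (two chromosomes each) and uses the
    no-recombination variance of the number of segregating sites.
    """
    n_chrom = 2 * n_individuals
    # Harmonic terms over 1 .. n_chrom - 1
    a1 = sum(1.0 / i for i in range(1, n_chrom))
    a2 = sum(1.0 / i ** 2 for i in range(1, n_chrom))
    theta = n_variants / (a1 * bp_screened)
    variance = theta / (a1 * bp_screened) + (a2 * theta ** 2) / (a1 ** 2)
    return theta * scale, sqrt(variance) * scale


if __name__ == "__main__":
    # Hypothetical inputs chosen only to show the call, not study data.
    theta, sd = watterson_theta(n_variants=3, n_individuals=200, bp_screened=1500)
    print(f"theta = {theta:.2f} +/- {sd:.2f} (x 10^-4)")
```

Because the estimator divides by both the length screened and a term that grows with the number of chromosomes, estimates from studies of different sizes can be placed on a common scale, which is what makes a comparison against reference surveys possible.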
We found the nucleotide diversity in the coding region of SMAD3, in both cases and controls, to be far lower than in all three reference studies. This difference is not attributable to a discrepancy in the sensitivity of detection of germline variants, since the frequencies of non-coding variants were comparable for both SMAD3 and SMAD4. This strongly suggests that SMAD3 alteration is very infrequent and that the MH2 domain is under stringent selective pressure where deleterious mutations impeding proper function could also negatively influence tumorigenesis. Within the coding region of SMAD4, on the other hand, nucleotide diversity estimates indicated that variants in cases and controls occur at a similar, albeit lower, rate than in the reference samples. This indicates that SMAD4 is not preferentially mutated in breast cancer, though rare genetic alterations may exist in the MH2 coding region.
The non-coding regions have a higher θ than the reference studies. However, it should be noted that both Cargill et al. and Halushka et al. remarked that their non-coding regions consist of perigenic sequences (< 18 bp from the exon), while our study spans up to 150 bp of the intron and may be more representative of a neutral rate of polymorphism. In fact, Cargill et al. reported that four-fold degenerate sites had the highest nucleotide diversity in their study (θ = 9.73 ± 2.46) and suggested that this value may approximate the neutral rate of polymorphism. If this θ is assumed to be the neutral rate of polymorphism, then what was observed in the non-coding regions of SMAD3 and SMAD4 in cases (θ = 13.24 ± 7.02 and 7.56 ± 3.36, respectively) and controls (θ = 11.24 ± 4 and 11.71 ± 4.17, respectively) would be in agreement.
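The agreement claimed above can be checked informally. The sketch below treats two estimates as "in agreement" when their difference is no larger than the sum of their reported errors; this criterion is an assumption made for illustration and is not a formal test used in the study. The numbers are the θ values quoted in the paragraph, with the control values read as SMAD3 and SMAD4 in that order.

```python
def within_combined_error(est_a, sd_a, est_b, sd_b):
    """True when |a - b| is no larger than the sum of the two reported errors."""
    return abs(est_a - est_b) <= sd_a + sd_b


# Neutral-rate proxy quoted from Cargill et al. (four-fold degenerate sites).
NEUTRAL = (9.73, 2.46)

# Non-coding theta values quoted in the text as (estimate, error).
OBSERVED = {
    "SMAD3 cases": (13.24, 7.02),
    "SMAD4 cases": (7.56, 3.36),
    "SMAD3 controls": (11.24, 4.0),
    "SMAD4 controls": (11.71, 4.17),
}


def compare_to_neutral(observed=OBSERVED, neutral=NEUTRAL):
    """Map each label to whether it agrees with the neutral-rate proxy."""
    return {
        label: within_combined_error(theta, sd, *neutral)
        for label, (theta, sd) in observed.items()
    }


if __name__ == "__main__":
    for label, agrees in compare_to_neutral().items():
        print(f"{label}: {'agrees' if agrees else 'differs'}")
```

Under this loose criterion all four non-coding estimates overlap the neutral-rate proxy, which matches the qualitative statement in the text; the wide error on the SMAD3 case estimate means the check is not very discriminating.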
Intronic variants, which constituted the major type identified in this study, are increasingly found to be associated with splicing defects (and ESE/ESS alterations) causing cancer among other disorders [51]. However, RT-PCR analysis showed no aberrantly spliced transcripts, and no exon skipping was observable in any sample, including the one carrying the novel SMAD4 c.1350G > A variant (P9). It is also possible that aberrant transcripts are unstable and were degraded during blood processing. Although variants disrupting ESEs are associated with decreased splicing efficiency and/or splicing defects, there have been instances in which a gain-of-function ESE mutation strengthens the enhancer element, resulting in preferential exon inclusion. For example, most mutations of microtubule-associated protein tau (MAPT) that are associated with frontotemporal dementia and Parkinsonism linked to chromosome 17 (FTDP-17), a condition related to Alzheimer's disease, are translationally silent but increase the rate of exon 10 inclusion by strengthening ESEs at the 5' end or weakening ESSs at the 3' end [52]. In this regard the c.1350G > A variant may be prioritized for further studies. Based on these results, inactivating SMAD3 and SMAD4 germline mutations and splicing defects appear to occur very infrequently in breast cancer.
While the absence of inactivating MH2 germline mutations from this study provides compelling evidence that
SMAD3 and
SMAD4 mutations are truly rare in breast cancer, this study cannot comprehensively exclude the presence of other mutations since the Mad-Homology 1 (MH1) and the variable linker region were not screened. However, with respect to
SMAD3, our screening did not detect coding variants within the MH2 domain, including the ones previously identified in colon and pancreas. Given that the
SMAD3 mutations are infrequent and that its expression is elevated in peripheral blood and tumor tissues, SMAD3 does not seem to be inactivated and is unlikely to contribute as a tumor suppressor during breast cancer development. With respect to SMAD4, 90% of all known somatic
SMAD4 mutations reported are located in the MH2 domain, suggesting that the number of undetected mutations is expected to be low when analysis is confined to this mutation hotspot. This is also supported by mutation analysis conducted in JPS by Howe
et al., [
28] showing that in 77 patients, inactivating germline
SMAD4 mutations were found in 18.2% (14/77) of the samples and of these, 16.9% (13/77) occurred in the MH2 domain. Similarly, germline mutation analysis by Pyatt
et al., [
27] showed that
SMAD4 is mutated in 18.6% (13/70) of the 70 JPS patients screened and of these, 12.7% (9/70) occurred in the MH2 domain. Lastly, a mutation screen of 56 patient thyroid tumor samples by Lazzereschi
et al. 2005 [
53] identified
SMAD4 MH1 mutations as well as linker mutations leading to splicing defects. Nevertheless, the authors also found that more than half (53% (8/15)) of the mutations were missense mutations in the MH2 domain. By contrast, our study of 408 patient samples and nucleotide diversity analysis both show that inactivating MH2 domain mutations appear to be absent. Thus, by inference the remaining part of the gene is expected to harbor only very rare mutations. It should be noted that germline biallelic inactivations were not addressed in this study. For
SMAD4, homozygous deletion mutations have been identified in invasive ductal carcinomas and it still remains a possibility that biallelic inactivation due to germline homozygous deletions could be playing a significant role in tumorigenesis. This possibility is currently under investigation.
Gene expression in peripheral blood cells has been shown to be altered in early breast cancer but not in healthy controls [
54,
55]. To determine whether any of the variants are associated with altered expression levels we also performed expression analysis in the same sample set. Interpreting how changes in expression of
SMAD3 and
SMAD4 affect their activities in the cell may distinguish their roles as a tumor suppressor or oncogene in breast cancer susceptibility.
There is strong evidence for tumor suppressor function of SMAD3 as its loss is associated with tumorigenesis in various cancers [
8‐
10]. However, our qPCR analysis showed that SMAD3 mRNA was expressed at significantly higher levels in breast cancer cases than in both control groups (BC vs. CO; P < 0.05, t-test), and this was not due to the variants found in the breast cancer cases. Thus, this observation is likely attributable to regulatory factors beyond the MH2 domain. These results, together with the lack of inactivating mutations in this study and in the COSMIC database, strongly suggest that SMAD3 is not functioning as a direct tumor suppressor in breast cancer. Nevertheless, the abnormally high levels of germline expression, as well as the statistically significant over-expression of SMAD3 in invasive ductal carcinoma (IDC) compared to normal tissues, raise the possibility that epistatic interactions of SMAD3 may contribute to the oncogenic activities of TGF-β. SMAD3 has been shown to counteract BRCA1-dependent DNA repair in response to DNA damaging agents and over-expression of SMAD3 decreases BRCA1-dependent cell survival [
12]. Therefore, it is possible that such high levels of germline
SMAD3 expression may mimic a BRCA1-deficient phenotype. Furthermore, the aberrant expression may be a mechanism that reconciles the allelic imbalance often associated with the 15q21 locus in breast cancer [
11] with the apparent lack of
SMAD3 inactivating mutations.
Loss of expression and allelic imbalance at the
SMAD4 locus has been shown to promote carcinogenesis of gastric, ovarian, and colorectal cancers [
18,
47,
48]. Overall, in our study SMAD4 was not differentially expressed in cases compared to controls, and the variants predicted to create cryptic sites or abolish the branch site did not result in aberrant expression. Interestingly, however, the breast cancer case (P9) harboring the novel c.1350G > A variant in exon 10 of
SMAD4, predicted to affect ESEs, showed a significant, almost five-fold increase in expression that was not observed in any other sample examined, indicating that the full-length transcript is preferentially over-produced. Because of the important role of SMAD4 as a tumor suppressor, increased germline expression is unlikely to predispose to breast cancer, suggesting that SMAD4 is not involved in susceptibility. However, it is appreciated that as tumorigenesis progresses the cell becomes increasingly desensitized to the anti-proliferative effects of TGF-β but remains susceptible to its oncogenic properties. Therefore, c.1350G > A could represent a potential prognostic marker, as SMAD4 expression has been shown to be an important mediator in the development of osteolytic bone metastasis in late cancer stages but is not required for its maintenance or progression [
56,
57]. This is consistent with the fact that although SMAD4 mRNA levels and protein expression appear to be decreased in breast cancer relative to normal tissues [
58] they are not significantly correlated with tumor size, metastases, nodal status, histological grade, histological type, or estrogen receptor expression. In fact, there was a trend toward longer survival times in patients with SMAD4 negative tumors [
58] and a loss of expression is also correlated with a decrease in axillary lymph node metastasis [
59]. Thus, the results presented here highlight a potential value for evaluating coding variants that affect ESE/ESS for abnormal expression even if they do not influence splicing.
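This section does not state how relative expression was computed from the qPCR data. As a rough sketch, assuming the widely used 2^-ΔΔCt convention with a single reference gene and roughly 100% amplification efficiency, a fold change such as the almost five-fold increase reported for P9 could be derived as follows. The function name, the parameter names, and the Ct values are hypothetical and are not taken from the study.

```python
def fold_change(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression of a target gene by the 2^-ddCt convention.

    Each Ct is a threshold-cycle value; the reference gene normalizes for
    input amount. Assumes ideal doubling per cycle for both genes.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses threshold two cycles
    # earlier in the sample than in the control, after normalization.
    print(fold_change(24.0, 20.0, 26.0, 20.0))
```

A normalized difference of about 2.3 cycles corresponds to a five-fold change under these assumptions, so small Ct shifts translate into large expression ratios and replicate variation matters when interpreting a single sample.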
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
ET contributed to study design and led the mutation screening and data analysis, and drafted the manuscript. II contributed to data analysis, statistical analyses and helped to draft the manuscript. LB contributed to data analysis and statistical analyses, while JK was responsible for subjects ascertained through the Breast Cancer Family Registry, and helped to revise the manuscript. IL was responsible for subjects ascertained through the Breast Cancer Family Registry and helped to revise the manuscript. HO contributed to study design, the data analysis, and drafting of the manuscript. All authors have read and approved the final version of the manuscript.