Background
Breast cancer is the most frequent type of cancer in women in both the developed and the developing world [1]. It is a very heterogeneous disease with regard to its molecular profile [2] and clinical course, which presents great interpatient variability. Although conventional histopathological characteristics remain the most important prognostic determinants of survival [3], there is a continuous search for new biomarkers or stage models that could help predict clinical evolution [4] or improve therapy selection. In this regard, genetic variations in carcinogenesis-related processes are natural candidates for exploring new prognostic factors or potential targets for specific therapies [5, 6].
The epidermal growth factor receptor (EGFR) is a transmembrane tyrosine kinase (TK) receptor of the ErbB family, whose activation leads to mitogenic signaling [7]. EGFR is frequently overexpressed in many tumors, including breast cancer, and its activation contributes to unrestricted proliferation, advanced stages of disease, resistance to conventional treatments, and poor prognosis [8]. Despite the recognition that EGFR overexpression in breast tumors may affect disease progression [8], the responses to anti-EGFR therapies in breast cancer are not fully satisfactory [9], and the reasons for this clinical variability are not fully understood.
The EGFR gene, located at 7p12.3-p.1, contains multiple polymorphisms [10], two of which are recognized for their functional effects: a dinucleotide (CA)n repeat sequence polymorphism in intron 1 (rs72554020), which affects gene transcription [11] and appears to modulate EGFR expression in breast tumors [12]; and a single nucleotide change (G → A) in exon 13, which leads to an Arginine (Arg) → Lysine (Lys) substitution in codon 497 (rs11543848), resulting in attenuated TK activity, with consequent reductions in ligand binding, growth stimulation, and induction of the proto-oncogenes myc, fos, and jun [13].
In the present work, we aimed to describe the frequency of these two EGFR polymorphisms among Brazilian breast cancer patients, and to evaluate their impact on breast cancer prognosis, exploring the effects of (CA)n polymorphism on EGFR transcript levels, and the associations of both polymorphisms with histopathological features and prognostic estimates.
Materials and methods
Subjects and study design
The study population consisted of a prospective cohort of Brazilian women with a first diagnosis of unilateral breast cancer and no distant metastases, admitted to the Brazilian National Cancer Institute (INCA) from February 2009 to April 2011, and assigned to tumor resection as their first therapeutic approach. Recruitment occurred before surgery, but inclusion was only completed after diagnosis confirmation by histopathological evaluation of the resected tumor. The study protocol was approved by the Ethics Committee of the Brazilian National Cancer Institute (INCA #129/08), and all patients gave written consent to participate. The REMARK guidelines (REporting recommendations for tumor MARKer prognostic studies) were followed [14].
Histopathological characterization
The histopathological evaluation of resected tumors was performed following institutional routine procedures, and all individual data were obtained from electronic medical records. The histopathological characterization was based on the TNM classification by the American Joint Committee on Cancer [15] and on the Elston-Ellis histological grading system [16].
The data on hormone receptors, i.e. Estrogen Receptor (ER) and Progesterone Receptor (PR), and on the Human Epidermal growth factor Receptor 2 (HER2) status were used for biological classification of the tumors, as proposed by Huober et al. [17]. The Estimated Recurrence Risk (ERR) was inferred by a combination of all histopathological features, as proposed by the Early Breast Cancer Trialists’ Collaborative Group [18], with the following categories: “Low Risk”, characterized by the presence of [age ≥ 35 years, N0 (absence of tumor cells in lymph nodes), G1 (histological grade 1), T1 (tumor size smaller than 2 cm), (ER+ or PR+), HER2-], and absence of peritumoral vascular invasion; “Intermediate Risk”, characterized by N0 in the presence of [age < 35 years, or T ≥ 2, or G ≥ 2, or (ER- and PR-), or HER2+], or by N1 (presence of tumor cells in 1 to 3 lymph nodes) in the presence of [HER2-, and (ER+ or PR+)]; and “High Risk”, characterized by N1 in the presence of [HER2+, or (ER- and PR-)], or by N ≥ 2 (presence of tumor cells in more than 3 lymph nodes).
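For illustration, the ERR categories described above can be expressed as a small decision function. This is only a sketch of the stated rules, not the procedure used in the study: the `Tumor` record and its field names are hypothetical, and the handling of an N0 tumor whose only adverse feature is peritumoral vascular invasion (assigned here to "Intermediate") is an assumption, since the text does not place that case explicitly.

```python
from dataclasses import dataclass


@dataclass
class Tumor:
    """Hypothetical record of the features used for the ERR categories."""
    age: int                  # age at diagnosis, in years
    n_stage: int              # 0 = N0, 1 = N1 (1 to 3 nodes), 2 or more = more than 3 nodes
    grade: int                # histological grade (1 to 3)
    t_stage: int              # 1 = T1 (smaller than 2 cm), 2 or more = larger tumors
    er_positive: bool
    pr_positive: bool
    her2_positive: bool
    vascular_invasion: bool   # peritumoral vascular invasion


def estimated_recurrence_risk(t: Tumor) -> str:
    """Return 'Low', 'Intermediate' or 'High' following the rules in the text."""
    hormone_positive = t.er_positive or t.pr_positive

    # N >= 2 is high risk regardless of the other features.
    if t.n_stage >= 2:
        return "High"

    # N1: high risk if HER2+ or hormone-receptor negative, otherwise intermediate.
    if t.n_stage == 1:
        if t.her2_positive or not hormone_positive:
            return "High"
        return "Intermediate"

    # N0: low risk only when every favorable feature is present.
    all_favorable = (
        t.age >= 35
        and t.grade == 1
        and t.t_stage == 1
        and hormone_positive
        and not t.her2_positive
        and not t.vascular_invasion  # assumption: invasion alone moves N0 to Intermediate
    )
    return "Low" if all_favorable else "Intermediate"
```

Checking the node status first keeps the three categories mutually exclusive, so each tumor receives exactly one label.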
Genotyping analyses
Peripheral blood samples (3 mL) were collected from the subjects, and DNA was extracted using the Blood Genomic Prep Mini Spin Kit (GE Healthcare, Buckinghamshire, UK), following the procedures recommended by the manufacturer. The genotyping analyses were performed using PCR-RFLP for the SNP R497K (rs11543848) or by capillary electrophoresis for the (CA)n repeat polymorphism in intron 1 (rs72554020). The PCR amplifications were performed with the following primers (Life Technologies, Carlsbad, CA, USA): 5′-AGGTCTGCCATGCCTTGT-3′ (sense) and 5′-CAACGCAAGGGGATTAAAGA-3′ (antisense) for R497K; or 5′-TTCTCCTCAAAACCCGGAGAC-3′ labeled with 6-FAM™ (sense) and 5′-GTCACGAAGCCAGACTCGCT-3′ (antisense) for the (CA)n repeat. The R497K PCR products (5 μL) were digested with 5 U of BstNI restriction enzyme (New England BioLabs, Northbrook, IL, USA) at 60°C for 3 hours, and the digestion products were resolved on 2% agarose gel and stained with ethidium bromide for visualization under UV light. The digestion of the homozygous G alleles (Arginine) produced two fragments (100 bp and 56 bp), whereas the homozygous A alleles (Lysine) remained intact (156 bp). The method was validated by direct sequencing of four samples of each genotype.
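The reading of the RFLP band pattern can be summarized in a few lines of code. The sketch below is illustrative only: the function name and the size tolerance are hypothetical, and the heterozygous pattern (all three bands) is inferred from the two homozygous patterns given in the text rather than stated there.

```python
def call_r497k_genotype(fragments_bp, tolerance=5):
    """Infer the R497K genotype from the fragment sizes (bp) seen on the gel.

    Arg (G) alleles are cut into 100 bp + 56 bp fragments; Lys (A) alleles
    stay uncut at 156 bp. `tolerance` is an assumed sizing error in bp.
    """
    def has_band(size):
        return any(abs(fragment - size) <= tolerance for fragment in fragments_bp)

    uncut = has_band(156)                 # evidence of a Lys (A) allele
    cut = has_band(100) and has_band(56)  # evidence of an Arg (G) allele

    if cut and uncut:
        return "Arg/Lys"   # assumption: heterozygotes show all three bands
    if cut:
        return "Arg/Arg"
    if uncut:
        return "Lys/Lys"
    return "undetermined"  # incomplete or unexpected band pattern
```

For example, `call_r497k_genotype([100, 56])` corresponds to a fully digested product, and a pattern with only one of the two digestion fragments is reported as undetermined rather than guessed.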
The (CA)n repeat PCR products (0.5 μL) were denatured at 95°C for 3 min in the presence of 0.5 μL of the GeneScan™ 400HD ROX molecular weight standard (Applied Biosystems, Foster City, CA, USA) and 9.0 μL of Hi-Di™ Formamide (Life Technologies, Carlsbad, CA, USA), cooled to 4°C for 2 min, and then subjected to separation by capillary electrophoresis in an ABI Prism® 3130 Genetic Analyzer, using POP7™ polymer (Applied Biosystems, Foster City, CA, USA). The analyses were performed using the GeneMapper® Software v.3.7 (Applied Biosystems, Foster City, CA, USA). The PCR products identified as homozygous, i.e. those presenting a single retention time at the capillary electrophoresis, were subjected to direct sequencing, using the BigDye® Terminator Kit (Applied Biosystems, Foster City, CA, USA), in order to establish a correspondence between each retention time and the respective number of CA repeats (or allele length).
Quantification of EGFR mRNA
Fresh specimens of breast tumors were dissected by clinical pathologists after tumor resection, frozen in liquid N2, and stored at the Brazilian National Bank of Tumors (BNT-INCA). Frozen sections of breast specimens (with approximately 2 mm) were used for RNA isolation, which was performed using the RNeasy Mini Kit (Qiagen, Valencia, CA, USA), following the manufacturer’s instructions. The RNA samples were stored in RNAse-free distilled water at -80°C, and the corresponding cDNA was synthesized using 2 μg of RNA, with High Capacity cDNA Reverse Transcription Kit (Applied Biosystems, Foster City, CA, USA), according to the manufacturer’s instructions.
The relative quantification of EGFR transcripts was performed using quantitative real-time RT-PCR (TaqMan) assays, in an ABI PRISM 7500 Sequence Detector System (Applied Biosystems, Foster City, CA, USA). Each reaction contained: cDNA templates (approximately 40 ng), 10 μl of reaction mix containing 5 μl TaqMan® Gene Expression Master Mix, and TaqMan® probes, which were as follows: EGFR Hs01076078_m1 (with FAM), PPIA 4326316E (with VIC) (Applied Biosystems, Foster City, CA, USA). The thermal cycling conditions comprised an initial denaturation step at 95°C for 10 min, followed by 40 cycles of 95°C denaturation for 15 sec, and annealing at 60°C for 1 min. The experiments were carried out in 96-well plates, including a nontemplate control, and a reference control, consisting of cDNA obtained from a commercial Human Mammary Gland (HMG) total RNA (Clontech Laboratories, Mountain View, CA, USA). The relative quantification of EGFR mRNA was calculated as the average $2^{-\Delta\Delta C_t}$, where $\Delta\Delta C_t = \Delta C_{t,\mathrm{EGFR}} - \Delta C_{t,\mathrm{HMG}}$, $\Delta C_{t,\mathrm{EGFR}} = C_{t,\mathrm{EGFR}} - C_{t,\mathrm{PPIA}}$, and $\Delta C_{t,\mathrm{HMG}} = C_{t,\mathrm{HMG}} - C_{t,\mathrm{PPIA}}$. All data were generated in triplicates and expressed as median +/- SD with the 25–75 percentiles.
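As a worked illustration of the 2^(-ΔΔCt) calculation, the sketch below computes the relative EGFR quantity for one tumor sample. It rests on two assumptions that are not spelled out in the text: ΔCt for the HMG reference is taken as the EGFR Ct minus the PPIA Ct measured in the HMG cDNA, and the "average" is taken over per-replicate 2^(-ΔΔCt) values. The function and argument names are hypothetical.

```python
from statistics import mean


def relative_egfr_expression(ct_egfr, ct_ppia, ct_egfr_hmg, ct_ppia_hmg):
    """Average 2^(-ddCt) of a tumor sample relative to the HMG reference.

    Each argument is a list of replicate Ct values; EGFR and PPIA values
    at the same index are assumed to come from the same well.
    """
    # dCt of the reference control (HMG), averaged over its replicates.
    delta_hmg = mean([e - p for e, p in zip(ct_egfr_hmg, ct_ppia_hmg)])

    # Per-replicate dCt of the tumor, then ddCt and fold change.
    fold_changes = [
        2 ** -((e - p) - delta_hmg)
        for e, p in zip(ct_egfr, ct_ppia)
    ]
    return mean(fold_changes)
```

With a tumor ΔCt of 5 and a reference ΔCt of 4 in every replicate, ΔΔCt is 1 and the function returns 0.5, i.e. half the EGFR transcript level of the reference tissue.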
Statistical analyses
A descriptive study of the cohort was conducted, presenting measures of central tendency and dispersion for continuous variables, or relative frequencies for each categorical variable. Allelic and genotypic frequencies were derived by gene counting. The histopathological features were dichotomized for better and worse prognostic values, and their associations with EGFR genotypes were evaluated by the Chi-square or Fisher’s exact tests. In the cases of significant associations between EGFR genotypes and independent histopathological variables, the odds ratios (OR) and their respective 95% confidence intervals (95% CI) were tested for linear-by-linear associations, with calculation of trend significances (Ptrend), and definition of phenotypic inheritance models. The odds ratios between EGFR phenotypic groups and histopathological categorical features were adjusted for all other independent clinical variables (ORadjusted) using multiple regression analyses. The comparison of the relative quantities of EGFR mRNA as a function of histopathological features or EGFR genotypes was performed with the GraphPad Prism 5.0 software (GraphPad Software, La Jolla, CA, USA), using the non-parametric Mann–Whitney U-test for comparison of two groups, or the Kruskal-Wallis test for comparison of multiple groups. All other statistical analyses were conducted using SPSS 13.0 for Windows (SPSS Inc., Chicago, Illinois). The threshold for significance was set at P < 0.05.
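Two of the calculations named above, allele frequencies by gene counting and odds ratios with 95% confidence intervals, can be sketched as follows. The text does not state which interval methods were used, so the normal-approximation (Wald) interval for the frequency and the log-based (Woolf) interval for the OR are assumptions, as are the function names; the counts in any call are illustrative, not study data.

```python
import math


def allele_frequency(n_hom_variant, n_het, n_hom_ref, z=1.96):
    """Variant allele frequency by gene counting, with a Wald 95% CI."""
    n_alleles = 2 * (n_hom_variant + n_het + n_hom_ref)
    # Each homozygote carries two variant alleles, each heterozygote one.
    p = (2 * n_hom_variant + n_het) / n_alleles
    se = math.sqrt(p * (1 - p) / n_alleles)
    return p, max(0.0, p - z * se), min(1.0, p + z * se)


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio of a 2x2 table [[a, b], [c, d]] with a Woolf 95% CI.

    All four cells must be non-zero for this approximation.
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper
```

For instance, 10 variant homozygotes, 20 heterozygotes and 70 reference homozygotes give 40 variant alleles out of 200, i.e. a frequency of 0.20. Adjusted odds ratios, as described in the text, would require a regression model and are not covered by this sketch.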
Discussion
The distribution of the two EGFR functional polymorphisms in the Brazilian population was not known before the current study. Our data indicate a frequency of 0.21 (95% CI = 0.19 – 0.24) for the 497K (Lys) allele, and of 0.43 (95% CI = 0.40 – 0.46) for the (CA)16 allele. These results are similar to the frequencies reported for Europeans and North Americans (including African-Americans), either for the R497K polymorphism [19, 20] or for (CA)n [12, 21]. Asian populations, however, appear to have higher frequencies of the Lys allele [22, 23], and different patterns of (CA)n alleles [12, 21, 24, 25].
One difficulty in evaluating the effects of the (CA)n polymorphism on gene transcriptional activity in vivo is the wide distribution of the number of (CA) repeats, with various possible heterozygous genotypes, and no clear model of how the two alleles interact to produce the final cell phenotype. Amador et al. [26] considered the sum of CA repeats of both alleles and showed an inverse correlation between this combined length and the levels of EGFR mRNA in head and neck cancer cell lines. Buerger et al. [12], studying breast tumors, considered the length of the smaller allele, and showed a non-significant tendency for lower EGFR protein expression with increasing allele length. Accordingly, Buerger et al. [27] showed that breast tumors from Japanese patients, who present high frequencies of (CA)20 and other long alleles, had lower amounts of EGFR protein than tumors from German patients, who have a predominance of (CA)16 and other short alleles. Other authors, however, found no correlation between the length of the (CA)n region and the relative quantification of EGFR mRNA [28] or EGFR protein expression [29].
Our data confirm the great dispersion of (CA) lengths and indicate great variability in the expression of EGFR mRNA, with no apparent inverse correlation between EGFR mRNA levels and the number of (CA) repeats, considering either the smaller allele or the combined length within each genotype (data not shown). In order to investigate a possible effect of somatic mutations on the tumoral (CA)n genotype, we evaluated a set of 40 tumor samples. The number of CA repeats was preserved in relation to genomic DNA in all cases (data not shown). Although we did not extend such analyses to all patients, it appears that mutational events, such as loss of heterozygosity, are not affecting the EGFR locus of breast tumors. Nevertheless, an accurate characterization of the impact of EGFR polymorphisms on the gene transcriptional activity in vivo would ideally include quantification of gene amplification in the tumors [27]. In addition, there are two other EGFR polymorphisms (-216G/T or rs712829 and -191C/A or rs712830), located in the promoter region, which might have functional impact on EGFR transcriptional activity [30]. Finally, epigenetic variations may also interfere with EGFR expression [31].
The evaluation of the impact of EGFR polymorphisms on histopathological and molecular characteristics of breast cancer indicated significant associations between R497K variant genotypes and better lymph node status, corroborating the findings of Kallel et al. [32], and between Long/Long (CA)n genotypes and positive PR status. These two associations seem protective in relation to breast cancer evolution, since a greater number of affected lymph nodes increases the risk of systemic metastasis [33], and the lack of PR expression increases the risk of disease progression, especially in post-menopausal women [34].
With regard to the molecular mechanisms underlying lymph node metastases, EGFR appears to activate integrins [35] and metalloproteinases [36], favoring cell differentiation towards an invasive phenotype. The association between the variant allele (Lys) and better lymph node status appears to corroborate the notion of reduced signaling with the variant EGFR isoform [13], leading to lower invasiveness, which reinforces the role of EGFR in breast cancer pathogenesis.
The interaction between the EGFR activity and the PR status might occur via a cross-talk mechanism between steroid and growth factor receptors [37], resulting in activation of the PI3K-Akt-mTOR pathway, which appears to negatively modulate the transcriptional activity of the PR [38]. This negative modulation of ER-mediated functions in breast cancer via EGFR signaling may underlie the mechanism of resistance to hormone therapy observed in tumors with high EGFR expression [39]. Taken together, the associations of EGFR polymorphisms with lymph node metastases and negative PR status appear to corroborate the role of EGFR in breast cancer pathogenesis.
The combined presence of Long/Long (CA)n genotypes and Lys R497K alleles appears to favor better prognostic estimates in breast cancer. Other studies involving different types of cancer also point to an interaction between the two EGFR polymorphisms, with a combined protective effect in relation to disease progression. Zhang et al. [40], evaluating pelvic recurrence in patients with rectal cancer treated with chemoradiation, showed that the highest risk for local recurrence was seen in patients with the reference genotypes, i.e., both 497 Arg alleles and <20 CA repeats. Bandrés et al. [41], studying head and neck cancer, showed that patients with at least one 497 Arg allele and both (CA)n repeats ≤ 16 presented higher risk of death. Press et al. [42], studying metastatic colon cancer, found that men with the Arg/Arg genotype and two short alleles (< 20 CA repeats) had shorter overall survival than men with the Lys/Lys or Arg/Lys variant genotypes and any long allele (≥ 20 CA repeats).
The stratification of breast tumors according to their biological subtypes suggests that the apparently protective effects of EGFR polymorphisms are characteristic of luminal A tumors. This apparently selective effect of EGFR polymorphisms might be due to the lower genomic instability of luminal A tumors in relation to other subtypes, which present more aggressive phenotypes due to superposed molecular alterations [43]. Nevertheless, the small number of non-luminal A tumors limits the statistical power of the analyses, and the confidence of this assumption. In addition, the apparently favorable associations of EGFR polymorphisms with prognostic features at diagnosis cannot be considered as actually predictive of disease progression or therapy response,
Acknowledgements
The authors thank Dr. Guilherme Suarez-Kurtz for the use of laboratory facilities, and the personnel from the Breast Cancer Hospital (HC3-INCA) and from the National Bank of Tumors in the Brazilian National Cancer Institute (BNT-INCA), for logistic support in sample and data collection.
Financial support
This study was supported by grants from Conselho Nacional de Pesquisa e Desenvolvimento (CNPq 474522/2010-5), from Fundação Carlos Chagas Filho de Amparo à Pesquisa no Rio de Janeiro (FAPERJ E-26/110356/2010), from Fundação para o Desenvolvimento Científico e Tecnológico em Saúde (FIOTEC/FIOCRUZ; Projeto Inova ENSP), and from INCT para Controle do Câncer (CNPq 573806/2008-0; FAPERJ E26/170.026/2008). MSL, DNP, JSFV, and VI-B received scholarships from Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), MSL received a scholarship from Ministério da Saúde – INCA, and LCG received a scholarship from CNPq.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
MSL recruited patients, collected clinical information, set up the genotyping and expression assays, characterized genotypes and haplotypes, performed statistical analyses, generated tables and figures, and drafted the manuscript. LCG recruited patients, collected clinical information, and helped with genotyping assays. DNP set up and performed the expression assays, and helped revise the manuscript. JSF-V collected and revised histopathological data, and helped revise the manuscript. VI-do-B recruited patients, collected clinical information, and helped with the statistical analyses. SK conceived the epidemiological design of the cohort. RSMN coordinated the genotyping of the (CA)n polymorphism. MAC coordinated the expression assays, collaborated on data interpretation, and revised the manuscript. RV-J conceived, designed, and coordinated the study, analyzed the data, and wrote and revised the manuscript. All authors read and approved the final manuscript.