Introduction
Breast cancer is a heterogeneous disease, both biologically and clinically. The molecular background behind breast cancer progression is not well understood, but is associated with the accumulation of genetic aberrations, leading to widespread gene-expression changes in breast tumor cells. Consistent with this is the presence of at least four major breast cancer subtypes with distinct expression patterns and clinical outcomes. These subtypes are termed basal-like, ERBB2+, luminal B, and luminal A [
1,
2]. Basal-like and ERBB2+ subtypes are hormone-receptor negative and have poor prognoses. In contrast, luminal breast cancers are characterized by the expression of ER-associated genes, with luminal B tumors having poorer outcomes than luminal A tumors. Although this gene expression-based approach has proven to add significant prognostic and predictive value to pathologic staging, histologic grade, and standard clinical molecular markers [
3], the high cost of expression profiling and the molecular instability of mRNA transcripts have limited its incorporation into clinical settings; therefore, expression-based breast cancer classification has not become a standard method in general practice [
4]. Thus, although breast cancer stratification by gene expression is still considered the gold standard, an urgent need exists for well-defined biomarker panels allowing breast cancer subtyping in clinical diagnostics.
Epigenetic alterations such as aberrations in DNA methylation, microRNA patterns, and post-translational modifications of histones are common molecular abnormalities in cancer [
5]. Furthermore, many studies suggest that epigenetic changes are involved in the earliest phases of tumorigenesis, and that they may predispose stem/progenitor cells to subsequent genetic and epigenetic changes involved in tumor promotion [
6]. Cancer-related disruption of the DNA methylome involves global genomic hypomethylation and regional hypermethylation of cytosine-phosphate-guanine (CpG) islands. The first can lead to chromosomal instability [
7], whereas the second is frequently associated with promoters of tumor-suppressor genes, resulting in their transcriptional silencing. Both such alterations have been frequently observed in breast cancer [
8‐
19].
Analogous to transcriptomic profiling, DNA methylation profiling is thought to allow the molecular classification of human malignancies and the monitoring of cancer progression based on a tumor-specific methylation signature [
20]. At the same time, it facilitates biomarker discovery for the clinical implementation of this process.
Previous epigenetic analyses have identified aberrant DNA methylation signatures associated with molecular subtypes of breast cancer through hormone-receptor and human epidermal growth factor 2 (HER2) status [
21‐
23]; however, very limited information is available on global methylation changes associated with each molecular subtype, as previous studies focused on individual candidate tumor-suppressor genes by using locus-specific methods. Here we have applied an array-based method [
24] for comprehensive DNA methylation profiling to identify differentially methylated genes in breast cancer molecular subtypes. This approach efficiently recognized 15 differentially DNA-methylated loci, which were further validated through pyrosequencing in an independent cohort encompassing 47 basal-like, 44 ERBB2+ overexpressing, 48 luminal A, and 48 luminal B paired breast cancer/adjacent tissues. Our results provide strong evidence for the existence of tumor subtype-specific aberrant methylation profiles, which might be inducers of some transcriptional changes taking place in breast cancer molecular subtypes.
Materials and methods
Patients and tumor characteristics
Samples and associated clinicopathologic data were obtained from the Anatomy Pathological Services of the Txagorritxu Hospital, Oncologic Institute of Donostia, and Donostia Hospital (Basque Country). Samples of breast tumor and corresponding adjacent normal-appearing tissue (located at least 2 cm away from the site at which the tumor was sampled) were collected from 215 patients diagnosed with a ductal infiltrative breast carcinoma.
DNA-methylation measurements were performed on DNA isolated from paraffin-embedded primary breast cancer. All breast specimens were reviewed by experienced pathologists. The inclusion criteria were the availability of paraffin-embedded tissue, a tumor size between 1 and 3 cm, a histologic grade between 1 and 3, and one of the following immunophenotypes: estrogen receptor (ER)-, progesterone receptor (PR)-, HER2-, and CK5/6+, CK14+, or EGFR+ for basal-like tumors; ER+, PR±, and HER2+ for luminal B tumors; ER-, PR-, and HER2+ for ERBB2+ tumors; and ER+, PR+, and HER2- for luminal A tumors. Additional data such as Ki-67 status, p53 mutation, and nodal involvement were also recorded. Ethical approval for the study was obtained from the corresponding Ethics Committees of the Institutions involved.
To minimize contamination in the methylation analysis, we isolated breast cancer cells and paired normal breast epithelial cells from tissues by manual macrodissection. In brief, 10-μm sections were cut from each archival formalin-fixed paraffin-embedded (FFPE) tissue block. For each pair of tissues, the presence of tumor cells in malignant tissues and the absence of cancer cells in normal tissues were confirmed by histopathologic examination.
A total of 500 μl of TE buffer (pH 9) was added to each sample and heated at 100°C for 20 minutes in a water bath. After heating, samples were allowed to cool for 5 minutes before 20 μl of proteinase K was added. Samples were incubated at 56°C overnight until all the tissue fragments were completely dissolved. Subsequent extraction and purification were performed as follows: 500 μl of phenol/chloroform/isopropanol alcohol (25:24:1) was added to the digested tissue, followed by mixing for 10 minutes and centrifugation at 12,000 rpm for 10 minutes. The supernatant fluid was transferred to an autoclaved microtube, and one volume of chloroform/isopropanol alcohol (25:1) was added, mixed by vortexing, and centrifuged at 12,000 rpm for 10 minutes. The upper aqueous supernatant was purified by using a DNA purification kit, following the manufacturer's instructions (Qiagen, Valencia, Spain), and the final yield of DNA was dissolved in 50 μl of TE buffer. Sodium bisulfite modification of 1.5 μg DNA was done with the EZ DNA methylation kit (Zymo Research, Orange, CA, USA) by following the manufacturer's protocol.
Marker discovery study
Illumina GoldenGate Methylation Cancer Panel 1
The Illumina GoldenGate Methylation Cancer Panel was used to analyze 550 ng of starting bisulfite-modified genomic DNA. Methylation was represented as a continuous value from 0 (completely unmethylated) to 1 (completely methylated). This value is calculated by subtracting background hybridization levels obtained from negative control probes on the array and then taking the ratio of the fluorescent signal from the methylated allele (M) to the sum of the fluorescent signals from the unmethylated (U) and methylated alleles plus a constant offset of 100, that is, M/(|U|+|M|+100).
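The ratio described above can be illustrated with a minimal Python sketch. The function name `beta_value` and the clamping of a negative background-subtracted methylated signal to zero are assumptions made here so that the result stays within the stated 0 to 1 range; they are not taken from the study.

```python
def beta_value(methylated, unmethylated, offset=100.0):
    """Methylation level from background-subtracted signals: M / (|U| + |M| + offset).

    Assumption: a negative methylated signal (possible after background
    subtraction) is clamped to zero in the numerator so the value stays in [0, 1].
    """
    numerator = max(methylated, 0.0)
    denominator = abs(unmethylated) + abs(methylated) + offset
    return numerator / denominator


# Hypothetical signal intensities, for illustration only.
print(beta_value(900.0, 0.0))     # strongly methylated locus
print(beta_value(0.0, 5000.0))    # unmethylated locus
```

The constant offset keeps the ratio stable when both signals are close to background.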
Differential methylation analysis
Microarray data were analyzed to identify the most significant tumor subtype-specific changes relative to the adjacent tissue. Differential methylation was assessed by comparing the mean methylation level (β-value) of samples with the mean β-value of the corresponding adjacent tissue by using BeadStudio (Illumina, San Diego, CA, USA) and Qlucore Omics Explorer 2.0 (Qlucore AB, Lund, Sweden) software. Selection of the most significantly differentially methylated loci in each tumor subtype was based on (a) a Δβ difference of at least 0.20 between the tumors and reference group; (b) an FDR-corrected
P value cut off of
P < 0.001, as determined by a two-tailed
t test [
25]; and (c) a
P value < 0.05 when comparing mean methylation values among the studied tumor subtypes by using analysis of variance (ANOVA) test with Benjamini-Hochberg FDR multiple testing correction. The methylation data have been deposited in NCBI's Gene Expression Omnibus (GEO) [
26] and are accessible through GEO Series accession number [GEO:GSE22135].
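As an illustration of how the three selection criteria combine, the following Python sketch applies a Benjamini-Hochberg adjustment to raw P values and then filters loci. It is a simplified stand-in for the BeadStudio and Qlucore workflow, not a reproduction of it; the function names, the dictionary keys, and the example values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def select_loci(loci, delta_min=0.20, t_alpha=0.001, anova_alpha=0.05):
    """Keep loci meeting criteria (a) to (c); `loci` is a list of dicts (assumed layout)."""
    q_ttest = benjamini_hochberg([locus["p_ttest"] for locus in loci])
    q_anova = benjamini_hochberg([locus["p_anova"] for locus in loci])
    selected = []
    for locus, q_t, q_a in zip(loci, q_ttest, q_anova):
        delta_beta = abs(locus["beta_tumor"] - locus["beta_adjacent"])
        if delta_beta >= delta_min and q_t < t_alpha and q_a < anova_alpha:
            selected.append(locus["name"])
    return selected


# Hypothetical loci and P values, for illustration only.
example = [
    {"name": "locus_A", "beta_tumor": 0.62, "beta_adjacent": 0.15, "p_ttest": 1e-6, "p_anova": 0.002},
    {"name": "locus_B", "beta_tumor": 0.30, "beta_adjacent": 0.22, "p_ttest": 1e-5, "p_anova": 0.010},
    {"name": "locus_C", "beta_tumor": 0.10, "beta_adjacent": 0.48, "p_ttest": 0.04, "p_anova": 0.030},
]
print(select_loci(example))
```

In this toy input, locus_B fails the Δβ criterion and locus_C fails the corrected t-test threshold, so only locus_A is retained.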
Validation of the data by pyrosequencing
PCR and pyrosequencing reactions
Selected markers from microarray data analysis were further validated in a larger sample size by bisulfite/pyrosequencing. Additionally, four candidates from the literature were included:
LINE-1, to measure the global hypomethylation of the tumors [
27], and
Let-7a,
Mir-10a, and
Mir-93 microRNAs, because they have been reported to be differently expressed in breast cancer molecular subtypes [
28], and their expression might be regulated by methylation in some tumor subtypes, and thus provide new potential subtype-specific biomarkers.
For the methylation analysis, 1.5 μg of genomic DNA was treated with sodium bisulfite, by using the EZ DNA methylation Kit (Zymo Research, Orange, CA, USA) according to the manufacturer's protocol. All primers were designed by using the Assay Design Software (Biotage, Uppsala, Sweden) and synthesized by MWG (Ebersberg, Germany). PCR amplifications were performed by using Qiagen HotStarTaq Master Mix Kit (Qiagen, Valencia, Spain), 7.5 μM biotinylated primer, 15 μM nonbiotinylated primer, and 2 μl of bisulfite-treated DNA (60 ng). PCR primer sequences, PCR conditions, and sequencing primer sequences are given in Table S1 in Additional file 1. The quality and quantity of the PCR product were confirmed by agarose gel (2%) electrophoresis before the cleanup and pyrosequencing analysis. Pyrosequencing was carried out by using the SQA kit (Biotage, Uppsala, Sweden) on a PSQ 96MA Pyrosequencer (Biotage), and the methylation index was calculated by using the Pyro Q-CpG software (Biotage).
Validation data analysis
Methylation status in tumor versus adjacent tissue
Methylation status was assessed at the studied markers, as previously described by Feng
et al. [
21]. Taking advantage of the paired normal/tumor samples, the value of the normal tissue was used as the reference. Using the pooled normal samples' mean plus twice the standard deviation as a cut-off point (minimum, 10%), we estimated that the probability of the methylation level of a normal-appearing tissue being lower than the cut-off point is >96%. Thus, it is reasonable to assume that samples with a methylation value larger than the cut-off point are likely to be abnormal (or positive). A paired
t test was used to determine whether a statistically significant change was present in the methylation of the markers examined between the tumors and adjacent tissues. Additionally, to allow the assessment of the observed methylation at multiple promoters as a continuous variable, Z-score analysis was used [
29,
30]. A Z score for each gene was calculated by using the given formula:
Z score = (Mean of CpG methylation density of the assessed promoter for each sample - Mean of methylation density for the tumor panel)/SD of methylation density.
A mean Z score was calculated by integrating the promoter-specific Z scores and used as a simple score characterizing mean methylation density. In this analysis, a Z score >0 means methylation greater than the population mean.
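The cut-off rule and the Z-score calculation described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the sample standard deviation is used (the text does not say whether the sample or population form was applied), and the function names and input layout are hypothetical.

```python
from statistics import mean, stdev


def positivity_cutoff(normal_values, minimum=10.0):
    """Pooled normal mean + 2 SD (methylation in %), never below `minimum`."""
    return max(mean(normal_values) + 2 * stdev(normal_values), minimum)


def z_scores(values):
    """Z score of each sample for one promoter, relative to the tumor panel."""
    mu, sd = mean(values), stdev(values)  # assumption: sample SD
    return [(v - mu) / sd for v in values]


def mean_z_per_sample(panel):
    """Average the promoter-specific Z scores for each sample.

    `panel` maps a marker name to a list of methylation values, one per
    sample, with samples in the same order for every marker (assumed layout).
    """
    per_marker = [z_scores(values) for values in panel.values()]
    return [mean(sample_scores) for sample_scores in zip(*per_marker)]


# Hypothetical methylation percentages, for illustration only.
print(positivity_cutoff([2.0, 4.0, 6.0]))
print(mean_z_per_sample({"marker_1": [10, 20, 30], "marker_2": [12, 25, 50]}))
```

A sample whose mean Z score is above 0 is more methylated than the panel average across the markers considered.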
Comparison between methylation status, breast cancer molecular subtypes, and clinicopathologic characteristics
A one-sample Kolmogorov-Smirnov test was used to evaluate the goodness of fit of continuous parameters to a normal distribution. Differences in promoter methylation among tumor subtypes were analyzed with ANOVA or Kruskal-Wallis tests as appropriate. The Wilcoxon signed-rank test was used to compare methylation in paired samples. When differences between two independent groups or clinicopathologic characteristics were considered, a parametric test (Student t test) or nonparametric test (Mann-Whitney U test) was used. Comparisons of categorical variables were made by using Fisher's exact and Pearson's χ2 tests. All reported P values are two-tailed and considered statistically significant if P < 0.05.
Subtype classification
Multivariate logistic regression (MLR) analysis was performed on those biomarkers showing significance in univariate analysis to identify potential biomarker panels capable of discriminating breast cancer subtypes with the best sensitivity and specificity. Models including all possible combinations were constructed and tested by Mallows' Cp selection criterion. The false discovery rate (FDR) of classifying breast cancer subtypes was determined in the best models, and we selected those significant at the FDR <0.2 level.
Supervised hierarchical clustering based on genes selected in the models was performed by using an ANOVA test (Benjamini-Hochberg FDR multiple testing corrected [
25]) to confirm results obtained by MLR (Qlucore Omics Explorer 2.0; Qlucore AB, Lund, Sweden). DNA methylation profiles were standardized to have a mean of zero and a standard deviation of 1, and clustering was performed by using the Euclidean distance and average linkage.
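The standardization and clustering step can be illustrated with a small pure-Python sketch of agglomerative clustering with Euclidean distance and average linkage. It is not the Qlucore implementation. Standardizing each locus across samples with the population standard deviation is an assumption, as are the function names and the example values.

```python
from math import dist
from statistics import mean, pstdev


def standardize_columns(matrix):
    """Scale each locus (column) to mean 0 and SD 1 across samples (rows).

    Assumption: population SD; a constant column would raise ZeroDivisionError.
    """
    columns = []
    for column in zip(*matrix):
        mu, sd = mean(column), pstdev(column)
        columns.append([(x - mu) / sd for x in column])
    return [list(row) for row in zip(*columns)]


def average_linkage(points, n_clusters):
    """Merge the two closest clusters until `n_clusters` remain.

    Cluster distance is the mean Euclidean distance over all cross-cluster pairs.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        return mean(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        pairs = [
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical methylation levels: four samples (rows) by two loci (columns).
samples = [[0.10, 0.20], [0.12, 0.18], [0.80, 0.90], [0.85, 0.88]]
print(average_linkage(standardize_columns(samples), 2))
```

The function returns lists of sample indices, so the two low-methylation samples and the two high-methylation samples in the toy input end up in separate clusters.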
Ethical considerations
The present study involved analysis of DNA from archival tissue with no subject intervention. No identities were linked to subject records. This study was approved by the Txagorritxu Hospital Review Board under the category of exempt status, and no consent form was required from the participants.
Discussion
Identification of gene expression-based breast cancer subtypes is considered a critical means of prognostication and, furthermore, an important predictive marker for the response to endocrine therapy. However, analytic tests relying on RNA measures are difficult to standardize and implement because of the instability of the mRNA transcripts. DNA-methylation profiles reflect phenotypically important differences in gene transcription and, in contrast to most mRNAs, have a very stable structure, making DNA-methylation profile-based diagnostic tests highly accurate and reproducible [
31]. By use of two independent cohorts of invasive breast carcinomas, our study is, to our knowledge, one of the first to deliver specific methylation profiles associated with basal-like, ERBB2+, luminal A, and luminal B molecular subtypes of breast cancer.
Statistical analysis of the methylation microarray data revealed extensive DNA-methylation changes between tumor and adjacent tissue (Figure
1, Table S2 in Additional file
2) and concurrence of some of these DNA-methylation changes with particular breast cancer molecular subtypes and tumor morphologies. Applying stringent criteria, we identified significant differential methylation among the tumor subtypes in 15 of the 1,505 screened CpG islands. Subsequent validation of the selected genes plus four more from the literature by pyrosequencing in an independent cohort of 187 breast tumor/normal pairs confirmed that the genes tested have significantly altered methylation profiles in comparison to normal-appearing adjacent tissue, and that most of the candidate genes (15 of 19) displayed significantly different methylation profiles between tumor subtypes. These results corroborate the potential usefulness of the studied markers for the development of a methylation marker panel defining tumor subtypes. Moreover, the current study represents the first evidence for DNA hypermethylation of
NPY, FGF2, CD40, TAL1, JAK3, SPARC, PRKCDBP, DBC1, SOX1, TNFRSF10D, Let-7a, Mir-10a, and
Mir-93 and hypomethylation of
VAMP8 in breast cancer.
Several genes have been previously reported to be aberrantly methylated in breast cancer (reviewed in [
32‐
34]). Furthermore, methylation in breast cancer has already been connected to breast cancer molecular subtypes, but these observations require further confirmation. As previously suggested, we found that basal-like, ERBB2+, luminal A, and luminal B molecular subtypes displayed specific methylation profiles. Specifically, HER2-enriched breast tumors (ERBB2+ and luminal B) were associated with the hypermethylation of several genes related to cancer development (Table
3). These results are in accordance with previous studies stating that HER2/neu breast cancers are associated with preferential hypermethylation of several genes [
21‐
23]. In addition, Terada
et al. [
35] recently found that frequent CpG island methylation is highly associated with HER2 amplification. Conversely, we observed that basal-like tumors were inversely related to promoter methylation of many of the studied genes (Table
3) and showed the lowest methylation levels among the studied subtypes, as indicated by the mean Z-score values (Figure
3c). These findings are consistent with the recent observations made by Holm
et al. [
36], who reported that basal-like tumors have low methylation levels of several CpG sites, whereas luminal B tumors display high methylation levels.
Additionally, our efforts to identify and validate breast cancer subtype-specific epigenotypes resulted in a significant model based on five biomarkers, which is capable of discriminating basal-like and HER2-overexpressing subtypes. Basal-like tumors showed lack of methylation at NPY, FGF2, HS3ST2, RASSF1, and Let-7a markers, whereas HER2-overexpressing tumors (luminal B and ERBB2+ subtypes) were related to hypermethylation of these markers.
Several authors have speculated that these genomically defined tumor subtypes may represent transformation of stem cells. Some of these hypotheses suggest that mammary stem cells progress to a luminal progenitor state (with an expression pattern similar to that identified in basal breast cancer), which then progresses to differentiated cells with more luminal characteristics [
37]. As can be deduced, this hypothesis locates basal-like tumors in an intermediate differentiation step between the mammary stem cells and the more-differentiated luminal subtypes, which would explain their poor prognosis despite response to chemotherapy [
38]. Many studies have been conducted to define methylation profiles in human stem cells [
39‐
42]. Likewise, Calvanese
et al. [
43] compared the methylation status in human embryonic stem cells (hESCs), cancer cell lines, and normal human primary tissues by using the same platform as that in our study. Their finding was consistent with the view that genes aberrantly hypermethylated in cancer (that is, not hypermethylated in normal tissues) were not hypermethylated in hESCs, termed by these authors the classic class A cancer hypermethylated genes. Interestingly, all the CpG sites related to basal-like tumors in univariate regression analysis, with the exception of RASSF1 (Table
3), belonged to this committed class A category and, in a manner analogous to hESCs, were not hypermethylated in basal-like tumors or in normal breast but were hypermethylated in the remaining breast cancer subtypes, suggesting that basal-like tumors may share some similarities in their methylation patterns with those of the hESC cell lines. These shared methylation signatures may reinforce the hypothesis of basal-like tumors arising from a mammary stem cell and progressing to differentiated cells with more luminal characteristics (HER2 enriched, luminal A and B). A recent study by Holm
et al. [
36] also supported this hypothesis; they observed that basal-like tumors seem to arise from luminal progenitors in which genes initiating a differentiated luminal cell fate are repressed by mechanisms other than promoter methylation, such as the Polycomb repressive complex 2 (PRC2).
Finally, epigenetic therapy, including the use of demethylating agents and histone deacetylase inhibitors, is now in clinical trials for myelodysplastic syndrome, leukemia, and ovarian and lung cancers [
21]. Recent studies have suggested that co-treatment with DNA-methylation inhibitors and histone deacetylase inhibitors might be an effective form of epigenetic therapy for breast cancer, as the interplay observed between DNA methylation and histone modifications can result in synergistic induction of tumor-suppressor genes [
21,
44]. Thus, a possibility exists that epigenetic therapy could play an important role in the immediate future of breast cancer treatment. The information on subtype-specific methylation profiles, described in the present study, might promote a better understanding of the epigenetic regulation mechanisms in breast cancer, thereby contributing to the improvement of epigenetic therapy.
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
NGB, TL, AFF, and AAS carried out the methylation assays and helped prepare the manuscript. IG, AV, RR, and IR provided tissues and clinical data. MCA and MJA carried out HER2, CK5/6, CK14, and EGFR1 assays and provided pathologic support for tissue microdissection. DM and JD performed the statistical analysis. MMP, TL, MFF, MR, and JKF conceived the study and prepared the manuscript. All authors read and approved the final manuscript.