Introduction
Breast cancer (BC) constitutes a heterogeneous group of lesions with differences in clinical presentation, pathological features and biological behavior. Amplification and overexpression of the human epidermal growth factor receptor 2 (HER2) (HER2/neu, ERBB2) oncogene occur in 15 to 25% of invasive BC [1, 2] and define a clinically important subgroup (HER2+). HER2+ BC has traditionally been associated with poor prognosis [
1,
3]; however, the advent of HER2-targeted therapies has changed the natural course of the disease for many patients, representing one of the success stories of modern oncology. Unfortunately, not all patients with HER2+ disease benefit from targeted treatment, and some develop treatment resistance over time. It has become evident through microarray-based studies that BC with genomic amplification of
HER2 (HER2-amplified) constitutes a biologically heterogeneous subgroup of tumors regarding both gene expression patterns and copy number alterations (CNAs) [
4,
5]. Such genomic profiles have predominantly been obtained from array comparative genomic hybridization (aCGH) [
5‐
7], but more recently single nucleotide polymorphism (SNP) microarrays have become increasingly used, allowing simultaneous detection of both CNAs and allelic imbalance (AI) [
8‐
11]. However, due to disease and data complexity, mostly CNA information alone has so far been extracted from SNP array data, and only recently have robust analysis methods emerged that can detect and integrate both CNAs and AI [
10,
12‐
14]. Consequently, HER2-amplified BC has not yet been thoroughly investigated in this respect. We therefore analyzed BC data assembled from different repositories and, by integrating these results with our previous study comprising 200 HER2-amplified tumors [5], were able to define a core set of significant CNAs and recurrent amplifications. Furthermore, using a combination of bioinformatic methods for SNP arrays and quantitative DNA flow cytometry (FCM), we delineated the patterns of loss of heterozygosity (LOH), copy number neutral allelic imbalance (CNN-AI), tumor ploidy, tumor subclonality and occurrence of monoallelic gene amplification. Data from HER2-amplified tumors were compared to data from other subgroups of BC, shedding light on a complex landscape of genomic alterations in a clinically important disease entity.
Discussion
HER2+ BC represents an important clinical subgroup of the disease owing to the availability of effective targeted therapy in both the adjuvant and metastatic settings. Clinically, the subgroup is defined by HER2 gene amplification and/or protein overexpression; however, genome-wide molecular analyses have shown that BC with genomic amplification of HER2 (HER2-amplified BC) is heterogeneous with regard to gene expression patterns, CNAs and outcome [
4,
5,
20]. Thus, further characterization of HER2-amplified tumors at the gene level may have implications for improved diagnosis, prognosis and prediction.
Here we report the first integrated analysis of CNAs and AI in a large cohort of HER2-amplified BC profiled by high-density genomic microarrays, allowing a comprehensive description of the genomic landscape of CNAs, amplifications, LOH and CNN-AI. When comparing results to our previous study of 200 HER2-amplified tumors profiled by BAC aCGH [
5], we corroborated several previous findings regarding, for example, amplifications, and found a striking similarity in the overall pattern of CN gain and loss. By comparing significant CNAs identified by GISTIC analysis in the current and former study [
5], we were able to define a core set of genomic regions commonly affected by CN gain and loss in HER2-amplified BC across different genomic microarray platforms that may serve as a list of potential targets for further studies (Additional file
4). Differences between the two studies may be explained by the use of different array platforms, data analysis methods and cohort composition. Importantly, however, the concordance between our two studies emphasizes that evaluation of CNAs in a heterogeneous subgroup such as HER2-amplified BC needs to be performed in large sample sets in order to pinpoint recurrent alterations.
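The idea of a core set of regions shared between two studies can be illustrated with a simple interval intersection. The sketch below is not the procedure used in this study; it assumes each study yields a list of significant regions with a chromosome, start, end and direction (gain or loss). The `Region` record, the `core_regions` function and the `min_overlap_bp` parameter are hypothetical names introduced only for illustration.

```python
from collections import namedtuple

# Hypothetical record for one significant region (not a GISTIC output format).
Region = namedtuple("Region", ["chrom", "start", "end", "kind"])


def core_regions(study_a, study_b, min_overlap_bp=1):
    """Return the overlapping parts of same-direction regions from two studies.

    Two regions contribute to the core set only if they lie on the same
    chromosome, have the same direction ("gain" or "loss") and overlap by
    at least min_overlap_bp base pairs (an assumed, adjustable cutoff).
    """
    core = []
    for a in study_a:
        for b in study_b:
            if a.chrom != b.chrom or a.kind != b.kind:
                continue
            start = max(a.start, b.start)
            end = min(a.end, b.end)
            if end - start >= min_overlap_bp:
                core.append(Region(a.chrom, start, end, a.kind))
    return sorted(core)


# Toy coordinates, invented for the example only.
snp_study = [Region("17", 100, 500, "gain"), Region("8", 50, 200, "loss")]
bac_study = [Region("17", 300, 800, "gain"), Region("8", 100, 150, "gain")]
shared = core_regions(snp_study, bac_study)
```

In this toy input only the chromosome 17 gain is retained, because the chromosome 8 regions disagree in direction. A real comparison would also need to handle differences in probe density and genome build between platforms, which this sketch ignores.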
Genome-wide analyses of LOH, CNN-AI, tumor ploidy, fraction of aberrant cells and subclonal CN events utilizing genomic microarrays have been scarce in BC due to the often high sample complexity, lack of appropriate analysis methods and low sample numbers. In the current study, we applied GAP [
13] to SNP array data in combination with conventional DNA-FCM to analyze genomic alterations on an allele-specific level, patterns of tumor ploidy, tumor subclonality and fraction of aberrant cells in a large set of HER2-amplified and HER2-negative tumors stratified by molecular subtype. In HER2-amplified cases as well as HER2-negative subgroups the pattern of LOH was, as could be expected from the LOH definition, strongly associated with the pattern of CN loss (Figures
1 and
2, Additional file
7). In contrast, CNN-AI events were more evenly distributed across chromosomes in HER2-amplified tumors, seldom exceeding 20% in frequency and not targeting specific genomic regions (Figure
2). Interestingly, a similar low and evenly distributed CNN-AI pattern was also observed in HER2-negative luminal A, luminal B and normal-like tumors (Additional file
7). In contrast, basal-like tumors showed slightly higher frequencies potentially explained by a higher frequency of triploid cases (3N). This suggests that CNN-AI appears as a less frequent genome-wide additive event in the majority of breast cancers. Moreover, in relation to other BC subtypes the patterns of LOH and CNN-AI were similar to findings by Van Loo
et al. [
10], and also mimicked the general pattern of CN-FGA reported for BC gene expression subtypes [
5,
36]. However, based on our joint analysis of 407 HER2-amplified and HER2-negative cases, we were not able to corroborate the previously reported subtype-specific pattern of aberrant cell estimates [
10] (Figure
4D). This discrepancy between studies warrants further investigation, but indicates that these types of estimations may be difficult to systematically reproduce. Interestingly, the finding in the current study that HER2-amplified and predominantly ER-negative basal-like tumors show lower aberrant cell estimates is consistent with observations of considerable lymphocytic infiltration in these subtypes [
43‐
45]. In agreement with observations in lung cancer [
42] we found that amplifications in HER2-amplified BC were essentially monoallelic, as amplification preferentially targeted one of the two parental chromosomes (Figure
6). This form of amplification may be a mechanism for targeting activating oncogene mutations and has previously been observed on an individual gene level [
46,
47]. The full significance of this putative mechanism, however, remains to be investigated in more detail using, for example, rapidly evolving sequencing techniques.
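To make the allele-specific categories discussed above concrete, the following sketch labels a single genomic segment from its two parental allele copy numbers. It is an illustration under stated assumptions, not the GAP algorithm: LOH is taken to mean that one allele is absent, CNN-AI to mean that the total copy number equals the rounded tumor ploidy while the two alleles are unequal and both present, and monoallelic amplification to mean a high total copy number with at most one copy of the minor allele. The function name `classify_segment` and the cutoff `amp_extra` are hypothetical.

```python
def classify_segment(major, minor, ploidy=2, amp_extra=4):
    """Label one segment from its allele-specific copy numbers.

    major, minor: integer copy numbers of the two parental alleles
                  (swapped internally if given in the other order).
    ploidy:       rounded baseline copy number of the tumor (assumption).
    amp_extra:    hypothetical cutoff; a total copy number of at least
                  ploidy + amp_extra is treated as an amplification.
    """
    if minor > major:
        major, minor = minor, major
    total = major + minor
    labels = set()

    # Copy number change relative to the tumor baseline.
    if total > ploidy:
        labels.add("gain")
    elif total < ploidy:
        labels.add("loss")

    # Allelic state: LOH takes precedence over CNN-AI.
    if minor == 0 and major > 0:
        labels.add("LOH")
    elif total == ploidy and major != minor:
        labels.add("CNN-AI")

    # Amplification carried (almost) entirely by one parental allele.
    if total >= ploidy + amp_extra and minor <= 1:
        labels.add("monoallelic-amplification")
    return labels
```

Under these assumptions a segment with one copy of one allele and none of the other is labeled both loss and LOH, which reflects why the LOH pattern follows the CN loss pattern. A 2+1 segment in a triploid tumor is labeled CNN-AI, in line with the higher CNN-AI frequencies noted for subgroups with more triploid cases.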
Aneuploidization is one of the most common properties of cancer and has generally been associated with worse prognosis and more advanced disease [
48]. In support of an overall higher genomic complexity for aneuploid BC, we found that increasing GAP-ploidy was associated with higher fractions of LOH, CNN-AI and CNAs, as well as higher occurrence of subclonal CN loss events irrespective of BC subtype (Figure
2F and Additional file
8). Not surprisingly, the patterns of DNA ploidy, subclonal CN events, fractions of LOH, CNAs and CNN-AI across HER2-amplified and HER2-negative tumors appear consistent with the overall prognosis for the subgroups. For instance, luminal A and normal-like tumors, which generally display the best outcome, are more frequently diploid and less complex. In contrast, basal-like, HER2-amplified and luminal B cases display more complex patterns in line with their poorer outcome and often higher stage [
24,
36]. Although both GAP and a similar method termed Allele-Specific Copy number Analysis of Tumors [
10] allow estimation of
in silico tumor ploidy from SNP array data, both methods have difficulties in analyzing certain types of samples [
10,
13]. To get a more unbiased analysis of the pattern of DNA ploidy across BC subtypes, we used quantitative DNA FCM data for 338 unrelated BCs also analyzed by gene expression microarrays and BAC aCGH. Using this large sample set we were able to corroborate several findings by Van Loo
et al. [
10], as well as results from our GAP analysis, for example, showing that the molecular BC subtypes display different patterns of tumor DNA ploidy. Shifts between FCM and GAP-ploidy peak positions, exemplified by HER2-negative basal-like tumors (Figures
4A and S5A in Additional file
9), may be explained by the fact that the latter estimation aims to account for normal cell contamination, while the former represents a mere total DNA summarization.
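The per-tumor fractions of LOH, CNN-AI and CNAs referred to above can be thought of as length-weighted summaries over genomic segments. The sketch below shows one way such fractions, together with a length-weighted mean copy number, could be computed. It is an assumption-laden illustration rather than the calculation used in this study; the input layout (tuples of segment length, total copy number and a set of labels) and the function name `genome_fractions` are hypothetical.

```python
def genome_fractions(segments):
    """Summarize one tumor from its segments.

    segments: list of (length_bp, total_cn, labels) tuples, where labels is
              a set of strings such as {"loss", "LOH"} (assumed layout).
    Returns (fractions, mean_cn): the fraction of the covered genome carrying
    each label, and the length-weighted mean total copy number.
    """
    total_len = sum(length for length, _, _ in segments)
    if total_len == 0:
        raise ValueError("no segments with non-zero length")

    fractions = {}
    for length, _, labels in segments:
        for label in labels:
            fractions[label] = fractions.get(label, 0.0) + length / total_len

    mean_cn = sum(length * cn for length, cn, _ in segments) / total_len
    return fractions, mean_cn


# Toy tumor: half of the genome unaltered, a quarter lost with LOH,
# a quarter gained to four copies (values invented for the example).
toy = [(50, 2, set()), (25, 1, {"loss", "LOH"}), (25, 4, {"gain"})]
fractions, mean_cn = genome_fractions(toy)
```

A summary of this kind describes the tumor genome only if the copy numbers have already been corrected for normal cell admixture, which is the step that distinguishes an in silico ploidy estimate from a plain total DNA measurement.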
Interestingly, the bimodal distribution of tumor ploidy displayed by HER2-negative basal-like tumors was also observed in ER-negative HER2-amplified tumors, and in ER-negative tumors in general irrespective of subtype (data not shown). These findings imply that the evolutionary hypothesis for basal-like tumors suggested by Van Loo
et al. [
10], of a reduction from a diploid to a partial haploid state followed by whole-genome duplication, is not limited to a specific molecular subtype but appears to be more general for ER-negative BC. This apparently more general difference in DNA ploidy patterns between ER-positive and ER-negative BC most likely explains the differences in LOH and CNN-AI fractions observed between subgroups/subtypes of HER2-amplified BC as, for example, ER-negative tumors are overrepresented in the HER2-enriched subtype. The HER2-enriched subtype has been found to often comprise the majority of HER2-amplified cases in gene expression studies. However, based on findings from several recent studies, including the current one, it appears clear that 1) the HER2-enriched subtype identified by different single sample predictors is not synonymous with the clinically defined HER2+ subgroup, 2) the subtype includes a notable fraction of HER2-amplified ER-positive cases, 3) HER2-amplified cases are found in all gene expression subtypes at varying frequencies, and 4) HER2-negative cases are found in the HER2-enriched subtype [
5,
20,
24,
25]. As an example of the latter, we found that 7.5% of samples in the 346-sample HER2-negative SNP reference set were classified as HER2-enriched by the PAM50 single sample predictor.
In summary, the comprehensive analysis presented herein confirms and extends several findings about the reported molecular subtypes of BC, but also emphasizes the strong association of different types of genomic aberrations with tumor DNA aneuploidy, irrespective of subtype. The molecular BC subtypes have repeatedly been shown to display different CNAs [
36,
49,
50], and, lately, also differences in fractions of LOH and CNN-AI [
10]. We demonstrate that tumors harboring few CNAs typically also display less LOH, less CNN-AI, lower tumor ploidy and less frequent occurrence of subclonal events, pointing towards an overall lower complexity irrespective of subtype.
Competing interests
JS and ÅB have received honoraria from Roche. The other authors declare that they have no competing interests.
Authors' contributions
JS conceived of the study and performed microarray data analysis with support from GJ and MR. BB performed FCM analysis. JS wrote the manuscript with the assistance of GJ, MR, BB and ÅB. All authors read and approved the final manuscript.