Background
Heterogeneity in biomarker expression between tumors is the basis for breast cancer subtyping and precision medicine [1]. However, intratumoral heterogeneity, often reflecting spatial heterogeneity of biomarker expression within a single tumor, has important implications for accurate tumor classification, and it may impact both epidemiologic research [2] and clinical decision-making [3].
Approximately 10–20 % of tumors are found to have disagreement in estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2) status upon repeat assay, as assessed by studies examining interlaboratory agreement rates [4–6]. A variety of technical factors contribute to lack of interlaboratory agreement, including differences in antibody or assay type; level of laboratory experience; and tumor sampling, fixation, and storage protocols [4, 7–14]. In addition to these technical factors, repeat assays are commonly carried out using a separate tumor block and therefore may test a different area of the tumor, suggesting that spatial heterogeneity of biomarker expression may also contribute to discordance [15]. However, the frequency and sources of intratumoral ER, PR, and HER2 heterogeneity have not been evaluated in population-based studies.
Using tissue microarrays (TMAs) comprising two to four tumor cores for each of 1085 cases from the Carolina Breast Cancer Study (CBCS) in the African American Breast Cancer Epidemiology and Risk (AMBER) Consortium, we identified cases with core-to-core discordance in ER, PR, and HER2 status using automated digital image analysis. Discordant cases were manually reviewed to identify technical and biological factors contributing to variability in biomarker expression. We estimated the frequency of intratumoral ER, PR, and HER2 heterogeneity among biomarker-positive cases and evaluated the impact of biomarker discordance on case-level ER, PR, and HER2 status agreement between TMAs and the clinical record.
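The screening step described above, flagging cases whose cores disagree so that they can be sent for manual review, can be sketched in a few lines of Python. This is a minimal illustration only: the function names, the case identifiers, and the per-core status labels are hypothetical and are not taken from the study's image analysis pipeline.

```python
from typing import Dict, List


def is_discordant(core_statuses: List[str]) -> bool:
    """Return True when a case's cores do not all share the same status."""
    return len(set(core_statuses)) > 1


def flag_discordant_cases(cases: Dict[str, List[str]]) -> List[str]:
    """List identifiers of cases with core-to-core discordance, sorted for review."""
    return sorted(
        case_id for case_id, statuses in cases.items() if is_discordant(statuses)
    )


# Hypothetical per-core ER calls for three cases (two to four cores each).
er_calls = {
    "case_A": ["positive", "positive", "positive"],
    "case_B": ["positive", "negative"],
    "case_C": ["negative", "negative", "negative", "negative"],
}
to_review = flag_discordant_cases(er_calls)
```

In this sketch only `case_B` is flagged. The flag says nothing about why the cores disagree; distinguishing technical from biological causes is left to manual review.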
Discussion
Intratumoral biomarker heterogeneity may pose a challenge for accurate classification of breast cancer, with implications both for clinical decision-making and for epidemiologic research. However, the frequency and sources of intratumoral ER, PR, and HER2 heterogeneity have not been well-characterized, particularly in population-based studies. Using TMAs comprising multiple cores per case, we observed that cases with discordant biomarker status between cores by automated digital image analysis had reduced agreement with the clinical record. Manual review of discordant cases revealed that 35–56 % of discordant biomarker status between cores was caused by spatially heterogeneous expression of ER, PR, and HER2, which was observed in 2 %, 7 %, and 8 % of all biomarker-positive cases, respectively.
Our findings demonstrate that automated algorithms cannot reliably distinguish between IHC-stained tumor and nontumor cells. Therefore, admixture of tumor and DCIS and/or benign epithelium can potentially lead to tumor biomarker misclassification by automated analysis if biomarker status is discordant between tumor and nontumor tissues. Synchronous DCIS and invasive cancers typically share tumor characteristics and hormone receptor status [21]. However, HER2-positive DCIS within an HER2-negative invasive tumor has been observed [22], and this may pose a challenge for the use of digital algorithms to properly classify the HER2 status of invasive carcinomas. In addition, admixed benign epithelium, which often expresses both ER and PR, can produce false positivity in hormone receptor-negative tumors. However, we previously showed that computing average biomarker expression across cores after weighting cores by tumor cellularity diminishes the influence of small discordant regions and produces high agreement (≥88 % for all biomarkers) with the clinical record.
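As a rough illustration of the cellularity-weighted averaging described above, the following Python sketch computes a case-level percent positivity from per-core values. The function name, the use of tumor cell counts as weights, the example values, and the 10 % cutoff are all assumptions made for illustration; they are not details taken from the study's pipeline.

```python
from typing import Sequence


def weighted_case_positivity(
    percent_positive: Sequence[float],
    tumor_cell_counts: Sequence[int],
) -> float:
    """Average per-core percent positivity, weighting each core by its tumor cell count."""
    if len(percent_positive) != len(tumor_cell_counts):
        raise ValueError("one tumor cell count is required per core")
    total_cells = sum(tumor_cell_counts)
    if total_cells <= 0:
        raise ValueError("at least one core must contain tumor cells")
    weighted_sum = sum(p * n for p, n in zip(percent_positive, tumor_cell_counts))
    return weighted_sum / total_cells


# Hypothetical case: two cellular cores with high expression and one
# sparsely cellular core with no expression.
cores_percent = [80.0, 75.0, 0.0]
cores_cells = [2000, 1800, 200]
case_percent = weighted_case_positivity(cores_percent, cores_cells)

ASSUMED_CUTOFF = 10.0  # illustrative positivity threshold, not the study's
case_status = "positive" if case_percent >= ASSUMED_CUTOFF else "negative"
```

Because the discordant core contributes only 200 of 4000 tumor cells, it lowers the case-level value only slightly (to 73.75 % here), whereas an unweighted mean of the three cores would be pulled down to about 52 %.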
Intratumoral ER heterogeneity has previously been suggested to be a rare phenomenon [23], although the frequency in a population-based setting has not been established. Using an automated approach to identify cases with discordant ER status between cores, followed by manual review, we observed intratumoral heterogeneity of ER expression in 2 % of all ER-positive cases. These results are consistent with prior studies suggesting that the frequency of intratumoral ER heterogeneity ranges from 0.5 % to 10 % [23–26]. It has been hypothesized that some intratumoral heterogeneity could be technical in origin, arising from inadequate sample fixation, and this may contribute to the higher heterogeneity rates reported by some studies. However, differential rates of heterogeneity across different biomarkers in our study and the tiny minority of samples with simultaneous heterogeneity of more than one biomarker suggest that this may be an unlikely explanation for our findings. We also show that inadequate tumor sampling may contribute to biomarker discordance, as tumors with low cellularity were more likely to have discordant ER and PR status between cores. This finding supports our previous research in the AMBER Consortium showing that ER and PR agreement rates between TMAs and the clinical record were reduced in cases with low tumor cellularity [17]. Our frequency estimate for intratumoral PR heterogeneity (7 % of PR-positive cases) appears lower than that reported previously (approximately 20 % in two studies [23, 24]). However, one of these prior studies used whole-tissue slides from a consecutive series of patients with breast cancer treated in a tertiary care facility [23], while the other examined agreement between core needle biopsy and surgical specimens in women presenting with a palpable mass [24]. As such, in contrast to our present analysis, these prior studies likely overrepresent a more aggressive set of cancers. If heterogeneity is associated with tumor aggressiveness as hypothesized, this could contribute to differences in frequency across studies.
We observed two types of intratumoral HER2 heterogeneity. Cases with equivocal and positive cores formed the majority, comprising 21 % of cases with at least one HER2-positive core, while only 8 % of cases with at least one positive core also had at least one negative core. A prior study reported the presence of both negative and positive HER2 regions in only 1 % of 921 cases [27], while others reported similar or even lower rates of intratumoral HER2 heterogeneity using IHC analysis [22, 28]. Researchers in several studies have also reported very low rates of heterogeneity of HER2 amplification status using in situ hybridization techniques [27–29]. However, in these prior studies, researchers reported the frequency of HER2 heterogeneity among all cases, and not just among those with areas of HER2 positivity (defined by the presence of at least one positive core in our study). If we had included all cases in our denominator, only 1 % of all cases would have had both positive and negative HER2 cores, in line with prior studies [22, 27, 28].
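The effect of the denominator on the reported frequency can be made concrete with a short Python sketch. The counts below are hypothetical round numbers chosen only so that the two percentages come out at 8 % and 1 %; they are not the study's actual case counts.

```python
def percent(numerator: int, denominator: int) -> float:
    """Express numerator/denominator as a percentage."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * numerator / denominator


# Hypothetical counts, for illustration only (not study data).
total_cases = 1000
cases_with_positive_core = 125          # at least one HER2-positive core
cases_with_positive_and_negative = 10   # both positive and negative cores

# Same numerator, two different denominators.
among_positive = percent(cases_with_positive_and_negative, cases_with_positive_core)
among_all = percent(cases_with_positive_and_negative, total_cases)
```

With these assumed counts, the same ten heterogeneous cases correspond to 8 % of cases with at least one positive core but only 1 % of all cases, which is why frequencies are comparable across studies only when the denominators match.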
Tumors with spatially distinct areas of high and low biomarker expression levels may suggest a pattern of heterogeneity referred to as segregated heterogeneity [30]. Segregated heterogeneity may be particularly clinically relevant because antiestrogen or HER2-directed therapy may apply a selective pressure for outgrowth of areas lacking the molecular target, with consequences for the subtype of subsequent disease recurrence [31, 32]. Studies of recurrent tumors, particularly those with a subtype distinct from the primary tumor, may be important for understanding the consequences of intratumoral heterogeneity. Similarly, longitudinal studies with quantitative histology and well-characterized spatial biomarker patterns may help improve understanding of the impact of intratumoral heterogeneity on breast cancer outcomes. If intratumoral heterogeneity proves to be a poor prognostic feature as theorized, identification of demographic and tumor characteristics associated with intratumoral heterogeneity could help to identify patients who may benefit from more extensive tumor workup and, potentially, more aggressive therapy. This work is currently underway in the AMBER Consortium.
This study should be considered in light of some limitations. First, the tumor specimens used for clinical workup may have been biopsy specimens or separate blocks from those used to construct central TMAs, and therefore it is possible that the clinical record and the central results represent distinct tumor regions. However, different origins of tumor specimens would be a random source of error, unlikely to bias our findings away from the null. Second, even multiple 1.0-mm TMA cores represent only a small portion of the entire tumor, and therefore it is possible that we underestimated the frequency of intratumoral heterogeneity in the present study. However, our rates of intratumoral heterogeneity are similar to those reported previously. Finally, due to tumor sampling at the time of breast cancer surgery, we were unable to assess temporal intratumoral heterogeneity in this study. Despite the theoretical importance of temporal heterogeneity [32], spatial heterogeneity at the time of tumor excision is arguably the most relevant for clinical management of breast cancer.
These limitations are balanced by several important strengths. Since automated staining of TMAs is becoming more widely used [17, 33], we assessed automated evidence of intratumoral heterogeneity (i.e., biomarker discordance between TMA cores), and our results can therefore be used to guide manual review. Our automated image analysis methods are well-validated and produce very high agreement with manual scoring of TMAs in CBCS [17]. The analysis of the population-based CBCS ensured excellent representation of both African American and non-African American cases in this study, and we were able to infer that race does not strongly influence rates of intratumoral heterogeneity. In addition, procurement of tissue from multiple clinical centers, representing community-based and referral centers, ensured that our study was not biased toward more aggressive cancers commonly seen in referral centers. Given that clinical biomarker status was measured at multiple different laboratories and according to multiple protocols, the substantial rates of agreement between central TMA results and the clinical record provide reassurance that ER, PR, and HER2 staining are well-standardized across clinical care settings.
Abbreviations
AMBER Consortium, African American Breast Cancer Epidemiology and Risk Consortium; CBCS, Carolina Breast Cancer Study; Conc, concordant; DCIS, ductal carcinoma in situ; Disc, discordant; ER, estrogen receptor; H&E, hematoxylin and eosin; HER2, human epidermal growth factor receptor 2; IHC, immunohistochemical; PR, progesterone receptor; TMA, tissue microarray
Acknowledgements
We acknowledge comments from the peer reviewers that helped focus the discussion.