Background
Breast cancer (BC) incidence is increasing in women under age 40 years in the US [1] and is increasingly common worldwide in women under age 50 years [2, 3]. Decline in the age of breast development [4] may account for some of the change. Age at menarche, a long-established risk factor for breast cancer, has been relatively stable in recent decades [5]. Because the interval between early breast development and menarche (referred to as pubertal tempo), during which the breast may be more susceptible to carcinogens, has widened, it is essential to have other measures of pubertal development [6]. Height, age at breast development, age at menarche, and increased tempo were each independently associated with an increase in BC risk in a large prospective cohort study [7]. Compared with height and age at menarche, age at breast development has been more challenging to determine.
Breast development is often assessed using Tanner stages (TS), which are routinely used in clinical evaluation. TS range from TS1 to TS5 and are evaluated separately for breast and pubic hair. This paper focuses on breast TS: TS1 refers to no breast development; TS2 to the first appearance of breast buds; TS3 to a stage where the areola and breast are larger than just buds but the areola does not protrude from the breast; TS4 to a stage where the nipple is raised above the breast; and TS5 to the mature breast. TS is generally assessed by a clinician using visual inspection followed by palpation, but can also be evaluated by self-report or maternal report using drawings of TS with explanatory text [8]. TS reported by parents or by the girls themselves has been less reliable and valid than clinician reports, with parents being more accurate reporters of TS in children before age 11 years and children more accurate reporters after age 11 [9].
Breast development can also be tracked through imaging methods, although most of them, such as dual-energy x-ray absorptiometry, magnetic resonance imaging, or mammography, are too expensive to use routinely in young girls, involve exposing the breast to ionizing radiation, or both. Breast tissue composition is associated with mammographic breast density (MBD), which represents the fraction of connective and glandular tissue relative to adipose tissue [10–12]. The tissue components giving rise to MBD have distinct optical absorption spectra, which led to the development of optical spectroscopy (OS) methods to examine breast tissue composition using visible and near-infrared light. OS has been shown to identify women of mammographic screening age who have >75% MBD [13] and are at elevated risk of BC, with sensitivity and specificity >0.9 [14, 15]. Studies in younger women (31–40 years of age) showed that OS measures were strongly associated with parity [16], another well-established BC risk factor. Here we present an extension of the OS technique adapted for the developing breast of girls aged ≥10 years and demonstrate the utility of this method for detecting breast development TS, adjusting for age, body mass index (BMI), and breast cancer risk score (BCRS). We further examined whether BCRS was associated with OS components.
Discussion
Using PCA of visible and near-infrared (NIR) spectra from breast tissue, we were able to capture over 99% of the variation in breast tissue optical properties through eight PCs. Unlike the linear increase with age and BMI, OS components had distinct patterns by TS, suggesting that OS can be used to objectively identify breast TS.
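As a rough illustration of how the share of spectral variation captured by a set of PCs can be computed, the following sketch runs PCA via singular value decomposition on a synthetic matrix of spectra. The data, array shapes, function names, and the use of plain NumPy are assumptions made for illustration; they do not reproduce the study's spectra or software.

```python
import numpy as np


def pca_by_svd(spectra):
    """PCA of a (samples x wavelengths) array via SVD of the mean-centered data.

    Returns (scores, loadings, explained_variance_ratio).
    """
    centered = spectra - spectra.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)  # variance share of each PC
    return u * s, vt, explained


def components_needed(explained, threshold=0.99):
    """Smallest number of leading PCs whose cumulative variance share reaches threshold."""
    cumulative = np.cumsum(explained)
    return int(np.searchsorted(cumulative, threshold) + 1)


# Synthetic stand-in data (hypothetical): 60 "spectra" at 50 wavelengths,
# generated from 3 latent factors plus a small amount of noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(60, 3))
mixing = rng.normal(size=(3, 50))
spectra = latent @ mixing + 0.001 * rng.normal(size=(60, 50))

scores, loadings, explained = pca_by_svd(spectra)
print("PCs needed for 99% of variance:", components_needed(explained))
```

Applied to real spectra, the same cumulative sum over the explained-variance shares indicates how many PCs must be retained to reach a chosen threshold such as 99%.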
During early-stage breast development, the majority of the optical information pertains to the skin, the subcutaneous tissue (including adipose tissue), and the pectoral muscle, whereas for the later TS the optical signal of the pectoral muscle is replaced by that of the actual breast tissue. The PC scores that are correlated with each stage are sufficient to capture the changing ratios of muscle to adipose to glandular tissue within the optically sampled volume in girls’ chests during puberty.
Spectroscopically, the most striking features in the PC spectra are the strong peaks at 930 nm and 970 nm, representing lipid and water absorption, respectively. These peaks both appear inversely in PC1 and are visible in PC2, PC3, PC5, and PC6, and are statistically significant, reflecting a change in the adipose (lipid) and proliferating glandular (water) tissue. While the spectral components of the main tissue chromophores overlap (see Additional file 1: Figure S1C), the short wavelength range is dominated by the hemoglobins, whereas the long wavelength range is affected by collagen [26].
The current PCA, while somewhat difficult to visualize, nevertheless provides strong evidence of the ability to stage breast development in an objective manner. Each of the current PCs carries information on the various tissue chromophores, as shown in Additional file 2: Table S3, which reports the correlations and P values for PC1–8 and each chromophore. The final separation of the chromophores requires significant additional computation. As Table S3 illustrates, each PC is related to a set of chromophores, but it is the direction and strength of these associations that change as the breast develops. PC1, which accounts for the greatest variation, is dominated by the overall attenuation rather than the contributions of specific chromophores. The other components, however, reveal that additional adipose and dense tissue is present as the breast develops, that the ratio between the two changes, and that there is less signal from the pectoral muscle.
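The kind of PC-to-chromophore association summarized in Additional file 2: Table S3 can be sketched as a correlation coefficient with an accompanying P value. The example below is a minimal sketch on synthetic numbers: it computes Pearson's r and, as a stand-in for whatever test the study actually used, a two-sided permutation P value. All variable names and data are hypothetical.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def permutation_p_value(x, y, n_permutations=2000, seed=0):
    """Two-sided permutation P value for the Pearson correlation of x and y."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    observed = abs(pearson_r(x, y))
    hits = 0
    for _ in range(n_permutations):
        # Shuffling y breaks any real association with x.
        if abs(pearson_r(x, rng.permutation(y))) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical example: one PC's scores against an estimated lipid signal.
rng = np.random.default_rng(1)
lipid = rng.normal(size=40)
pc_scores = 0.8 * lipid + 0.3 * rng.normal(size=40)

print("r =", pearson_r(pc_scores, lipid))
print("P =", permutation_p_value(pc_scores, lipid))
```

The sign of r gives the direction of the association and its magnitude the strength, which are the two quantities the discussion above describes as changing with breast development.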
For example, PC2 scores are related to the amount of dense tissue, which increases as the breast matures from TS2 to TS4. At the transition from TS1 to TS2, which is the onset of breast development, PC3 scores become positive and remain positive through TS4, signaling an increase in lipids or adipose tissue as the breast develops. Thus, the onset of breast development is marked by an increase in adipose tissue. In addition, the PC3 scores have a large negative component at shorter wavelengths, indicating a reduction in hemoglobin and/or myoglobin within the optically measured tissue volume, and hence breast tissue with lower relative blood volume and less contribution from the pectoral muscle compared to TS1 (see Additional file 2: Table S3). The increased relative absorption by lipids at the expense of water, and hence glandular tissue, is also present, as shown by the declining contribution of the PC2 scores. Transition to TS3 was also marked by an increase in PC6 scores, reflecting additional lipid content, and an increase in PC7 scores, reflecting lower collagen.
Interestingly, although PC4 and PC5 scores did not map clearly to TS, they differed by BCRS. As Additional file 2: Table S3 reveals, high PC4 scores indicate increased collagen in the optically measured tissue volume together with decreased hemoglobin content and oxygenation, and high PC5 scores indicate less lipid.
We identified OS-derived principal components (PC2, PC3, PC6, and PC7) that mapped to breast developmental stage. In particular, the complementarity of spectral features in PC2 and PC6 and the unique short wavelength absorption in PC3 are sufficient to capture the changing ratio of muscle to adipose to glandular tissue in girls’ chests during puberty, as indicated by the multivariate regression results (Tables 3, 4, and 5) and the random forest variable importance plots (Additional file 1: Figures S4A-B). Thus, this preliminary study suggests that OS-derived measures have the potential to predict breast developmental stage in preteen and teen girls.
Furthermore, three OS-derived principal components (PC4, PC5, and PC8 scores) together best predicted BCRS. The PC4 and PC8 scores correlated negatively and significantly with BCRS, indicating that those with higher scores in these variables tend to come from BCFH- families. The PC5 scores correlated positively with BCRS, implying that those with higher scores in this variable tend to come from BCFH+ families. It is of interest that the lipid-water ratio, previously identified as a breast cancer risk factor in adult women, is not prominent in these spectra, but there is strong absorption at the short wavelengths and at long wavelengths beyond 970 nm; this suggests that the relative hemoglobin and collagen contributions may play a role in BCFH status.
Acknowledgements
The authors thank the LEGACY girls and family members for their continuing contributions to the study, and our colleagues at the participating clinics. We also acknowledge the diligent work of Brenda Ornelas, Jennifer Xanthopoulos, Victoria Kuta, Jennifer Batchelor, Rohini Gosai, Pauline Susanto, and Nayana Weerasooriya, who assisted in data collection. We thank the contributing clinical centers (Clinical Genetics at Trillium Health Partners - Credit Valley Hospital, Cancer Risk Assessment Centre at the Juravinski Cancer Centre, Princess Margaret Hospital Familial Breast and Ovarian Cancer Clinic, Mount Sinai Familial Breast Cancer Clinic, and Granovsky Gluskin Family Medicine Centre of Mount Sinai Hospital).