Background
Breast cancer is a heterogeneous disease. Gene expression signatures delineate five major subtypes with distinct pathologies, treatment strategies and clinical outcomes [
1,
2]. The basal-like subtype (BLBC), accounting for ~18% of diagnoses, has a particularly poor prognosis, largely due to the molecular and clinical heterogeneity of these tumours. Because the molecular drivers of BLBC are poorly understood, few targeted therapies are available.
One such molecular driver of a subset of BLBCs is mutation in the breast and ovarian cancer susceptibility gene (
BRCA1) [
3,
4]. Mutations in
BRCA1 occur in approximately 0.25% of European women, predisposing them to breast and ovarian cancer, particularly to poor prognosis subtypes including BLBC and high-grade serous ovarian cancer (HGSOC) [
3]. These two subtypes of cancer are similar in terms of their gene expression profiles and genetic dependencies, and thus present similar sensitivity to therapeutic targeting [
5‐
7]. BRCA1 has many cellular functions including transcription and gene splicing, yet is best known for its role in mediating DNA damage repair [
8,
9]. BRCA1 coordinates efficient repair of double stranded DNA breaks through the homologous recombination (HR) pathway [
9]. In the absence of functional BRCA1, cells accumulate mutations and genomic instability and demonstrate an increased frequency of genomic rearrangements [
10].
Another molecular driver of BLBC is the Helix-Loop-Helix (HLH) transcriptional regulator inhibitor of differentiation 4 (ID4). We and others have previously shown ID4 to be important for both mammary gland development and the aetiology of BLBC [
11‐
13]. ID4 is overexpressed in a subset of BLBC patients, marking patients with poor survival outcome, and is necessary for the growth of BLBC cell lines [
11,
14‐
21]. Precisely how ID4 mediates this function in BLBC is unclear.
ID proteins (ID1–4) lack a basic DNA binding domain, and thus, their classical mechanism of action is believed to entail dominant-negative regulation of canonical binding partners, basic HLH (bHLH) transcription factors. ID proteins dimerise with bHLH proteins and prevent them from interacting with DNA, affecting the transcription of lineage-specific genes [
22‐
26]. Yet this model of ID protein function is largely based on evidence from studies of ID1–3 in non-transformed fibroblasts, neural and embryonic tissue. ID proteins are tissue specific in their expression and function, and hence, this model may not apply to all four ID proteins across various tissues and in disease [
25,
27‐
32]. Indeed, contrary mechanisms have been described for ID2 in liver regeneration, with ID2 interacting with chromatin at the c-Myc promoter as part of a multi-protein complex to repress c-Myc gene expression [
33,
34], and ID4 has been shown to bind to and suppress activity of the ERα promoter and regions upstream of the ERα and FOXA1 genes in mouse mammary epithelial cells [
12]. These data suggest that despite lacking a DNA binding domain, ID proteins may interact with chromatin complexes under certain conditions. However, no studies have systematically mapped the protein or chromatin interactomes for any ID family member.
To this end, we applied chromatin immunoprecipitation-sequencing (ChIP-seq) to interrogate the ID4-chromatin binding sites and rapid immunoprecipitation and mass spectrometry of endogenous proteins (RIME) to determine the ID4 protein interactome. In addition, ID4 knockdown and RNA-sequencing analysis were used to determine transcriptional targets of ID4. These analyses reveal a novel link between ID4 and the DNA damage repair apparatus.
Methods
Mammalian cell culture growth conditions
Cell lines were obtained from American Type Culture Collection and verified using cell line fingerprinting. HCC70 and HCC1937 cell lines were cultured in RPMI 1640 (Thermo Fisher Scientific) supplemented with 10% (v/v) FBS, 20 mM HEPES (Thermo Fisher Scientific) and 1 mM Sodium Pyruvate (Thermo Fisher Scientific). MDA-MB-468 and OVKATE cell lines were cultured in RPMI 1640 (Thermo Fisher Scientific) supplemented with 10% (v/v) FBS, 20 mM HEPES (Thermo Fisher Scientific) and 0.25% (v/v) insulin (Human) (Clifford Hallam Healthcare). Cells stably transduced with SMARTChoice lentiviral vectors were grown in the presence of 1 μg/mL puromycin.
Imaging
Immunohistochemistry images were obtained using an inverted epifluorescence microscope (Carl Zeiss, ICM-405, Oberkochen, Germany). Images were captured by the Leica DFC280 digital camera system (Leica Microsystems, Wetzlar, Germany). The Leica DM 5500 Microscope with monochrome camera (DFC310Fx) or the Leica DMI SP8 Confocal with 4 lasers (405 nm, 488 nm, 552 nm and 638 nm) and two PMT detectors was used to capture standard fluorescent and confocal images.
SMARTChoice inducible lentiviral system
ID4 and control lentiviral shRNA constructs (SMARTChoice) were purchased commercially (Dharmacon, GE, Lafayette, CO, USA). Successfully transduced cells were selected using puromycin resistance (constitutively expressed under the human EF1α promoter). For ID4 knockdown analysis, HCC70 cells transduced with SMARTChoice shID4 #1 #178657 (VSH6380-220912204), SMARTChoice shID4 #2 #703009 (VSH6380-221436556), SMARTChoice shID4 #3 #703033 (VSH6380-221436580) or SMARTChoice shNon-targeting (VSC6572), and mock-infected cells, were used. Cells were treated with vehicle control or with 1 μg/mL doxycycline for 72 h before harvesting protein and RNA directly from adherent cells. The SMARTChoice shID4 #2 #703009 (VSH6380-221436556) produced the highest level of ID4 knockdown and was used for further analysis.
Non-lethal DNA damage induction with ionising radiation
Cells were seeded at 2.5 × 10⁵ (HCC70) or 2.2 × 10⁵ (MDA-MB-468) cells/well in a 6-well plate in normal growth medium. One day post seeding, cells were exposed to 2–5 Gy of ionising radiation using an X-RAD 320 Series Biological Irradiator (Precision X-Ray, CT, USA). Cells were returned to normal tissue culture incubation conditions and harvested at designated time points.
Gene expression analysis
Total RNA was prepared using the miRNeasy RNA extraction kit (Qiagen), according to the manufacturer’s instructions. cDNA was generated from 1000 ng RNA with oligo-dT primers using the Transcriptor First Strand cDNA Synthesis Kit (Roche), following the manufacturer’s instructions. mRNA expression levels were analysed by qPCR using Taqman probes (Applied Biosystems/Life Technologies) as per the manufacturer’s specifications (Table
1) using an ABI PRISM 7900 HT machine. qPCR data was analysed using the ΔΔCt method [
35].
Table 1
Genes analysed and the corresponding Taqman assay used to analyse their expression level
Gene | Taqman assay |
ID4 | Hs02912975_g1 |
B2M | Hs99999907_m1 |
GAPDH | Hs02758991_s1 |
NEAT1 | Hs01008264_s1 |
MALAT1 | Hs00273907_s1 |
ELF3 | Hs00963881_m1 |
GBA | Hs00986836_g1 |
ZFP36L1 | Hs00245183_m1 |
FAIM | Hs00216756_m1 |
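As a minimal sketch of the ΔΔCt calculation cited in the gene expression analysis above, the following Python function computes relative expression under the common assumption of a two-fold amplification per cycle. The function name, the choice of reference gene and the Ct values are hypothetical and serve only as an illustration.

```python
def ddct_fold_change(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method (assumes an efficiency of 2)."""
    # dCt: target gene normalised to the reference gene within each sample
    dct_treated = ct_target_treated - ct_ref_treated
    dct_control = ct_target_control - ct_ref_control
    # ddCt: treated sample relative to the control sample
    ddct = dct_treated - dct_control
    return 2.0 ** (-ddct)


# Hypothetical Ct values: target vs reference gene, doxycycline vs vehicle
fold = ddct_fold_change(27.0, 18.0, 25.0, 18.0)
print(fold)  # 0.25
```

In this illustration a ΔΔCt of 2 cycles corresponds to a four-fold reduction in target mRNA relative to the vehicle control.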
Protein analysis
Cells were lysed, unless specified, using RIPA [0.88% (w/v) Sodium Chloride, 1% (v/v) Triton X-100, 0.5% (w/v) Sodium Deoxycholate, 0.1% (w/v) SDS, 0.61% (v/v) Tris (Hydroxymethyl) Aminomethane and protease and phosphatase inhibitors (Roche)] or Lysis Buffer 5 (10 mM Tris pH 7.4, 1 mM EDTA, 150 mM NaCl, 1% Triton X-100 and protease and phosphatase inhibitors). If required, protein was quantified using the Pierce BCA Protein Assay Kit (Thermo Fisher Scientific) according to the manufacturer’s instructions. Western blotting analysis was conducted as previously described [
11]. MDC1 protein was analysed using 3–8% tris/acetate gels and PVDF nitrocellulose membrane (BioRad). All other proteins were analysed using the LiCor Odyssey system (Millennium Science, Mulgrave, VIC, Australia). Protein expression was analysed using antibodies targeting ID4 (Biocheck anti-ID4 rabbit monoclonal BCH-9/82-12, 1:40,000), β-Actin (Sigma anti-Actin mouse monoclonal A5441, 1:5000) and MDC1 (Sigma anti-MDC1 mouse monoclonal M2444, 1:1000).
Co-immunoprecipitation
Co-immunoprecipitations (IP) were conducted using 10 μL per IP Pierce Protein A/G magnetic beads (Thermo Fisher Scientific) with 2 μg of antibody: IgG rabbit polyclonal (Santa Cruz sc-2027), ID4 (1:1 mix of rabbit polyclonal antibodies: Santa Cruz L-20: sc-491 and Santa Cruz H-70: sc-13047) and MDC1 (rabbit polyclonal antibody Merck Millipore #ABC155). Beads and antibody were incubated at 4 °C on a rotating platform for a minimum of 4 h. Beads were then washed three times in lysis buffer before cell lysate was added to the tube. Lysates were incubated with beads overnight at 4 °C on a rotating platform. Beads were washed three times in lysis buffer and resuspended in 2× NuPage sample reducing buffer (Life Technologies) and 2× NuPage sample running buffer (Life Technologies) and heated to 85 °C for 10 min. Beads were separated on a magnetic rack, and supernatant was analysed by western blotting as described above.
Rapid immunoprecipitation and mass spectrometry of endogenous proteins (RIME)
Cells were fixed using paraformaldehyde (PFA) (ProSciTech, Townsville, QLD, Australia) and prepared for RIME [
36] and ChIP-seq [
37,
38] as previously described. Cross-linking was performed for 7 min for RIME experiments and 10 min for ChIP-seq, ChIP-exo and ChIP-qPCR experiments. Samples were sonicated using a Bioruptor Standard (Diagenode, Denville, NJ, USA) for 30–35 cycles of 30 s on/30 s off (sonication equipment kindly provided by Prof. Merlin Crossley, UNSW). IP was conducted on 60 (ChIP-seq/ ChIP-exo) to 120 (RIME) million cells using 100 μL beads/20 μg antibody. Correct DNA fragment size of 100–500 bp was determined using 2% agarose gel electrophoresis.
Patient-derived xenograft tumour models were cross-linked at 4 °C for 20 min in a solution of 1% Formaldehyde (ProSciTech), 50 mM Hepes–KOH, 100 mM NaCl, 1 mM EDTA, 0.5 mM EGTA and protease inhibitors (H. Mohammed, personal communication). Samples (0.5 mg of starting tumour weight) were dissociated using a Polytron PT 1200E tissue homogeniser (VWR) and sonicated using the Branson Digital Sonifier probe sonicator (Branson Ultrasonics, Danbury, CT, USA) with a microtip attachment for 3–4 cycles of 10 × [0.1 s on, 0.9 s off].
Mass spectrometry analysis was conducted at the Australian Proteomic Analysis Facility (APAF) at Macquarie University (NSW, Australia) [
39]. Briefly, samples were denatured in 100 mM triethylammonium bicarbonate and 1% w/v sodium deoxycholate; disulfide bonds were reduced in 10 mM dithiothreitol and alkylated in 20 mM iodoacetamide, and proteins were digested on the beads using trypsin. After C18 reversed phase (RP) StageTip sample clean up, peptides were submitted to nano liquid chromatography coupled mass spectrometry (MS) (nanoLC-MS/MS) characterisation. MS was performed using a TripleTOF 6600 (SCIEX, MA, USA) coupled to a nanoLC Ultra 2D HPLC with cHiPLC system (SCIEX). Peptides were separated using a 15-cm chip column (ChromXP C18, 3 μm, 120 Å) (SCIEX). The mass spectrometer was operated in positive ion mode using a data-dependent acquisition method (DDA) and a data-independent acquisition mode (DIA or SWATH), both using a 60-min acetonitrile gradient from 5 to 35%. DDA was performed on the top 20 most intense precursors with charge states from 2+ to 4+ and a dynamic exclusion of 30 s. SWATH-MS was acquired using 100 variable precursor windows based on the precursor density distribution in data-dependent mode. MS data files were processed using ProteinPilot v.5.0 (SCIEX) to generate Mascot generic files. Processed files were searched against the reviewed human SwissProt reference database using the Mascot (Matrix Science, MA, USA) search engine version 2.4.0. Searches were conducted with tryptic specificity, carbamidomethylation of cysteine residues as a static modification and oxidation of methionine residues as a dynamic modification. Using a reversed decoy database, the false discovery rate was set to less than 1%, and peptides were required to score above the Mascot-specific peptide identity threshold. For SWATH-MS processing, ProteinPilot search outputs from DDA runs were used to generate a spectral library for targeted information extraction from SWATH-MS data files using PeakView v2.1 with SWATH MicroApp v2.0 (SCIEX).
Protein areas, summed chromatographic area under the curve of peptides with extraction FDR ≤ 1%, were calculated and used to compare protein abundances between bait and control IPs.
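The protein-area comparison described above can be sketched as follows. This is an illustrative outline only: the tuple layout, the protein and IP labels and all numbers are hypothetical, and the actual processing was performed in PeakView rather than in custom code.

```python
from collections import defaultdict


def protein_areas(peptides, fdr_cutoff=0.01):
    """Sum peptide areas per (protein, IP), keeping peptides with FDR <= cutoff."""
    areas = defaultdict(float)
    for protein, ip, area, fdr in peptides:
        if fdr <= fdr_cutoff:
            areas[(protein, ip)] += area
    return dict(areas)


# Hypothetical peptide-level records: (protein, IP, peak area, extraction FDR)
peptides = [
    ("MDC1", "ID4", 4000.0, 0.001),
    ("MDC1", "ID4", 6000.0, 0.005),
    ("MDC1", "IgG", 500.0, 0.002),
    ("ACTB", "ID4", 900.0, 0.004),
    ("ACTB", "ID4", 5000.0, 0.050),  # excluded: fails the 1% FDR filter
    ("ACTB", "IgG", 1000.0, 0.003),
]

areas = protein_areas(peptides)
for protein in ("MDC1", "ACTB"):
    # Ratio of bait IP area to control IP area for each protein
    ratio = areas[(protein, "ID4")] / areas[(protein, "IgG")]
    print(protein, ratio)
```

A ratio well above 1 indicates a protein more abundant in the bait IP than in the control IP; any cut-off applied to such ratios would be a separate analytical choice not specified here.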
Chromatin immunoprecipitation-quantitative real-time PCR analysis
Chromatin immunoprecipitation (ChIP) was conducted as described previously [
37]; however, following overnight IP, the samples were processed using a previously described protocol [
40].
DNA was purified then quantified using quantitative real-time PCR analysis. Control regions analysed and primers used are listed in Table S
4.
Relative enrichment of each region/primer set was calculated from the average Ct of the duplicate reactions. The input Ct value was subtracted from the sample Ct value, and the resulting ΔCt was converted using the respective primer efficiency (PE) of each primer set. The relative ChIP enrichment was then calculated by dividing the converted value for the gene region of interest by that of a control region that is negative for both ID4 and H3K4Me3 binding (IFFO1/NOP2 #1 primer). The formulae for this normalisation are:
$$ \Delta \mathrm{Ct}={\mathrm{Ct}}_{\mathrm{region}\ \mathrm{of}\ \mathrm{interest}}-{\mathrm{Ct}}_{\mathrm{input}\ \mathrm{region}} $$
$$ \mathrm{ChIP}\ \mathrm{enrichment}=\frac{{\mathrm{PE}}^{\left[-\Delta \mathrm{Ct}\left(\mathrm{region}\ \mathrm{of}\ \mathrm{interest}\right)\right]}}{{\mathrm{PE}}^{\left[-\Delta \mathrm{Ct}\left(\mathrm{IFFO}1\right)\right]}} $$
A sample was considered to be enriched if the fold-change over IgG control for each region was > 2.
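The normalisation can be illustrated with a short Python sketch that follows the prose description above (input-subtracted Ct, conversion by primer efficiency, then division by the negative control region). The function names and Ct values are hypothetical, and applying the same control-region normalisation to the IgG IP is an assumption made for the example.

```python
def converted_signal(ct_sample, ct_input, pe):
    """PE raised to the negative input-subtracted Ct for one primer set."""
    return pe ** -(ct_sample - ct_input)


def chip_enrichment(region, control):
    """Ratio of the region of interest to the negative control region.

    Each argument is a (ct_sample, ct_input, pe) tuple for one primer set.
    """
    return converted_signal(*region) / converted_signal(*control)


# Hypothetical mean Ct values with a primer efficiency of 2.0
id4 = chip_enrichment(region=(26.0, 24.0, 2.0), control=(29.0, 24.0, 2.0))
igg = chip_enrichment(region=(28.0, 24.0, 2.0), control=(29.0, 24.0, 2.0))

# Enrichment call: fold-change over the IgG control greater than 2
print(id4, igg, id4 / igg > 2)  # 8.0 2.0 True
```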
Chromatin immunoprecipitation-sequencing
Chromatin immunoprecipitation and sequencing (ChIP-seq) was conducted as previously described [
37]. Samples were prepared and sequenced at Cancer Research United Kingdom (CRUK), Cambridge, UK. Antibody conditions for ChIP are the same as those used for RIME, with the addition of antibodies targeting H3K4Me3 (Active Motif #39159) and γH2AX (Ser139) (1:1 mix of Cell Signalling #2577 and Merck Millipore clone JBW301). Samples were sequenced at CRUK using an Illumina HiSeq 2500 single-end 50-bp sequencing. Quality control was conducted using FastQC [
41] and sequencing adapters trimmed using cutadapt [
42]. Reads were aligned using Bowtie for Illumina v0.12.7 [
43] followed by Sam-to-Bam conversion tool [
44] and alignment using BWA v0.7.5a [
44]. Alignment statistics were generated using samtools flagstat [version 0.1.18 (r982:295)] [
44]. ChIP-seq peaks were called using the peak calling algorithms HOMER v4.0 and MACS v1.4.2 [
45,
46].
Chromatin immunoprecipitation-exonuclease sequencing (ChIP-exo) was conducted as previously described [
38]. Samples were prepared and sequenced at CRUK.
qPCR analysis of ChIP DNA
Publicly available H3K4Me3 ChIP-sequencing data and the ID4 ChIP-sequencing data generated in this project were visualised using UCSC Genome Browser (
genome.ucsc.edu and [
47]). Regions of positive and negative enrichment were selected and the 500–1000-bp DNA sequence was imported into Primer3, a primer design interface, web version 4.0.0 [
48]. Primers were designed with a minimum primer amplicon length of 70 bp. Primers were confirmed to align with specific DNA segments by conducting an in silico PCR using UCSC Genome Browser (
genome.ucsc.edu and [
47]). Oligo primers were ordered from Integrated DNA Technologies (Singapore). Primers were tested to determine adequate primer efficiency (between 1.7 and 2.3). All assays were set up using an EPmotion 5070 robot (Eppendorf, AG, Germany) and run on an ABI PRISM 7900 HT machine (Life Technologies, Scoresby, VIC, Australia). Briefly, reactions were performed in triplicate in a 384-well plate. Each reaction consisted of 1 μL 5 μM Forward primer, 1 μL 5 μM Reverse primer, 5 μL SYBR Green PCR Mastermix (Thermo Fisher) and 3 μL DNA. A standard curve was created using unsonicated, purified DNA extracted from the HCC70 cell line in 10-fold dilutions (1, 0.1, 0.01, 0.001, 0.0001).
PCR cycling was as follows: 1 cycle at 50 °C for 2 min, 1 cycle at 95 °C for 10 min, followed by 40 cycles of 95 °C for 15 s and 60 °C for 1 min. A dissociation step was conducted at 95 °C for 15 s and 60 °C for 15 s. Data was analysed and a standard curve created using SDS 2.3 software (Applied Biosystems). The slope was used to calculate the PE using the qPCR Primer Efficiency Calculator (Thermo Fisher Scientific, available at
thermofisher.com).
Patient-derived xenograft and histology
All experiments involving mice were performed in accordance with the regulations of the Garvan Institute Animal Ethics Committee. NOD.CB17-Prkdcscid/Arc mice were sourced from Australian BioResources Ltd. (Moss Vale, NSW, Australia). Assoc. Prof Alana Welm (Oklahoma Medical Research Foundation) kindly donated the patient-derived xenograft (PDX) models used in this study. Models were maintained as described elsewhere [
49]. Tumour chunks were transplanted into the 4th mammary gland of 5-week-old recipient NOD.CB17-Prkdcscid/Arc mice. Tumours were harvested at ethical endpoint, defined as having a tumour approximately 1 mm³ in size or deterioration of the body condition score. At harvest, a cross-section sample of the tumour was fixed in 10% neutral buffered formalin (Australian Biostain, Traralgon, VIC, Australia) overnight before transfer to 70% ethanol for storage at 4 °C before histopathological analysis. The formalin fixed paraffin embedded (FFPE) blocks were cut in 4-μm-thick sections and stained for ID4 (Biocheck BCH-9/82-12, 1:1000 for 60 min following antigen retrieval using pressure cooker 1699 for 1 min, Envision Rabbit secondary for 30 min). Protein expression was scored by a pathologist using the H-score method [
50].
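As a brief illustration of H-score arithmetic, the sketch below uses the commonly stated definition (1 × % weakly stained cells + 2 × % moderately stained cells + 3 × % strongly stained cells, giving a range of 0–300). This definition is assumed here rather than taken from the cited reference, and the percentages are hypothetical.

```python
def h_score(pct_weak, pct_moderate, pct_strong):
    """H-score from the percentage of cells at each staining intensity (0-300)."""
    pcts = (pct_weak, pct_moderate, pct_strong)
    if min(pcts) < 0 or sum(pcts) > 100:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    return 1 * pct_weak + 2 * pct_moderate + 3 * pct_strong


# Hypothetical tumour: 20% weak, 30% moderate, 40% strong, 10% negative
print(h_score(20, 30, 40))  # 200
```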
Fluorescent in situ hybridisation
Tissue sections were analysed using fluorescent in situ hybridisation (FISH) to examine the genomic region encoding ID4 (6p22.3). ID4 FISH Probe (Orange 552–576 nm, Empire Genomics, NY, USA) was compared to the control probe CEP6 (Chromosome 6, Green 5-Fluorescein dUTP). This CEP6 probe marks a control region on the same chromosome as ID4 and is used to normalise ID4 copy number. Breast pathologist Dr. Sandra O’Toole oversaw the FISH quantification for all samples.
Immunofluorescence and proximity ligation assays
Immunofluorescence
Cells were seeded on glass coverslips (Coverglass, 13 mm, VITLAB, Germany). At harvest, media was removed, and cells were washed twice with PBS without salts and fixed in 4% paraformaldehyde (PFA) (ProSciTech) for 10 min. Cells were again washed twice with PBS without salts (Thermo Fisher Scientific) before permeabilising for 15 min with 1% Triton-X (Sigma-Aldrich) in PBS and then blocking with 5% BSA in PBS without salts for 1 h at room temperature. Cells were washed twice with PBS without salts, and antibodies were applied overnight at 4 °C: ID4 (Biocheck BCH-9/82-12, 1:1000), MDC1 (Sigma-Aldrich M2444, 1:1000), BRCA1 (Merck Millipore (Ab-1), MS110, 1:250), γH2AX (Ser139) (Merck Millipore clone JBW301 05-636, 1:300), FLAG (Sigma-Aldrich M2, 1:500) and V5 (Santa Cruz sc-58052, 1:500). Cells were washed twice with PBS without salts; then, secondary antibodies were applied for 1 h at room temperature. Cells were washed twice with PBS without salts, with the second wash containing DAPI (1:500 dilution) and phalloidin (1:1000 dilution) (CytoPainter Phalloidin-iFluor 633 Reagent Abcam ab176758). Cells were then mounted on slides using 4 μL of Prolong Diamond (Thermo Fisher Scientific).
Duolink proximity ligation assay analysis (PLA)
PLA was conducted using Duolink PLA technology with Orange mouse/rabbit probes (Sigma-Aldrich, DUO92102) according to the manufacturer’s instructions. Images were captured using SP8 6000 confocal imaging with 0.4 μm Z-stacks. Maximum projections were made for each image (100–200 cells) and quantified using FIJI (ImageJ) [
51] as described previously [
52]. Quantification was conducted on a minimum of 50 cells. Data is represented as number of interactions (dots) per cell.
Quantification of DNA damage foci
Image quantification was conducted using FIJI v2.0.0 image processing software (Fiji is just ImageJ, available at Fiji.sc, [
51]) as previously described [
52]. Four to five images were taken of each sample. The DAPI channel was supervised to enable accurate gating of cell nuclei for application to other channels. Size selection (pixel size 2000 to 15,000) and circularity (0.30–1.00) cut-offs were used. Cells on the edge of the image were excluded from the analysis. The number of DNA damage foci per cell nucleus was calculated for approximately 100–200 cells. The information for individual samples was then collated and analysed using the Pandas package in Python 3.5.
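The collation step can be sketched with pandas as follows. This is an assumed outline, not the analysis script used in the study: the column names (`sample`, `area`, `circularity`, `on_edge`, `foci`) and the example measurements are hypothetical, while the size and circularity cut-offs are those stated above.

```python
import pandas as pd


def foci_per_nucleus(tables, min_area=2000, max_area=15000, min_circ=0.30):
    """Collate per-image nucleus tables and summarise foci counts per sample."""
    nuclei = pd.concat(tables, ignore_index=True)
    # Apply the size and circularity gates and drop nuclei on the image edge
    keep = (nuclei["area"].between(min_area, max_area)
            & nuclei["circularity"].between(min_circ, 1.00)
            & ~nuclei["on_edge"])
    # Number of nuclei retained and mean foci per nucleus for each sample
    return nuclei[keep].groupby("sample")["foci"].agg(["count", "mean"])


# Hypothetical per-image measurement tables exported from the image analysis
image1 = pd.DataFrame({
    "sample": ["ctrl", "ctrl", "ctrl"],
    "area": [5000, 1500, 9000],          # second nucleus is below the size gate
    "circularity": [0.8, 0.9, 0.7],
    "on_edge": [False, False, True],     # third nucleus touches the image edge
    "foci": [2, 7, 9],
})
image2 = pd.DataFrame({
    "sample": ["IR", "IR"],
    "area": [6000, 12000],
    "circularity": [0.6, 0.5],
    "on_edge": [False, False],
    "foci": [10, 14],
})

summary = foci_per_nucleus([image1, image2])
print(summary)
```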
Clinical cohorts
Basal-like breast cancer
Samples were stratified into groups as follows: 42 BLBC (negative for ERα, PR, HER2 and positive for CK5/6, CK14 or EGFR), 14 triple-negative non-BLBC (negative for ERα, PR, HER2, CK5/6, CK14 and EGFR) and 26 HER2-Enriched (negative for ERα and PR, positive for HER2). BRCA1-mutation status in this cohort is unknown; however, it is expected to occur in approximately 6.5% of BLBC patients [
53]. Samples were obtained under the Garvan Institute ethical approval number HREC 08/145.
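The stratification rule above can be written as a small decision function. The sketch below is illustrative only: it assumes each marker has already been scored as positive (True) or negative (False), and the function and argument names are hypothetical.

```python
def classify(era, pr, her2, ck56, ck14, egfr):
    """Assign a tumour to a cohort group from boolean marker status."""
    if era or pr:
        return None  # hormone receptor-positive tumours fall outside these groups
    if her2:
        return "HER2-Enriched"
    if ck56 or ck14 or egfr:
        return "BLBC"  # triple-negative with at least one basal marker
    return "TN non-BLBC"  # triple-negative and negative for all basal markers


print(classify(False, False, False, True, False, False))   # BLBC
print(classify(False, False, False, False, False, False))  # TN non-BLBC
print(classify(False, False, True, False, False, False))   # HER2-Enriched
```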
Kathleen Cuningham Foundation Consortium for research into familial breast cancer (KConfab)
BRCA1-mutant BLBC was sourced from KConfab. A total of 97 BRCA1-mutant BLBC cases were obtained under the Garvan Institute ethical approval number HREC 08/145.
Ovarian cancer
A total of 97 HGSOC cases were obtained under the Human Research Ethics Committee of the Sydney South East Area Hospital Service Northern Section (00/115) [
54].
Discussion
By elucidating the drivers and dependencies of BLBC, we aim to improve our understanding of this complex, heterogeneous disease, potentially leading to the identification of novel targets and therapeutics. Previous work from our group and others has shown that ID4 acts as a proto-oncogene in BLBC [
11,
14‐
21]. ID4-positive BLBC have a very poor prognosis, and depletion of ID4 reduced BLBC cell line growth in vitro and in vivo [
11], suggesting that ID4 controls essential, yet unknown intrinsic pathways in BLBC.
We have conducted the first systematic mapping of the chromatin interactome of ID4 in mammalian cells. Using ChIP-seq, we have identified novel ID4 binding sites within the BLBC genome. ID4 bound to large regions of chromatin, up to 10 kb in length, at a very small number of loci, suggesting that ID4 is not binding as part of a conventional transcriptional regulatory complex. This conclusion is supported by the observation that ID4 knockdown did not affect gene expression at these loci. The regions identified through ChIP-seq were typically the bodies of genes that are highly transcribed and mutated in cancer [
70,
71], characteristics of fragile, DNA damage-prone sites. Interestingly, these sites primarily encode non-coding RNA, including lncRNA, microRNA precursors and tRNA. The lncRNAs NEAT1 and MALAT1 are among the most abundant cellular RNAs, and the genes encoding them undergo recurrent mutation in breast cancer [
72]. Furthermore, they are upregulated in ovarian tumour cells and are associated with higher tumour grade and stage and with metastases [
73]. The genomic loci encoding tRNAs are also highly transcriptionally active, co-localising with DNaseI hypersensitivity clusters and transcription factor binding sites, including BRCA1 and POLR2A [
74]. High transcriptional activity places stress on the genome and is associated with DNA damage [
70]. Binding of ID4 to these sites was increased upon DNA damage. Together, these data suggest that ID4 binds preferentially to sites of active transcription and DNA damage, consistent with its interaction with MDC1.
We have conducted the most systematic and unbiased proteomic analysis of binding partners for any ID family member. RIME revealed hundreds of ID4 interacting proteins, which were highly enriched for BRCA1-associated proteins. Five novel proteins were found to interact with ID4 with high confidence in all 4 models examined, namely MDC1, ADAM9, HRG, SF3A2 and SYNE3. These warrant further investigation to examine the role they may have in the mechanism of action of ID4. Interestingly, no bHLH proteins were reproducibly found in complex with ID4, although they are the canonical ID binding partners in certain non-transformed cells [
22‐
26]. This is unlikely to be a technical artefact, as ID4-bHLH interactions are readily detected in non-transformed mammary epithelial cells using the same method (H. Holliday et al; Unpublished data). Rather, ID4 may have alternate binding partners as a consequence of downregulation of bHLH proteins in BLBC (H. Holliday et al; Unpublished data).
ID4 has been demonstrated to be a member of a ribonucleoprotein complex along with mutant p53, SRSF1 and lncRNA MALAT1 [
66]. This complex promotes splicing of VEGFA pre-mRNA, which signals in a paracrine manner to macrophages and ultimately results in tumour angiogenesis [
65,
66,
75]. p53 and SRSF1 were not identified in our RIME analysis. This disparity may be the consequence of methodological differences such as fixation, cell lysis conditions and detection techniques. It is possible that ID4 is a member of a large complex encompassing several splicing factors (including SF3A1 and SRSF1). The observation that ID4 interacts with both the MALAT1 gene and the MALAT1 lncRNA itself [
66] suggests that the functions of ID4 and MALAT1 are intimately linked, which warrants further investigation.
MDC1, the most reproducible binding partner of ID4 identified through RIME analysis, is recruited to sites of DNA damage to amplify the phosphorylation of H2AX (to form γH2AX) and recruit downstream signalling proteins [
62]. Deficiency in MDC1, much like BRCA1, results in hypersensitivity to double-stranded DNA breaks [
76]. MDC1 bound many of the chromatin sites occupied by ID4 in a DNA damage-dependent manner. We also found ID4 in close proximity to known MDC1 interactors, including BRCA1 and γH2AX. These data suggest a model in which ID4 associates with the DNA damage repair apparatus at sites of genome instability or damage via its interaction with MDC1. MDC1 was also recently found to associate with ID3, suggesting that this is a conserved feature of ID proteins [
77]. How MDC1 binds ID4 is unknown; however, a quasi HLH domain [
78] structure within MDC1 may enable interaction with the HLH domain of ID4. As complex feedback mechanisms govern the DNA damage response, further investigation is required to determine whether ID4 association with MDC1 ultimately promotes or impedes DNA repair.
BRCA1 has proposed functions as a transcription factor controlling differentiation in non-transformed cells [
74,
79], but a primary role in DNA repair in cancer [
9]. Similarly, ID4 is an important regulator of stemness in the developing mammary gland, acting to inhibit differentiation [
11,
12]. However, the results presented here have uncovered an unexpected role for ID4 in the DNA damage response in BLBC, suggesting a similar dichotomy of function to BRCA1, that is, primarily regulating transcription during development whilst predominantly regulating the DNA damage response in cancer. Transcription is a stressful cellular process causing significant DNA damage and repair [
70]. Thus, a role for transcription factors in localising DNA damage machinery to chromatin may be an important cellular capability.
Mutations in BRCA1 predispose carriers predominantly to cancers of the breast and ovaries (mostly BLBC and HGSOC), though the mechanism driving tumorigenesis in these patients is still unclear. While many BRCA1-associated cancers undergo LOH for the wildtype BRCA1 allele, many acquire other genomic ‘hits’ prior to this event, which may be required for subsequent LOH [
80]. In addition to reporting a biochemical interaction between ID4 and MDC1, we also show a novel genetic interaction with BRCA1, in that
ID4 is amplified at twice the frequency in
BRCA1-mutant BLBC compared to sporadic BLBC, making it one of the most frequently amplified genes in that disease. A caveat of this finding is that other cancer-associated genes, such as
E2F3, are located adjacent to
ID4 at Chr6p22 and so may also be a target of the amplification event.
Further work is required to understand the drivers of ID4 amplification and its contribution to DNA repair; however, at least 2 scenarios are possible. In the first, ID4 acts to suppress DNA repair proficiency and so cooperates with
BRCA1 haploinsufficiency to promote genomic instability and tumourigenesis. This is consistent with the positive correlation between ID4 expression and ‘BRCAness’ [
15], a defect in BRCA1 function in the absence of germline
BRCA1 mutations [
81]. However, perhaps more likely is the opposing scenario, that
ID4 amplification and overexpression promote DNA damage repair, consistent with our observations of ID4 association with MDC1 at fragile sites and the ongoing requirement for ID4 in BLBC proliferation that we previously reported [
11]. In the case of BRCA1-associated breast cancer,
ID4 amplification, like
TP53 mutation, may be permissive for subsequent
BRCA1 LOH which is otherwise lethal [
69], explaining the high frequency of
ID4 amplification in familial cancers. Further resolving the function of ID4 in BLBC will require detailed biochemical analysis using functional DNA repair assays, genetic studies with ID4 knockout cells or animals and access to a large cohort of familial breast cancers with detailed gene copy number, BRCA1 mutation and methylation data.