Introduction
Breast cancer is the second most common malignancy in the world to date [
1]. Classification of this cancer is based on a number of aspects such as tumour progression and pathology, estrogen receptor status and Human Epidermal growth factor Receptor 2 (HER2) status. All of these clinical parameters dictate the most suitable patient treatment.
HER2-positive breast cancer, in which the HER2 receptor is either overexpressed or amplified, is represented in approximately 20–30% of human breast cancers [
2] and has been associated with poorer prognosis [
3,
4]. As with many cancers, a number of treatment options are available for HER2-positive breast cancer. Radiation, surgery and chemotherapy have long been the standard of treatment. However, in recent years a more targeted approach has been taken. Current targeted therapies available for this breast cancer subtype include the monoclonal antibody trastuzumab and the dual tyrosine kinase inhibitor lapatinib. The adverse effects associated with these types of therapies are less severe than those of traditional chemotherapies as they target cancer cells more specifically [5]. Tyrosine kinases are a group of enzymes that play a critical role in the signalling cascades of the cell. The tyrosine kinase functionality of these enzymes is typically coupled to and moderated by ligand binding (receptor) components, and receptor-coupled tyrosine kinases are involved in the phosphorylation of tyrosine residues in targeted proteins. Many important receptor-coupled tyrosine kinases are located in the cell membrane and are activated by the binding of ligands to their extracellular domain. HER2 and EGFR (epidermal growth factor receptor) are two examples of such growth factor receptors, which can homodimerise or dimerise with other members of the Human Epidermal Growth Factor Receptor family, which in turn activates their tyrosine kinase moiety. The activated tyrosine kinases have critical roles in cell signalling processes such as cell proliferation and growth [
6,
7]. Tyrosine kinase inhibitors (TKIs) prevent the activation of these tyrosine kinases thus inhibiting the activation of the pathways that would promote tumour cell growth and proliferation.
In this study, we focused on lapatinib, a dual kinase inhibitor developed by GlaxoSmithKline, which targets both HER2 and EGFR [
8]. By binding to both HER2 and EGFR receptors, lapatinib prevents activation of important pro-cancer pathways such as Erk/MAPK (extracellular-signal-regulated kinase/mitogen-activated protein kinase) and PI3K (phosphatidylinositol 3-kinase), which have vital roles in cell proliferation and survival [
8,
9]. Lapatinib is currently approved for treatment of metastatic breast cancer in combination with capecitabine [
10]. It has also been used in combination with trastuzumab in patients suffering from advanced HER2 positive breast cancer [
11].
Despite the wide application of HER2 testing in breast cancer, a significant proportion of HER2-positive patients do not respond to HER2-targeted therapy. In recent studies of lapatinib as a monotherapy, in combination with capecitabine and in combination with trastuzumab, clinical benefit response rates were 12.4% with lapatinib alone, 22% in combination with capecitabine and 24.7% in combination with trastuzumab [
10,
12,
13]. We have therefore sought to use cellular models to examine and identify the gene expression changes which might be characteristic of response to treatment with lapatinib.
In this paper, we used a multivariate statistical technique called co-inertia analysis (CIA) to link transcription factor binding site (TFBS) target predictions and gene expression data to identify transcription factors (TFs) associated with the cellular response to lapatinib [
14,
15]. This is the first time this data integration technique has been applied to a data set of breast cancer cells responding to drug treatment. The TFBS target predictions have been previously published [
14]. In total this analysis contained TFBS information for 1236 known and predicted TFBSs across the conserved proximal promoters for ~17,000 genes. The gene expression dataset has been described previously [
16] and incorporates time series data post treatment with high and low dose lapatinib in BT474 and SKBR3 cell lines.
From the original analysis [
16] of this time series data, a number of gene expression changes were identified following treatment with lapatinib. These included a number of differentially expressed genes associated with the AKT pathway. This pathway is highly associated with cell proliferation, apoptosis and cell migration. The differentially regulated genes included FOXO3A, CDKN1B, CCND1, AKT1 and E2F3. Of these genes, the authors focused on the expression of FOXO3A and some of its associated targets and regulators such as CDKN1B and CCND1 [
16].
CIA is used to combine two linked datasets (two sets of measurements on the same objects) and perform two simultaneous non-symmetric correspondence analyses (NSC) and identify the axes that are maximally co-variant [
15,
17]. The use of an ordination method such as NSC or principal components analysis (PCA) allows us to summarise the data in a low dimensional space. In this case, the two linked datasets are normalised gene expression data from the lapatinib-treated cell lines and TFBS information for the same genes. We have previously used this method to compare gene expression data with miRNA target information [
18] and proteomics data [
19]. This is the first time that this approach has been used to analyse data derived from breast cancer cells responding to targeted therapy treatment.
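The idea of "axes that are maximally co-variant" can be written compactly. The block below is a simplified sketch, assuming unit row and column weights and column-centred tables; the analysis described here uses NSC, which introduces additional weights that are not shown. The symbols X (genes by samples expression table), Y (genes by motifs TFBS table), u and v are notation introduced for illustration only.

```latex
\begin{aligned}
&X \in \mathbb{R}^{n \times p},\quad Y \in \mathbb{R}^{n \times q}
  \quad \text{(the same $n$ genes as rows, columns centred)} \\
&(u_1, v_1) = \arg\max_{\|u\| = \|v\| = 1} \operatorname{cov}(Xu,\, Yv)
  = \arg\max_{\|u\| = \|v\| = 1} \tfrac{1}{n}\, u^{\top} X^{\top} Y v \\
&X^{\top} Y = U \Sigma V^{\top}
  \;\Rightarrow\; u_1 = U_{\cdot 1},\quad v_1 = V_{\cdot 1},\quad
  \operatorname{cov}(Xu_1,\, Yv_1) = \sigma_1 / n
\end{aligned}
```

Under these assumptions the first pair of co-inertia axes is given by the leading singular vectors of the cross-product matrix, and later axes by the subsequent singular vectors.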
CIA allows us to identify commonality between the expression of the genes and the TFs that are predicted to target these genes. It can be performed both unsupervised and supervised. The unsupervised step allows for data exploration and the identification of interesting trends or splits in the data and the supervised step allows us to identify which TFs are responsible for these splits. The supervised step incorporates the between group analysis (BGA) classification method [
20,
21] which is used in combination with the ordination method, forcing the ordination to be carried out on groups of samples rather than individual samples. First, a normal NSC is performed; BGA then finds the linear combination of the NSC axes that maximizes between-group variance and minimizes within-group variance, for specified groups. The output from this analysis is a ranked list of TFs predicted to be associated with the cellular response to lapatinib.
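As a rough illustration of the supervised step, the sketch below implements a simplified two-group between-group analysis in plain Python. It is not the MADE4 implementation: it uses a PCA-style ordination rather than NSC, and the function name `between_group_axis` and the toy values are hypothetical. With only two groups, the ordination of the group means has a single axis, which is the direction joining the two group centroids.

```python
from math import sqrt


def between_group_axis(samples, labels):
    """Simplified two-group BGA (PCA flavour, unit weights; an assumption).

    samples: list of rows, one per sample, each a list of variable values.
    labels:  group label for each sample (exactly two distinct labels).
    Returns the unit-length discriminating axis and each sample's score on it.
    """
    groups = sorted(set(labels))
    if len(groups) != 2:
        raise ValueError("this sketch handles exactly two groups")
    n_vars = len(samples[0])

    def centroid(group):
        rows = [s for s, g in zip(samples, labels) if g == group]
        return [sum(r[j] for r in rows) / len(rows) for j in range(n_vars)]

    first, second = centroid(groups[0]), centroid(groups[1])
    # The only between-group axis is the difference of the two centroids.
    diff = [b - a for a, b in zip(first, second)]
    norm = sqrt(sum(d * d for d in diff))
    if norm == 0:
        raise ValueError("group centroids coincide")
    axis = [d / norm for d in diff]

    # Project the centred samples onto the axis.
    overall = [sum(s[j] for s in samples) / len(samples) for j in range(n_vars)]
    scores = [
        sum((s[j] - overall[j]) * axis[j] for j in range(n_vars))
        for s in samples
    ]
    return axis, scores


# Hypothetical toy data: two variables, two samples per group.
toy_samples = [[1.0, 5.0], [1.2, 5.1], [3.0, 5.0], [3.2, 5.1]]
toy_labels = ["control", "control", "treated", "treated"]
toy_axis, toy_scores = between_group_axis(toy_samples, toy_labels)
```

Variables with large absolute loadings on the axis are those that best separate the two groups, which is the sense in which a ranked list can be read off the supervised analysis.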
Using this approach, we were able to identify 8 TFs associated with the cellular response to lapatinib. This information was then used to generate a shortlist of 19 genes based on the magnitude of their response to lapatinib, whether they were predicted targets of the 8 TFs, and the involvement of the gene in important oncogenic processes. Genes were manually selected on the basis of meeting two or more of these criteria and as representatives to validate the typically less quantitative array data analyses. The resulting cohort of 27 genes (the 8 TFs and the 19 shortlisted genes) was examined using Taqman RT-PCR in a panel of 6 cell lines that had varying sensitivities to lapatinib. Five genes were significantly differentially expressed across all 6 cell lines (RB1CC1, FOXO3A, NR3C1, ERBB3 and CCND1) and the expression of these 5 genes was directly correlated with the degree of sensitivity of each cell line to lapatinib.
Materials and methods
Gene expression data
The lapatinib-treated cell line dataset and experimental design has been described previously [
16] and was obtained from the corresponding author in the form of raw data files (.cel files). The normalised data file can be downloaded from
http://www.ebi.ac.uk/arrayexpress (accession number: E-MEXP-440). Gene expression values were called using the robust multichip average method [
22] and data were quantile normalized using the Bioconductor package affy. Affymetrix human genome HG-U133A arrays containing >22,000 probesets were used in this experiment. Briefly, the experimental design was as follows: four cell lines (BT474, SKBR3, T47D and MDAMB468) were analysed at 2, 6 and 12 hours post treatment with 0.1% DMSO (the control), 0.1 μM lapatinib and 1.0 μM lapatinib, with four replicates for each time point/treatment. In addition, 0.1% DMSO-treated cells were arrayed at 0 and 24 hours and 0.1 μM lapatinib-treated cells were arrayed at 24 hours, again in quadruplicate. In total, there were 48 arrays for each cell line. Our analysis focused on the two lapatinib-sensitive cell lines, BT474 and SKBR3, comprising a total of 96 arrays (including controls).
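The array counts quoted above can be checked by enumerating the design. The following is a small bookkeeping sketch in Python; the variable names are illustrative and the treatment labels are written in ASCII ("uM" for μM).

```python
from itertools import product

REPLICATES = 4
treatments = ["0.1% DMSO", "0.1 uM lapatinib", "1.0 uM lapatinib"]
core_hours = [2, 6, 12]

# Every treatment was arrayed at 2, 6 and 12 hours.
design = set(product(treatments, core_hours))

# Additional conditions: DMSO at 0 and 24 hours, low dose lapatinib at 24 hours.
design |= {("0.1% DMSO", 0), ("0.1% DMSO", 24), ("0.1 uM lapatinib", 24)}

arrays_per_cell_line = len(design) * REPLICATES   # 12 conditions x 4 replicates
arrays_analysed = arrays_per_cell_line * 2        # BT474 and SKBR3 only
dmso_arrays = REPLICATES * sum(1 for t, _ in design if t == "0.1% DMSO")
```

The DMSO count (five time points in quadruplicate) is the same figure that appears as the control row in each comparison that includes the controls.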
Differential gene expression lists were generated using the eBayes function of the limma [23] package from Bioconductor. A fold change of ≥ 1.3 and an adjusted p-value of ≤ 0.05 were considered significant. The p-values were adjusted using the Benjamini and Hochberg method [
24]. The choices of comparisons within the datasets were guided by the unsupervised CIA. In total there were 6 comparisons and these are summarised in Table
1. The final gene list was determined by consistent overlap between these 6 comparisons.
Table 1
A breakdown of the 6 comparisons for BT474 and SKBR3
Comparison | Cell line | Group | Treatment | Time points | Number of arrays |
1 | BT474 | Group 1 | 1 μM lapatinib | 6 hr & 12 hr | 8 |
| | Group 2 | 0.1 μM lapatinib | 2 hr & 6 hr & 12 hr & 24 hr | 16 |
| | | 1 μM lapatinib | 2 hr | 4 |
| | | 0.1% DMSO | 0 hr & 2 hr & 6 hr & 12 hr & 24 hr | 20 |
| | | | Total | 48 |
2 | BT474 | Group 1 | 1 μM lapatinib | 6 hr & 12 hr | 8 |
| | Group 2 | 0.1 μM lapatinib | 6 hr & 12 hr | 8 |
| | | | Total | 16 |
3 | SKBR3 | Group 1 | 1 μM lapatinib | 6 hr & 12 hr | 8 |
| | | 0.1 μM lapatinib | 6 hr & 12 hr | 8 |
| | Group 2 | 0.1 μM lapatinib | 2 hr & 24 hr | 8 |
| | | 1 μM lapatinib | 2 hr | 4 |
| | | 0.1% DMSO | 0 hr & 2 hr & 6 hr & 12 hr & 24 hr | 20 |
| | | | Total | 48 |
4 | SKBR3 | Group 1 | 1 μM lapatinib | 6 hr & 12 hr | 8 |
| | | 0.1 μM lapatinib | 6 hr & 12 hr | 8 |
| | Group 2 | 0.1 μM lapatinib | 2 hr | 4 |
| | | 1 μM lapatinib | 2 hr | 4 |
| | | 0.1% DMSO | 0 hr & 2 hr & 6 hr & 12 hr & 24 hr | 20 |
| | | | Total | 44 |
5 | SKBR3 | Group 1 | 1 μM lapatinib | 12 hr | 4 |
| | | 0.1 μM lapatinib | 12 hr | 4 |
| | Group 2 | 0.1 μM lapatinib | 2 hr & 6 hr & 24 hr | 12 |
| | | 1 μM lapatinib | 2 hr & 6 hr | 8 |
| | | 0.1% DMSO | 0 hr & 2 hr & 6 hr & 12 hr & 24 hr | 20 |
| | | | Total | 48 |
6 | SKBR3 | Group 1 | 1 μM lapatinib | 12 hr | 4 |
| | | 0.1 μM lapatinib | 12 hr | 4 |
| | Group 2 | 0.1 μM lapatinib | 2 hr & 6 hr | 8 |
| | | 1 μM lapatinib | 2 hr & 6 hr | 8 |
| | | 0.1% DMSO | 0 hr & 2 hr & 6 hr & 12 hr & 24 hr | 20 |
| | | | Total | 44 |
The validity of choosing these six comparisons was confirmed by differential expression analysis, which showed that the early response in both BT474 and SKBR3 cells and low dose lapatinib in BT474 cells yield few or no lapatinib-responsive genes. As above, the Bioconductor package limma was used, and a fold change of ≥ 1.3 and an adjusted p-value of ≤ 0.05 were considered significant.
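The significance filter can be sketched as follows. This is a minimal Python illustration of the Benjamini and Hochberg adjustment and the two cut-offs only; it does not reproduce limma's moderated statistics. It assumes that fold changes are supplied on the log2 scale (as is usual for RMA-processed data) and that the 1.3-fold threshold applies in either direction; the function names and example values are hypothetical.

```python
from math import log2


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_genes(genes, log2_fold_changes, p_values,
                      min_fold=1.3, max_adj_p=0.05):
    """Keep genes passing both the fold change and adjusted p-value cut-offs."""
    adjusted = benjamini_hochberg(p_values)
    cutoff = log2(min_fold)  # a 1.3-fold change expressed on the log2 scale
    return [
        gene
        for gene, lfc, q in zip(genes, log2_fold_changes, adjusted)
        if abs(lfc) >= cutoff and q <= max_adj_p
    ]


# Hypothetical example values, not taken from the dataset.
example_hits = significant_genes(
    ["A", "B", "C", "D"],
    [0.5, 0.1, -0.6, 0.2],
    [0.01, 0.04, 0.03, 0.005],
)
```

In this toy example genes A and C pass both thresholds, while B and D fail the fold change cut-off despite small p-values.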
Co-inertia analysis
CIA, a multivariate coupling technique, was used in an unsupervised manner to combine the two linked datasets: gene expression data from lapatinib-treated BT474 and SKBR3 cell lines and predicted TFBS information for the same genes. This initial step was used for data exploration and uses NSC. The analysis was then rerun in a supervised manner using BGA [
14]. The output from this analysis is a ranked list of TFs predicted to be associated with the cellular response to lapatinib. The same 6 comparisons used to generate the differentially expressed gene list were used to generate 6 ranked lists of TFs. The final TF list was determined by overlap between these 6 ranked lists. All calculations were carried out using the MADE4 library [
25] for the open source R statistical environment. MADE4 can be downloaded freely from the Bioconductor web site
http://www.bioconductor.org. All the scripts and datasets used are available upon request from the authors.
Transcription factor binding site information
The TFBS data has been previously published and contains information for 1236 known and predicted TFBSs across the conserved proximal promoters for ~17,000 genes at four different position specific scoring matrix (PSSM) thresholds, 0.7, 0.75, 0.8 and 0.85, giving 4 gene/TFBS frequency tables [
14]. Using BGA with CIA, we were able to combine this information with gene expression data to give 4 ranked lists of TFBSs associated with a particular split of interest within the data; in this case, TFBSs associated with the cellular response to lapatinib, from which we can infer the TFs linked with this response. The four lists were combined using the Rank Products method [
26] which was initially developed for combining lists of differentially expressed genes. This gives one final list of ranked TFs.
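The combination of the four threshold-specific lists can be illustrated with the core rank product statistic, the geometric mean of an item's ranks across lists. The sketch below shows only that statistic; the published Rank Products method also estimates significance by permutation, which is omitted here. The function name and motif labels are hypothetical.

```python
from math import prod


def rank_product(rank_lists):
    """Order items by the geometric mean of their ranks across lists.

    rank_lists: list of dicts mapping item -> rank (1 = most strongly
    associated). Only items present in every list are scored.
    """
    items = set(rank_lists[0])
    for ranks in rank_lists[1:]:
        items &= set(ranks)
    k = len(rank_lists)
    scores = {
        item: prod(ranks[item] for ranks in rank_lists) ** (1.0 / k)
        for item in items
    }
    # Smaller rank product means more consistently near the top.
    return sorted(scores, key=lambda item: (scores[item], item))


# Hypothetical ranks for three motifs at the four PSSM thresholds.
per_threshold = [
    {"M1": 1, "M2": 2, "M3": 3},   # threshold 0.7
    {"M1": 2, "M2": 1, "M3": 3},   # threshold 0.75
    {"M1": 1, "M2": 3, "M3": 2},   # threshold 0.8
    {"M1": 1, "M2": 2, "M3": 3},   # threshold 0.85
]
combined = rank_product(per_threshold)
```

A motif that ranks highly at every threshold keeps a small rank product, whereas one that is top-ranked at a single threshold only is pushed down the combined list.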
Statistical overrepresentation of TFBS
The TFs identified from the supervised CIA were validated using statistical overrepresentation of their predicted target genes within the differentially expressed gene list. A one-tailed Fisher's exact test was used as we are specifically interested in overrepresentation only [
27,
28]. The 421 consistently differentially expressed genes and the 8252 genes for which promoter information was available and which were present on the U133A arrays acted as the foreground and background for the Fisher's exact test, respectively. The TFBS information is described in the previous section.
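A one-tailed test for overrepresentation amounts to the upper tail of a hypergeometric distribution. The sketch below is a minimal illustration in Python, assuming that the 421 foreground genes are a subset of the 8252 background genes; the function name and the target counts in the example (600 and 60) are hypothetical and are not results from this study.

```python
from math import comb


def fisher_overrepresentation_p(k, n, K, N):
    """One-tailed Fisher's exact p-value, P(X >= k).

    N: number of background genes
    K: background genes carrying the predicted binding site
    n: number of foreground (differentially expressed) genes
    k: foreground genes carrying the predicted binding site
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the hypergeometric probabilities for k, k + 1, ..., upper.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts using the foreground and background sizes quoted above.
p_example = fisher_overrepresentation_p(k=60, n=421, K=600, N=8252)
```

With these made-up counts about 31 target genes would be expected in the foreground by chance, so observing 60 gives a small p-value.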
Cell culture
SKBR3, HCC1954, EFM192A, MDAMB453 and MDAMB231 breast cancer cell lines were maintained in RPMI 1640 medium supplemented with 10% fetal bovine serum (PAA Labs, Austria). BT474 cells were maintained in Dulbecco's Modified Eagle's medium (DMEM) supplemented with 10% fetal bovine serum, 2% L-Glutamine (Sigma, St Louis, MO, USA) and 1% Sodium Pyruvate (Sigma). All cell lines were kept at 37 °C in a 5% CO2/95% air humidified incubator.
Lapatinib treatment and RNA extraction
Triplicate samples were grown to approximately 75% confluency. Treated samples were conditioned with 1 μM lapatinib for 12 hours. Control samples remained untreated. After the 12 hour incubation, RNA was isolated from the control and treated samples using a Qiagen RNeasy mini Kit (Qiagen, Hilden, Germany) according to the manufacturer’s protocol and treated with Qiagen RNase-free DNase. cDNA template was then prepared from 2 μg of total RNA using an Applied Biosystems high capacity RNA to cDNA kit (Applied Biosystems, Foster City, CA, USA).
Taqman RT PCR
TaqMan gene expression experiments were performed in 10 μl reactions in Taqman Array 96 well fast plates which had been pre-seeded with assays for the genes of interest. 40 ng of cDNA template and 5 μl of Taqman fast Universal Master Mix (2x), no AmpErase UNG (Applied Biosystems, Foster City, CA, USA) were dispensed into each well. The following thermal cycling specifications were performed on the ABI 7900 Fast Real-Time PCR system (Applied Biosystems, Foster City, CA, USA); 20 s at 95 °C and 40 cycles each for 3 s at 95 °C and 30 s at 60 °C. Expression values were calculated using the comparative threshold cycle ($C_t$) method [29]. Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) was selected as the endogenous control. The threshold cycle ($C_t$) indicates the cycle number by which the amount of amplified target reaches a fixed threshold. The $C_t$ data for GAPDH was used to create $\Delta C_t$ values [$\Delta C_t = C_t(\text{target gene}) - C_t(\text{GAPDH})$]. $\Delta\Delta C_t$ values were calculated by subtracting $\Delta C_t$ of the calibrator (untreated controls) from the $\Delta C_t$ value of each target. Relative quantification (RQ) values were calculated using the equation $2^{-\Delta\Delta C_t}$. Genes with a fold change ± 2 in the BT474 and SKBR3 cell lines were deemed to be differentially expressed.
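The comparative threshold cycle calculation can be followed step by step in a few lines of Python. This is a sketch of the arithmetic only; the function names and Ct values are hypothetical, and reading "fold change ± 2" as at least 2-fold up or at least 2-fold down is an assumption.

```python
def relative_quantification(ct_target_treated, ct_gapdh_treated,
                            ct_target_control, ct_gapdh_control):
    """Return RQ = 2 ** (-ddCt) for one gene, with GAPDH as endogenous control."""
    delta_treated = ct_target_treated - ct_gapdh_treated    # dCt, treated sample
    delta_control = ct_target_control - ct_gapdh_control    # dCt, calibrator
    delta_delta = delta_treated - delta_control              # ddCt
    return 2.0 ** (-delta_delta)


def is_differentially_expressed(rq, fold=2.0):
    """Assumed reading of the threshold: >= 2-fold in either direction."""
    return rq >= fold or rq <= 1.0 / fold


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# after treatment, while GAPDH is unchanged.
rq_up = relative_quantification(24.0, 18.0, 26.0, 18.0)
rq_down = relative_quantification(26.0, 18.0, 24.0, 18.0)
```

A difference of two cycles corresponds to a four-fold change, so the first example gives an RQ of 4 and the second an RQ of 0.25.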
Proliferation assay in vitro
Cells were cultured in 96 well flat bottomed plates for 24 h before they were exposed to a range of concentrations of lapatinib for 6 days (0–20 μM for the insensitive cell lines and 0–1.5 μM for the sensitive cell lines). The % cell survival was then determined using an Acid Phosphatase assay. Media was removed from plates, the wells were washed twice with PBS and the cells were exposed to 10 mM PNP substrate in 0.1 M sodium acetate for approximately 1 hour. The reaction was stopped using 1 M NaOH and the plates were read at 405 nm and 620 nm on the plate reader (Synergy HT, Bio-Tek). The % cell survival was calculated as a percentage of non-treated controls.
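The survival calculation can be sketched as below. The text does not state how the 620 nm reading was used; this example assumes it serves as a reference wavelength that is subtracted from the 405 nm reading, which is a common convention but an assumption here. The function name and absorbance values are hypothetical.

```python
from statistics import mean


def percent_survival(treated_wells, control_wells):
    """Percent survival relative to non-treated controls.

    Each well is an (a405, a620) tuple of absorbance readings.
    Assumption: the 620 nm reading is a reference that is subtracted.
    """
    def corrected(wells):
        return mean(a405 - a620 for a405, a620 in wells)

    control = corrected(control_wells)
    if control <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * corrected(treated_wells) / control


# Hypothetical readings for one lapatinib concentration.
survival = percent_survival(
    treated_wells=[(0.60, 0.10), (0.70, 0.10)],
    control_wells=[(1.20, 0.10), (1.00, 0.10)],
)
```

Repeating this for each concentration in the tested range gives the dose-response values from which the relative sensitivity of the cell lines is compared.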
Discussion
In this paper, we describe the application of a method (CIA) for inferring the action of TFs by integrating the information provided by TFBS target prediction with mRNA gene expression data [
14] to identify possible markers for early lapatinib response. This is the first time this approach has been used to analyse an array data set derived from breast cancer cells treated with a targeted therapeutic. This multivariate statistical technique was applied to gene expression data incorporating time series data post treatment with high and low dose lapatinib in lapatinib-sensitive, HER2-positive cell lines (48 microarrays for each of the BT474 and SKBR3 cell lines). The method was initially used for data exploration to determine the gene expression response to lapatinib. This response appears to require a high dose of lapatinib in BT474 cells (1 μM lapatinib) but occurs at either low or high dose in SKBR3 cells (0.1 μM or 1 μM lapatinib). Differential gene expression analysis at early time points or low dose lapatinib confirmed this, as we were unable to identify a substantial gene list at low dose lapatinib in BT474 cells or at the 2 hr time point in either cell line, providing a strong validation of our approach. Once the lapatinib response was determined, CIA was used in a supervised manner to identify 8 TFs associated with response to lapatinib. It is important to note that none of these TFs were associated with the lapatinib response through standard differential expression analysis and their prioritisation here was only achieved via the novel use of the CIA method in this breast cancer dataset. Statistical overrepresentation of these TFs in the promoters of the 421 differentially regulated genes was used to further confirm the validity of the approach we used here. Four of the 6 motifs (representing the 8 TFs) were statistically overrepresented in the lapatinib responsive gene list (PAX9, PAX3, Ahr/ARNT and ZNF143). While OLF-1 and RAR/RXR were not statistically overrepresented within this gene list, this is not unexpected, as CIA is not restricted to a specific gene list but rather uses the entire microarray data as input. CIA is therefore not limited by arbitrary cut-offs which may exclude important TFs of interest. Overall the target genes of the TFs identified by CIA show higher than expected modulation by lapatinib, even though the TFs themselves are not differentially regulated.
These 8 TFs and an additional 19 putative markers were then validated using qPCR in a panel of breast cancer cell lines following treatment with 1 μM lapatinib for 12 hours. The results suggest that the 5 genes RB1CC1, NR3C1, FOXO3A, ERBB3 and CCND1, which had been found to be differentially regulated in response to lapatinib treatment, could be utilised as potential markers for early lapatinib response as their expression correlates with the sensitivity of the cell lines to lapatinib.
The expression of 6 TFs, AHR, ARNT, RXR, RAR, PAX9 and ZNF143, was found to be altered across all the cell lines in response to lapatinib treatment. These TFs are putative regulators of the cellular response to lapatinib and are predicted to target a number of the significantly differentially regulated genes. The expression of these TFs does not follow a set pattern but does follow some distinct trends, as mentioned above. However, the regulation of gene expression by TFs is difficult to discern directly from the expression pattern of the TF genes themselves. All of these TFs have been previously demonstrated to play important roles in cancer, although their function in HER2-positive breast cancer is unclear. The AHR/ARNT heterodimer has been implicated as having importance in ER positive breast cancer and has been shown to directly associate with estrogen receptors ER-alpha and ER-beta [
30,
32,
33]. Retinoids targeting the RXR/RAR heterodimer have marked effects on cellular processes such as proliferation and apoptosis and this has been shown both in vivo and in vitro in breast cancer models [
34]. The RARA receptor has also been recently identified as being co-amplified with HER2 in some breast cancers [
35]. While being known oncogenes, PAX9 and ZNF143 have not been extensively studied in breast cancer [
36,
37] and none of these TFs have previously been implicated in the response to lapatinib.
From the panel of 5 genes, 4 were upregulated in response to lapatinib: RB1CC1, NR3C1, FOXO3A and ERBB3. The expression of these genes correlated with the sensitivity of each cell line to lapatinib. The results show that the more sensitive the cell line is to lapatinib, as determined using proliferation assays, the greater the magnitude of up-regulation of the 4 genes. The genes then “switch” to down-regulation in the remaining two lapatinib insensitive cell lines (MDAMB453 and MDAMB231). In the case of CCND1, this switching phenomenon is not evident; rather the expression of CCND1 becomes less down-regulated as the level of lapatinib sensitivity decreases.
All 5 of the genes have been previously demonstrated to have importance in cancer. RB1 inducible coiled-coil 1 (RB1CC1) expression has been shown to be associated with long term survival of breast cancer patients and has been found to have a role in the inhibition of G1-S progression and proliferation in breast cancer cell lines [
38,
39]. NR3C1, a glucocorticoid receptor, has been associated with poor response to treatment in multiple myeloma samples [
40]. Up-regulation of ERBB3 (HER3) has been connected with invasive breast carcinomas and also drug resistance in some HER2-overexpressing cancers [
41].
FOXO3A and CCND1 have been demonstrated to be important in both breast cancer and the lapatinib response [
16,
42]. FOXO3A and CCND1 were both shown by [16] to be differentially expressed following treatment with lapatinib. This group reported up-regulation of FOXO3A in both BT474 and SKBR3 and also a down-regulation of CCND1 in the same cell lines. These results are consistent with the results obtained by our study. It should be noted that CDKN1B was also differentially expressed in response to lapatinib in our study [16], although its dysregulation did not correlate with lapatinib sensitivity (Additional file 6: Figure S1). The authors identified that these three genes all played roles in the regulation of the AKT pathway, both positive and negative. They noted that the down-regulation of CCND1 and the up-regulation of CDKN1B in response to lapatinib could be the result of a FOXO3A-dependent mechanism, which promotes lapatinib-induced apoptosis. However, they did not examine the expression of these genes in other lapatinib-sensitive cell lines, nor did they observe that the expression of these genes correlated with the sensitivity of the cell lines to lapatinib. They also observed additional changes in genes associated with a number of cellular processes such as glycolysis and cell cycle regulation.
Interestingly, CCND1 links all of these genes together both at the TF level (it is predicted to be targeted by AHR/ARNT, RXR/RAR and PAX9) and at the gene level via several interactions. CCND1 was downregulated in response to lapatinib in our panel of cell lines which is consistent with previous studies [
43‐
45] and which may also be related to its known interactions with our other genes of interest. FOXO3A has been shown to down-regulate CCND1 during cell cycle inhibition [
46], while ERBB3 receptors are thought to be required to reduce transcription of E2F mediated transcription factors, which include CCND1 [
47]. NR3C1 has been noted to inhibit CCND1 activation through the TCF/β-catenin complex [
48,
49] and RB1CC1 has been found to decrease the expression of CCND1 by promoting its degradation [
39]. Also, AHR/ARNT has been shown to regulate cell cycle progression via a functional interaction with CDK4/CCND1 [
32] and retinoids (RXR/RAR receptor ligands) are known to inhibit CCND1 expression [
50].
Of the 5 lapatinib-responsive genes, FOXO3A and CCND1 were previously described in lapatinib-treated BT474 and SKBR3 cell lines by the group that generated the original microarray dataset [
16]. However, the inclusion of the additional 4 cell lines allowed us to examine the 5 differentially expressed genes in the context of cell lines with varying sensitivities to lapatinib. The upregulation of RB1CC1 and NR3C1 in response to lapatinib has not been previously observed, while only limited work has been performed on ERBB3, FOXO3A and CCND1 in this setting. While the analysis described in this work is of a descriptive nature, a number of these genes, including FOXO3A (Hu et al.) and ERBB3 (Liu et al.), have been successfully functionally validated as being important in breast cancer response [51, 52].
The methods we have employed represent an attractive approach to dissecting the underlying gene expression changes associated with the response of cellular models of breast cancer (with differing inherent sensitivity) to lapatinib treatment. Our experimental design generated a list of gene changes that directly correlate with response to lapatinib in breast cancer. Since the list is highly enriched for genes likely to be of importance in lapatinib response, our findings represent interesting candidates as biomarkers of response or functional targets for therapeutic intervention to improve response or overcome resistance.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
FON and SFM contributed equally to this work. SFM performed all of the bioinformatic/statistical analysis. FON treated the cells with lapatinib, extracted the RNA and performed Taqman RT PCR and the proliferation assay. STA participated in the study design, RNA extraction and TaqMan RT PCR and analysis and interpretation of the results. FON, SFM, STA, JC, MC, ROC and PD contributed to the result interpretation and manuscript preparation. ROC and PD equally conceived the study, participated in its design, coordination and interpretation of the results and finalized the manuscript. All authors read and approved the final manuscript.