Background
Oral squamous cell carcinoma (OSCC) is the sixth most common cancer worldwide. The presence of lymph node metastasis is associated with a 50% decrease in 5-year survival and is the single most important prognostic factor identified to date [1-4]. However, the mechanisms by which OSCC cells spread from the primary site to local lymph nodes are not well understood. Transcriptome profiling has been used to gain insights into this process [1,5-7], but the function of many of the proposed differentially expressed transcripts is unknown. To improve the likelihood of finding genes driving the carcinogenic process, several groups have exploited the common feature of genomic instability in cancer [8] and identified genes whose expression is correlated with the corresponding DNA copy number in tumors of the brain, breast, ovary, and liver, as well as in multiple myeloma and melanoma [9-18].
In an attempt to identify novel driver genes responsible for OSCC metastasis, we used a protocol recently developed by our group for high-throughput profiling of DNA and RNA from the same cell population obtained by laser capture microdissection (LCM) to determine the association between DNA copy number aberration (CNA) and gene expression in tumor cells isolated from metastatic lymph nodes. We reasoned that these cells would contain the changes in the genome and transcriptome that are essential to the lymphotropism of OSCC. In addition, because nodal metastases are associated with poor prognosis, we tested the hypothesis that the expression of copy number-associated genes from metastatic OSCC tumor cells is associated with survival.
Discussion
In this study, we analyzed the relationship between DNA copy number and gene expression in tumor cells from metastatic lymph nodes of 20 OSCC patients to: 1) identify the genes showing a significant correlation between DNA copy number and gene expression, and 2) determine which, if any, copy number-associated genes from metastatic OSCC were associated with OSCC status and survival. We used the same cells to isolate and amplify DNA and RNA for high-throughput profiling, whereas in previous studies DNA and RNA were interrogated either from biopsy samples or, if tumor-cell specific, from two different laser-microdissected cell populations. To our knowledge, this is the first time the genome and transcriptome of a solid tumor have been profiled and integrated using the genetic materials from the same cancer cell population enriched by LCM. This is also the first integrative analysis of DNA copy number and gene expression of tumor cells from metastatic OSCC lymph nodes.
There are a number of unresolved issues when interpreting data that integrate genome-wide DNA copy number and gene expression profiles. Study design, statistical analysis, and specimen quality can all affect how inferences are drawn between DNA copy number and gene expression. An early study in breast cancer indicated that at least 12% of all the variance in gene expression could be directly attributed to the underlying variation in DNA copy number [11]. As the authors pointed out, however, that was a significant underestimate because DNA was obtained from tissue biopsies with mixed tumor and stromal cells, reducing the ability to detect tumor-associated CNA. In a more recent study of non-small-cell lung carcinoma, 42% of the genes were classified as DNA copy number-driven [23]. However, the study did not enrich for tumor cells and dichotomized genome CNA as loss and gain only, thereby limiting the assessment of the impact of DNA copy number on expression. Another breast cancer study that enriched for tumor cells via microdissection detected a significant correlation between copy number and gene expression for 46.7% of genes. However, that study did not appear to have adjusted for multiple comparisons, which may lead to overestimation of the impact of copy number on gene expression [24]. Regardless of these limitations, data from these studies and ours indicate that, in general, increased DNA copy number is associated with increased expression. A critical issue is how to distinguish the copy number-associated genes that are true "drivers" of carcinogenesis from those without functional relevance. We believe that determining which copy number-associated transcripts are associated with clinical characteristics and survival could be one approach to prioritizing these transcripts further.
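To illustrate why adjustment for multiple comparisons matters when counting copy number-associated genes, the following is a minimal sketch of the Benjamini-Hochberg false discovery rate procedure. It is not the analysis used in this study or in the cited studies; the function name `benjamini_hochberg`, the 0.05 threshold, and the ten p-values are hypothetical and chosen only for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values for the copy number/expression correlation.
raw_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
adj_p = benjamini_hochberg(raw_p)

# Count genes called significant before and after adjustment (assumed cutoff 0.05).
n_raw = sum(p < 0.05 for p in raw_p)
n_adj = sum(p < 0.05 for p in adj_p)
print(f"significant without adjustment: {n_raw}; after adjustment: {n_adj}")
```

In this toy example fewer genes pass the threshold after adjustment than before, which is the direction of bias described above for analyses that report unadjusted correlations.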
One interesting finding is the complexity of the relationship between DNA copy number and gene expression within individual CNA regions. In the same DNA region, it was not unusual to find a mix of genes whose expression did and did not correlate with CNA. For instance, six genes and an unknown transcript located in the high-correlation 11q13.2-q13.3 region showed a strong positive correlation between DNA copy number and gene expression, whereas the expression of five other genes located in the same region could not be explained by their CNA. Previous studies have linked five of the six copy number-associated genes in the 11q13.2-q13.3 region to cancer progression, including: 1) genome amplification and overexpression of ORAOV1 and PPFIA1, associated with aggressive phenotypes in OSCC cell lines [20,25]; 2) expression of the apoptotic signal mediator FADD, associated with nodal metastasis in non-small cell lung cancer [26] and poor survival in laryngeal carcinoma [27]; and 3) expression of CPT1A and MRPL21, associated with the development of colon cancer and breast cancer, respectively [28,29]. Therefore, at least some associations between DNA copy number and gene expression can be shown to have clinical or pathogenetic relevance, as opposed to only representing a DNA dosage effect.
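The mixed pattern within a single CNA region can be illustrated with a per-gene correlation between copy number and expression across samples. The sketch below assumes a Pearson correlation, which may differ from the statistic used in this study; the gene labels `gene_A` and `gene_B`, the six samples, and all numeric values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical log2 copy number ratios for one amplified region in six tumors.
copy_number = [0.0, 0.2, 0.5, 0.9, 1.2, 1.5]

# Hypothetical log2 expression values for two genes located in that region.
expression = {
    "gene_A": [5.1, 5.3, 5.9, 6.4, 6.8, 7.3],  # rises with copy number
    "gene_B": [6.0, 6.4, 5.8, 6.1, 6.3, 5.9],  # no clear trend
}

# One correlation per gene, all against the same regional copy number.
r_by_gene = {gene: pearson_r(copy_number, values)
             for gene, values in expression.items()}
for gene, r in r_by_gene.items():
    print(f"{gene}: r = {r:.2f}")
```

Here both hypothetical genes share the same copy number profile, yet only one shows expression that tracks it, mirroring the situation described for the 11q13.2-q13.3 region.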
Many of the candidate genes identified in this study have mechanistic relevance to how cancer cells migrate from the primary organ to distant sites and survive both during this process and in their new environment. For example, of the 116 OSCC-specific genes, at least 20 are involved in either the cell cycle or cell death. Similarly, of the 24 genes associated with survival, 6 are involved in cell death and 3 in cell migration. The expression changes of these genes, which our study suggests occur as a result of copy number alterations, may enhance the viability and migration capability of the cancer cells, thus assisting in the progression and spread of OSCC. Further investigation of these genes to confirm their roles in OSCC metastasis is warranted.
Although genes with CNA-associated expression were located on all chromosomal arms studied, several clusters of such genes were observed, e.g., in the 11q13 region described above and in other regions such as 7q22.1 and 9p24.1 (see Figure 2). A single CNA event in these regions would likely cause the dysregulated expression of multiple genes. It is generally believed that recurrent genome CNA regions harbour genes that are essential to cancer progression, but many current functional studies focus on identifying the single "key driver gene" in the region of CNA responsible for cancer progression. In a recent study on liver cancer, however, the over-expression of both cIAP1 and Yap, as a result of genome amplification at mouse chromosome 9qA1, cooperatively promoted tumorigenesis [16]. In another study, of glioblastoma, chromosome 12q13.3-14.1 amplification caused over-expression of genes (CDK4 and CENTG1) and a microRNA (hsa-miR-26a), all of which were demonstrated to contribute to the progression of the cancer [30]. Thus, CNA in these regions may be an efficient mechanism for cancer cells to obtain functional benefits from multiple genes (and even microRNAs) to achieve a specific aberrant capability. Likewise, amplification at 11q13 is one of the most frequently observed CNA events in almost all solid tumors [31]. This single CNA event would likely cause the over-expression of genes such as ORAOV1, PPFIA1, and FADD, which, as mentioned above, would synergize to promote OSCC progression. For this reason, genes with copy number-associated expression in the regions of CNA reported in this study should be considered together, as concurrent dysregulation of these genes could contribute to the lymphotropic phenotype of these metastatic tumor cells.
Among the CNA-associated genes found in tumor cells isolated from nodal metastases, we identified subsets that were significantly associated with OSCC status and survival using expression profiles of biopsy samples from primary tumors that were not laser-microdissected. This suggests that the signal from our copy number-associated genes was robust enough to overcome "noise" from the expression signal of bystander cells. Moreover, the fact that the two survival models incorporating the gene expression of the '122- or 27-transcript PC' resulted in higher AUCs than a model with stage alone suggests that at least some of these copy number-associated transcripts may have prognostic relevance and could help explain some of the outcome variation among tumors of the same stage. Thus, although some genes irrelevant to the cancer process may be differentially expressed due to a dosage effect of CNA, our data demonstrate that integration of genome CNA with expression in tumor cells with an aggressive phenotype (i.e., lymphotropism) could uncover candidate biomarkers or therapeutic targets. However, larger-scale integrative analysis with more patient tissue samples is needed for further validation and for new discoveries of DNA copy number-associated genes. In addition, functional studies are warranted to assess the biological roles of these DNA copy number-driven genes in OSCC progression.
Acknowledgements
This work was supported in part by grants 5KL2RR025015-03 from the National Center for Research Resources, National Institutes of Health (NIH); Amos Medical Faculty Development Program Award from The Robert Wood Johnson Foundation; Early Physician-Scientist Career Development Award from the Howard Hughes Medical Institute; R01CA095419 from the National Cancer Institute, NIH; Small Grants Translational Research Projects Award from the Institute of Translational Health Sciences, University of Washington supported by grant UL1RR025014 from the National Center for Research Resources, NIH; center funds from the Department of Otolaryngology - Head and Neck Surgery, University of Washington; and by resources from and use of facilities at the VA Puget Sound Health Care System, Fred Hutchinson Cancer Research Center, University of Washington Medical Center and Harborview Medical Center, Seattle, Washington.
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
CX carried out sample processing; laser-capture microdissection; genome profiling; transcriptome profiling; Ingenuity Pathway Analysis; participated in data analysis design; and drafted the manuscript. YL carried out statistical analysis of estimating DNA copy number and its association with gene expression; identified associations between copy number-associated genes and OSCC status and clinical outcomes; and revised the manuscript. PW gave guidance on statistical methodologies and tools; and provided insightful explanation of the analytical results and major revisions to the manuscript. WF carried out the gene expression analysis; principal component analysis; and participated in data analysis design and revision of the manuscript. TCR carried out survival prediction modelling and participated in revision of the manuscript. MPU provided pathological support and participated in revision of the manuscript. JRH performed sample management and sample processing; and participated in revision of the manuscript. PL carried out the extraction of clinical information and participated in revision of the manuscript. DRD prepared datasets and conducted statistical analyses of clinical and outcomes data. NDF carried out patient recruitment and sample collection; and participated in overall design. LPZ participated in overall design and provided statistical guidance to WF for analysis of the gene expression data. SMS provided epidemiological support and guidance to DRD; participated in overall design and planning of the data analysis; and provided major revisions to the manuscript.
CC, as the principal investigator of the parent study (NIH R01CA095419-06A1), provided the biospecimens and the relevant study participant data that were obtained through in-person baseline and follow-up interviews and medical record abstraction; provided guidance to PL in medical chart abstraction of clinical information; provided guidance to JRH to carry out sample management and processing; participated in overall study design and data analysis design; and provided major revisions to the manuscript. EM carried out overall study design; coordination of collaborations; protocol development; sample processing; experimental design; transcriptome profiling; Ingenuity Pathway Analysis; and data analysis design; provided guidance to CX to execute study design and to PL in medical chart abstraction for clinical information; and drafted and provided major revisions to the manuscript. All authors read and approved the final manuscript.