Introduction
Lung cancer is the most frequent cancer among men and the third most frequent cancer type among women worldwide [
1]. Patients diagnosed with lung cancer have a high mortality, and the disease causes almost as many lost life-years as colon, prostate and breast cancer combined [
2]. Non-small cell lung cancer (NSCLC) accounts for about 85% of all lung cancer cases, and tumors with adenocarcinoma histology are increasing in incidence. Genetic alterations found in lung adenocarcinomas are clinically important, and
EGFR mutations and translocations involving
ALK, ROS and RET genes are currently targets for therapy [
3,
4].
TP53 mutations are seen in approximately 50% of NSCLC [
5], and the potential predictive and prognostic value of
TP53 mutation status is debated [
6]. Chromosomal abnormalities are frequent events in NSCLC tumors, and both mutations and copy number aberrations can be main drivers of the disease [
7]. Specific patterns of copy number gains and losses have been associated with different cancer types [
8‐
10], and linked to histological subtypes of lung cancer tumors [
11]. In breast cancer tumors, focal complex events, characterized by multiple closely spaced aberrations seen in genome-wide copy number profiles, have been described and associated with prognosis [
12]. Focal complex events have been reported to be more frequent in lung adenocarcinomas compared with other histological subtypes of lung cancer tumors [
11], but thorough studies of chromosomal architecture in NSCLC tumors are lacking. Different copy number profiles in lung adenocarcinoma tumors with and without mutations in
EGFR or
KRAS have been described [
13‐
15], but information about structural events has not been included in these studies. The effect of copy number aberrations on carcinogenesis is complex, and some reports have shown that the expression of genes located in chromosomal regions involved in copy number alterations varies consistently with the DNA copy number [
16], suggesting that these alterations can affect the expression of oncogenes and/or tumor suppressor genes [
8,
9]. The affected genes may further act together to alter cellular signaling pathways in the malignant cells [
17,
18].
In this study, we aimed to characterize the genomic architecture of NSCLC tumors. Copy number data were obtained by high-resolution SNP arrays on 190 tumor samples from operable NSCLC patients. We further analyzed the complexity of the tumor genomes based on the allele-specific copy number profiles. Substantial differences were found between subgroups of samples when stratified on the basis of histology, smoking history, EGFR-, KRAS- and particularly TP53 mutation status. Furthermore, by integrating gene expression data from a subset of 113 lung adenocarcinoma samples, we identified genes for which the expression was affected by copy number and subsequently identified the cellular pathways most enriched for such genes.
Discussion
The current study presents an exploratory whole-genome investigation of copy number alterations, including the genomic architecture of NSCLC tumors. The copy number data were obtained using high-resolution SNP arrays of 190 early-stage NSCLC tumors. By stratifying the samples into biologically relevant subgroups, we identified large differences, particularly in the TP53-mutated tumors, which displayed a considerable number of gains, losses and focal complex events in both the genome-wide and arm-wise analyses. Integration of DNA copy number with mRNA expression data showed that genes with expression influenced by the copy number were associated with important cellular signaling pathways not previously known to be driven by copy number change.
Overall, the NSCLC tumor tissue displayed complex DNA copy number profiles with many gains, losses and focal complex events throughout the genome. The global copy number profile was comparable to those seen in similar studies of NSCLC tumors [
10,
31,
32]. The most common regions with gains were located at chromosome arms 1q and 5p. Chromosome arm 1q includes the
MDM4 and
RIT1 genes, and by integrating mRNA data, we found that expression of
MDM4 correlated with increased copy number leading to up-regulation of the expression of this gene.
MDM4 is important in carcinogenesis, and increased
MDM4 activity can suppress the
TP53 activity allowing the cancer cells to proliferate [
33]. Mutation in
RIT1 has recently been identified in lung adenocarcinomas and
RIT1 is proposed to be a driver oncogene in a specific subset of lung adenocarcinomas [
34]. This gene was recurrently gained in our data and may be an alternative path for oncogenic activation. The commonly gained 5p region includes
DROSHA, whose mRNA expression was correlated in cis with copy number in our data. This gene is a crucial regulator of microRNA expression, and increased expression of
DROSHA has been linked to poor prognosis in lung cancer [
35,
36]. Another recurrently gained region was 7p, which includes the
EGFR gene, and this region was frequently gained both in
EGFR-mutated and wild type tumors. Chromosome arm 3p was lost in a high proportion of samples. This is a known event in lung cancer tumors and was among the first chromosomal abnormalities to be recognized, originally identified by karyotyping [
37].
Despite distinct histological differences, squamous cell carcinomas and adenocarcinomas showed remarkably similar copy number patterns, with no significant differences in genome-wide scores. In the arm-wise analysis, however, squamous cell carcinomas had a higher gain score at 3q compared with adenocarcinomas. The squamous cell carcinomas also had a significantly increased asymmetry score at 3q, indicating an asymmetric gain in this chromosome arm. The lung adenocarcinomas had significantly more gain at chromosome arms 1q and 5q and a greater loss combined with LOH at 6q and 12p. The copy number differences between squamous cell carcinomas and adenocarcinomas of the lung have been studied by others, and the increased gain at 3q in squamous cell carcinomas has been reported in several studies [
38,
39]. The increased gain at 5q in the lung adenocarcinomas has also been reported previously by Staaf et al. [
11]. That study is the only other one to have included focal complex events in NSCLC tumors, and it identified more focal complex events in squamous cell carcinomas. This was not confirmed in our data, but the limited number of squamous cell carcinomas included in our study might serve as an explanation. Furthermore, we found that gain often occurs together with asymmetry, indicating that gain is often an asymmetric event with respect to the two alleles in NSCLC tumors. Similarly, loss and LOH often co-occur when the loss is not a copy-neutral event.
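The relationship between gain, loss, LOH and asymmetry described above can be illustrated with a minimal sketch. It assumes that each segment carries integer allele-specific copy numbers, as an allele-specific caller would report, and it uses simplified event definitions chosen here for illustration only. The `Segment` layout, the `classify` helper and the rounded ploidy of 2 are assumptions, not the scoring used in this study.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One genomic segment with allele-specific copy numbers (hypothetical layout)."""
    chrom_arm: str
    n_major: int  # copies of the more abundant allele
    n_minor: int  # copies of the less abundant allele


def classify(seg: Segment, ploidy: int = 2) -> set[str]:
    """Return the event labels that apply to one segment.

    Assumed, simplified definitions (not taken from the paper):
      gain       total copies above the rounded ploidy
      loss       total copies below the rounded ploidy
      loh        one allele is absent while the other is retained
      asymmetry  the two alleles differ in copy number
    """
    total = seg.n_major + seg.n_minor
    labels: set[str] = set()
    if total > ploidy:
        labels.add("gain")
    elif total < ploidy:
        labels.add("loss")
    if seg.n_minor == 0 and seg.n_major > 0:
        labels.add("loh")
    if seg.n_major != seg.n_minor:
        labels.add("asymmetry")
    return labels


# Toy segments, made up for illustration.
segments = [
    Segment("3q", 3, 1),   # gain of one allele only: gain with asymmetry
    Segment("3p", 1, 0),   # one allele deleted: loss with LOH
    Segment("17p", 2, 0),  # copy-neutral LOH: LOH without loss
    Segment("2p", 1, 1),   # balanced, unchanged
]
for seg in segments:
    print(seg.chrom_arm, sorted(classify(seg)))
```

Under these definitions a deletion of one allele is labeled as both loss and LOH, whereas a copy-neutral event is labeled as LOH only, which mirrors the co-occurrence pattern noted in the text.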
Mutations in the
TP53 gene are common events in lung cancer tumors. The
TP53-mutated tumors had significantly higher ploidy, estimated by the ASCAT algorithm, compared with the
TP53 wild type tumors. After adjusting for estimated ploidy and aberrant cell count, we performed comparison analyses and found that the
TP53-mutated lung adenocarcinomas had a significantly higher score at all eight indices in the genome-wide analysis (Fig.
3). The lung adenocarcinomas with mutant
TP53 gene generally have more segments deviating from the median ploidy, with consistently more gains and losses throughout the genome. The
TP53-mutant tumors additionally have more focal complex events captured by the
steep and
curv scores, which may be contributing to the aggressive phenotype associated with the
TP53 mutations seen in other cancer types [
40]. The genome-wide LOH and asymmetry scores were additionally significantly higher in the
TP53-mutated tumors. It is particularly interesting that nearly all
TP53-mutated tumors had
TP53 LOH, indicating that inactivation of both
TP53 alleles is important, as proposed in Knudson's two-hit hypothesis [
41]. The LOH events in
TP53-mutated tumors were often accompanied by copy number gain of the mutant allele. In the
TP53 wild type tumors, the
TP53 LOH was seen in 31.4% of the samples, suggesting a dysfunction in the
TP53-pathway in a large proportion of lung adenocarcinoma tumors.
The high genome-wide scores in the
TP53 mutated tumors indicate a highly unstable genome. Other studies have demonstrated how
TP53 mutation status might reflect tumor mutation burden and be associated with longer overall survival in patients receiving immunotherapy [
42]. This reflects that the well-known TP53 mutation status might be clinically important also in the future. The finding of the complex copy number profiles in
TP53-mutant lung adenocarcinomas in our study is very convincing, and we suggest that
TP53 mutation status should be considered for biological stratification in studies involving genomic aberrations.
Lung adenocarcinomas with
EGFR mutation comprise a specific clinical subtype and are more frequent in women, never-smokers and patients with Asian ethnicity [
43]. To better understand the biology of the
EGFR-mutated lung adenocarcinomas we compared copy number profiles between
EGFR-mutated and wild type tumors. In the chromosome arm-wise analysis, chromosome arm 7p was gained in both
EGFR-mutated and
EGFR wild type tumors, but was lost significantly more often in the
EGFR wild type tumors. The
EGFR gene is located at 7p12, and this region was also significantly gained in both
EGFR wild type and
EGFR-mutated tumors, but with a significantly higher number of total copies in
EGFR-mutated tumors (Fig.
4). When integrating mRNA expression data, we found that the
EGFR mRNA expression was correlated with gained copy number, and previous studies have also shown that the copy number alterations in chromosome 7 are correlated with protein expression and activation of the
EGFR pathway [
44].
EGFR mutation is a strong predictive biomarker for tyrosine kinase inhibitor (TKI) response. It is, however, debated whether copy number gain may act as a predictive marker for EGFR-TKI response in patients with
EGFR wild type lung cancer tumors [
45,
46]. The aberrations of chromosome arm 7p seem to be consistent with previous reports [
13,
14,
47]. We also found that the
EGFR wild type lung adenocarcinomas had significantly more gain at 9q, an aberration difference not described earlier. Other copy number differences in gains and losses between
EGFR-mutated and wild type tumors have been described previously [
13,
14,
47], but were not validated in our study. The lack of consistency may be caused by small sample sizes and the use of different methods to call gains/losses. The scores in our analyses incorporate both the magnitude and the width of the aberrant region into the calculation of
gains and
losses, which makes comparison with the results of other studies challenging. To our knowledge, focal complex events have not been described in relation to
EGFR,
KRAS and
TP53 mutation status, and the clinical impact of such events in lung cancer tumors is not known. The arm-wise analysis identified more focal complex events (reflected in
steep and
curv scores) in the
EGFR wild type tumors with significantly higher
curv scores at 3p, 5p, 11q and 12q. The same trend was seen in the
KRAS wild type tumors, which had significantly higher
steep and
curv scores both in the genome-wide analysis and at specific chromosome arms. The
EGFR- and
KRAS wild type adenocarcinomas had additionally significantly more arm-wise aberrations compared with the
EGFR- and
KRAS-mutant lung adenocarcinomas, suggesting that tumors without mutational activation of these oncogenic pathways are driven more by copy number aberrations than by point mutations. The
TP53-mutated tumors had the opposite pattern with more focal complex events in the tumors harboring a
TP53 mutation. This was consistent with the findings in the pan-cancer study by Ciriello et al., which found
TP53 mutations enriched in the C-class (copy number driven) tumors [
7].
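To illustrate why scores that weight an aberration by both its magnitude and its width are hard to compare with threshold-based calls, the following sketch contrasts two toy measures. Both definitions are assumptions made for illustration: `weighted_gain_score` is a length-weighted mean excess copy number and `fraction_gained` is a simple threshold call. Neither is the index definition used in this study, which is not given in this section.

```python
def weighted_gain_score(segments, ploidy=2.0):
    """Length-weighted mean excess copy number (assumed illustrative definition).

    segments: list of (length_bp, total_copy_number) tuples.
    """
    total_len = sum(length for length, _ in segments)
    excess = sum(length * max(0.0, cn - ploidy) for length, cn in segments)
    return excess / total_len


def fraction_gained(segments, ploidy=2.0):
    """Fraction of the covered region called as gained by a simple threshold."""
    total_len = sum(length for length, _ in segments)
    gained = sum(length for length, cn in segments if cn > ploidy)
    return gained / total_len


# Two toy chromosome arms (made-up values): the same width is gained,
# but at different amplitude.
low_amp = [(50_000_000, 2), (50_000_000, 3)]
high_amp = [(50_000_000, 2), (50_000_000, 6)]

print(fraction_gained(low_amp), fraction_gained(high_amp))
print(weighted_gain_score(low_amp), weighted_gain_score(high_amp))
```

In this toy case the threshold call reports the same gained fraction for both arms, while the weighted score separates them by amplitude, so a difference detected with one type of measure need not appear with the other.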
The aberrational pattern is similar across different studies of NSCLC tumors. Previous studies have shown that the expression of genes located in chromosomal regions involved in gains or losses varies consistently with the DNA copy number [
48,
49]. We approached this by first investigating how gene expression is affected by copy number alterations and then studying whether any known cellular pathways are overrepresented in the list of affected genes and hence probably regulated by copy number. Two of the most significantly affected pathways were the
mTOR- and
EIF2- signaling pathways, both related to the
PIK3CA gene. Mutations of the
PIK3CA gene occur in lung adenocarcinoma tissue, but are relatively rare events [
50]. We found
PIK3CA frequently gained and the gene expression significantly correlated with gain in copy number. Among the genes associated with the
mTOR-pathway, forty-five were cis-genes. The expression of several important oncogenes such as
AKT1,
AKT2, and
KRAS was positively correlated with the copy number in our analysis. The
mTOR gene and several of its effectors (
RPS6KB1 (p70S6K),
RPS6, and
EIF4G1) were also altered. The PI(3)K-mTOR pathway was one of the key pathways found activated at a protein level in a large lung adenocarcinoma study by TCGA [
51]. In this paper the activation of the pathway was partly explained by mutations (in
PIK3CA or
STK11), but some samples with increased pathway-activation lacked known underlying mechanisms. We suggest that altered expression of cis-genes affected by underlying copy number aberrations may increase the activity of this pathway. Drugs that target the mTOR pathway have shown interesting results in other cancer types [
52], which highlights its clinical importance. Clinical trials in lung cancer targeting the PI(3)K-mTOR pathway have shown variable responses when given as monotherapy [
53‐
55]. The lack of responses in some patients may be due to the complex regulation of the pathway and interplay with other oncogenic pathways [
54].
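The two-step approach described above, first finding genes whose expression follows copy number in cis and then asking which pathways are overrepresented among them, can be sketched as follows. This section does not specify the statistical procedure used, so the sketch substitutes Pearson correlation with a fixed cutoff and a hypergeometric over-representation test as common stand-ins. The gene names, data values, cutoff of 0.8 and pathway membership are all made up for illustration.

```python
from math import comb, sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def enrichment_p(n_universe, n_pathway, n_selected, n_overlap):
    """Upper-tail hypergeometric p-value: P(overlap >= n_overlap)."""
    upper = min(n_pathway, n_selected)
    hits = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return hits / comb(n_universe, n_selected)


# Toy per-gene copy number and expression across five samples (made up).
copy_number = {
    "GENE_A": [2, 2, 3, 4, 5],
    "GENE_B": [2, 3, 2, 4, 3],
    "GENE_C": [1, 2, 2, 3, 4],
}
expression = {
    "GENE_A": [1.0, 1.1, 1.6, 2.1, 2.4],
    "GENE_B": [3.0, 1.0, 2.5, 1.2, 2.8],
    "GENE_C": [0.5, 1.0, 1.1, 1.4, 2.0],
}

# Step 1: keep genes whose expression tracks their own copy number.
# The 0.8 cutoff is hypothetical; a real analysis would use a significance
# test with correction for multiple testing.
cis_genes = [g for g in copy_number if pearson(copy_number[g], expression[g]) > 0.8]
print(cis_genes)

# Step 2: test whether a (hypothetical) pathway is overrepresented among them.
pathway = {"GENE_A", "GENE_C"}
overlap = len(pathway.intersection(cis_genes))
print(enrichment_p(len(copy_number), len(pathway), len(cis_genes), overlap))
```

With only three genes the p-value carries no meaning; the point is the order of operations, in which the pathway test is run on the list of cis-genes rather than on all genes with a copy number change.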
The chromosomal structure in lung carcinomas is highly aberrant and copy number alterations in tumor or in cell-free DNA might predict response to immunotherapy in cancer patients [
56]. The findings in this study encourage further research into whole-genome copy number alterations, both to increase biological understanding and to inform therapeutic approaches targeting the PI(3)K-mTOR pathway.