Background
Chronic Obstructive Pulmonary Disease (COPD) is a chronic and progressive inflammatory lung disease with cigarette smoking as the main risk factor. COPD is characterized by persistent airflow limitation and COPD patients suffer from severe respiratory symptoms, overall resulting in a poor quality of life [1]. The global prevalence of COPD is 10.7% [2], resulting in a high economic and societal burden. In 2015, 5% of all deaths globally were caused by COPD and it is expected that COPD will be the third leading cause of death worldwide in 2030 (WHO 2016).
The development of COPD is known to be associated with both genetic [3-5] and environmental factors [6] and their interactions [7]. However, the genome-wide association studies (GWAS) and genome-wide interaction studies (GWIS) performed so far have identified variants in COPD susceptibility genes that explain only a very small part of the variation in the onset of COPD [8]. As a consequence, the epigenome is increasingly recognized as an important link between changes to the inherited genome and environmental exposures such as cigarette smoke [9]. The epigenome comprises several epigenetic mechanisms that affect gene expression without modifying the DNA sequence. These epigenetic mechanisms are highly dynamic, and changes can be induced by environmental exposures, diseases and ageing [10].
One well-defined epigenetic mechanism is DNA methylation, which is tissue-specific and involves the binding of a methyl group to a cytosine base adjacent to a guanine base, a so-called CpG-site. CpG-rich sites are found in the regulatory regions of the DNA, and methylation of CpG-sites in these regulatory regions leads to a decrease in gene expression [11]. It has been shown that DNA methylation is strongly affected by environmental exposures such as air pollution and cigarette smoke [12-14]. Besides being an important risk factor for COPD, exposure to cigarette smoke is also strongly associated with lower lung function levels [15, 16]. Hence, we postulate that DNA methylation plays an important role in the etiology of COPD by mediating the effect that cigarette smoking has on lung function levels. In this study, we aim to identify the CpG-sites involved by performing an epigenome-wide association study (EWAS) in whole blood in current smokers and to validate these smoking-related differences in DNA methylation in lung tissue.
Discussion
In this study, we showed that DNA methylation at 15 CpG-sites was significantly associated with pack years. In addition to previously described CpG-sites, we identified 2 novel CpG-sites: cg22994830, annotated to PRKAR1B, and cg20451986, located in an intergenic region of chromosome 11. Ten CpG-sites were additionally associated with lung function levels, and we validated 5 of these CpG-sites in lung tissue. We found several significant associations between DNA methylation and gene expression in lung tissue. Moreover, for the CpG-sites located in the intergenic regions of chromosomes 2 and 6, we found novel associations with biologically plausible genes in lung tissue.
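Pack years, the cumulative exposure measure used throughout this discussion, are conventionally defined as the number of packs smoked per day multiplied by the number of years smoked, with one pack taken as 20 cigarettes. A minimal sketch of that arithmetic follows; the function name and the default pack size are illustrative assumptions, and the exact derivation applied to the questionnaire data is not described in this text.

```python
def pack_years(cigarettes_per_day: float, years_smoked: float, pack_size: int = 20) -> float:
    """Cumulative smoking exposure in pack years.

    Hypothetical helper for illustration only: assumes a constant daily
    consumption over the whole smoking period and a pack of 20 cigarettes.
    """
    if cigarettes_per_day < 0 or years_smoked < 0:
        raise ValueError("cigarettes_per_day and years_smoked must be non-negative")
    if pack_size <= 0:
        raise ValueError("pack_size must be positive")
    packs_per_day = cigarettes_per_day / pack_size
    return packs_per_day * years_smoked
```

Under this definition, one pack a day for 10 years and half a pack a day for 20 years give the same exposure value, which is one reason pack years cannot distinguish smoking intensity from smoking duration.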
In our EWA study, we identified 2 novel CpG-sites associated with pack years. One of them, cg22994830, is located in the body of PRKAR1B. This gene encodes a regulatory subunit of cyclic AMP-dependent protein kinase A and is involved in several cellular events including ion transport, metabolism and transcription. PRKAR1B has been shown to be implicated in neurodegenerative disorders; however, a role for PRKAR1B in pulmonary diseases is currently unknown [25]. The other novel CpG-site, cg20451986, has not been annotated to a gene yet. The gene closest to this CpG-site, at approximately 10,000 base pairs, is JAM3, a gene involved in cell-cell adhesion. We performed an EWA study on the association between DNA methylation and pack years as a cumulative measure of exposure to cigarette smoke, whereas earlier studies, as reviewed by Gao et al, investigated differences in DNA methylation levels between never and current smokers [22]. Nevertheless, we identified several CpG-sites that have been described before. Our third most significant CpG-site, cg03636183, located in the gene F2RL3, was one of the first sites found to be associated with smoking status [23]. In addition, a study by Zeilinger et al showed associations of CpG-sites located in the genes AHRR, GFI1 and F2RL3 and in the unknown intergenic regions at chromosomes 2 and 6 with smoking status [12]. Nine of our identified CpG-sites were identical to CpG-sites discovered in the study by Zeilinger et al.
A major strength of our study is that we validated our findings in lung tissue, the actual tissue of interest. This is in contrast to previous studies, in which the most significant CpG-sites from the EWAS were validated by replication in whole blood of other populations [12, 13, 23]. In our study, DNA methylation at one of the CpG-sites located in the body of AHRR, cg21161138, was significantly lower in current smokers compared to never smokers in lung tissue, in line with previous findings by Shenker et al [24]. In addition, the association between DNA methylation of cg21161138 and lower gene expression of AHRR in lung tissue also confirms findings of that earlier study. Beyond this, we were able to validate 4 additional CpG-sites in lung tissue. The fact that we could confirm a total of 5 CpG-sites, identified in whole blood, in lung tissue suggests that changes in DNA methylation in response to cigarette smoking occur directly at the local level in the lung. For cg03636183, located in the body of F2RL3, we found neither differences upon smoking in lung tissue nor an association between DNA methylation levels and gene expression. Since the intensities of the nucleotide peaks in the pyrosequencing run of cg03636183 were low compared to other assays, despite the use of several different primer sets, we cannot exclude that actual differences in DNA methylation in lung tissue are masked by technical issues.
Several cross-sectional studies strongly suggest that DNA methylation in whole blood can be partially normalized upon smoking cessation [21, 26, 27]. In our study, we found that DNA methylation levels of ex-smokers lie between the levels of never and current smokers for the 5 CpG-sites that are differentially methylated upon smoking. Moreover, with the eQTM analysis, we showed that for AHRR both DNA methylation and gene expression levels of ex-smokers tend more towards the levels observed for never smokers than towards those of current smokers. Even though these are cross-sectional data, they suggest a reversible character of DNA methylation.
After identification and validation of CpG-sites that are associated with exposure to cigarette smoke in lung tissue, an important next step is to assess the functional relevance of the CpG-sites. For AHRR, the function is well studied. Briefly, AHRR inhibits the aryl hydrocarbon receptor pathway, which is involved in the removal of harmful environmental chemicals, including those in cigarette smoke. Cigarette smoking results in decreased DNA methylation, and this decreased DNA methylation subsequently increases the expression of AHRR. This leads to reduced removal of harmful compounds, thereby potentially increasing the damage caused by these compounds [28]. In contrast, the other 3 CpG-sites that differ with smoking status in lung tissue have not yet been annotated to a gene, making it impossible to assess the potential function of DNA methylation at these sites. By performing an eQTM analysis, we tried to identify the genes that are regulated by these CpG-sites. For the 2 CpG-sites at chromosome 2, we found associations with gene expression of ATG16L1 and DIS3L2. ATG16L1 is an essential component of the autophagy pathway, and mutations in this gene have been associated with inflammatory bowel disease. Since proper function of ATG16L1 is necessary for host-defense responses against micro-organisms and inflammatory responses in Crohn’s disease, this gene might be relevant in the respiratory system as well [29]. DIS3L2 belongs to the family of exo-ribonucleases, key enzymes involved in the control of messenger RNA stability. DIS3L2 has been associated with human diseases such as Perlman syndrome and Wilms’ tumor, but a role in pulmonary diseases has not been described [30]. For cg06126421, located in the intergenic region of chromosome 6, we found an association between DNA methylation levels and the expression of the genes TUBB and MUC21. TUBB encodes a beta tubulin protein of the microtubule cytoskeleton and has been shown to be involved in microcephaly with structural brain abnormalities in humans. A role for TUBB in lung-related pathologies, however, is currently unknown [31]. MUC21 belongs to a group of 22 mucin proteins, the major glycoprotein components of mucus, which forms the protective layer of the epithelial surface. Since overproduction of mucins is associated with common respiratory diseases including COPD, asthma and cystic fibrosis [32], MUC21 is a biologically plausible gene, and further investigations into the inverse association between DNA methylation at cg06126421 and the expression of MUC21 are warranted.
While our study is one of the first to validate a large panel of CpG-sites in actual lung tissue, we have to consider some limitations. First, the results of the mediation analysis should be interpreted with caution. In our study, we used mediation analysis to select CpG-sites that were associated with lung function levels rather than to imply biological mediation. We used pack years as a cumulative measure of the exposure to cigarette smoke to assess the association between cigarette smoking and DNA methylation. However, the number of pack years is estimated from self-reported information obtained from questionnaires of the LifeLines population-based cohort study. It has been suggested that self-reported estimates of smoking may underestimate the true smoking prevalence, since cigarette smoking is often perceived as socially undesirable behavior [33]. Moreover, it has been stated that DNA methylation levels at specific CpG-sites in the genes AHRR and F2RL3 are a better estimate of the exposure to cigarette smoke than pack years derived from self-report [34]. Within the mediation analysis, this potential misclassification of exposure to cigarette smoke may lead to overestimation of the mediation effect [35]. Furthermore, with the cross-sectional design of our study, we cannot infer causality from the mediation analysis. A second potential limitation of our study is that most of the lung tissue was obtained from tumor resection surgery. Although all the tissue has been histologically checked for abnormalities, DNA methylation might be affected by the tumor and potential metastases. However, since this holds for all groups under study, we assume that it will not lead to differences in DNA methylation between the groups.
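To make the caution about the mediation analysis concrete, the sketch below shows a generic product-of-coefficients decomposition with ordinary least squares: the total association between exposure and outcome splits into a direct part and an indirect part that runs through the mediator. This is an assumption-laden illustration, not the estimator used in this study (which is not specified in this text); covariates are omitted, the function names are hypothetical, and the numbers in the usage lines are invented purely to exercise the code.

```python
from statistics import fmean


def _centered(values):
    """Subtract the mean so that intercepts drop out of the regressions."""
    mean = fmean(values)
    return [v - mean for v in values]


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mediation_effects(exposure, mediator, outcome):
    """Product-of-coefficients mediation with ordinary least squares.

    exposure: e.g. pack years; mediator: e.g. methylation at one CpG-site;
    outcome: e.g. a lung function measure. Illustrative only: no covariates,
    no standard errors, and no causal interpretation.
    """
    x, m, y = _centered(exposure), _centered(mediator), _centered(outcome)
    sxx, smm, sxm = _dot(x, x), _dot(m, m), _dot(x, m)
    sxy, smy = _dot(x, y), _dot(m, y)

    a = sxm / sxx                # slope of mediator on exposure
    total = sxy / sxx            # slope of outcome on exposure alone
    det = sxx * smm - sxm ** 2   # zero when exposure and mediator are collinear
    if det == 0:
        raise ValueError("exposure and mediator are perfectly collinear")
    direct = (smm * sxy - sxm * smy) / det  # exposure effect, adjusted for mediator
    b = (sxx * smy - sxm * sxy) / det       # mediator effect, adjusted for exposure
    return {"a": a, "b": b, "direct": direct, "indirect": a * b, "total": total}


# Invented toy values, not study data.
toy_pack_years = [0, 5, 10, 20, 30, 40]
toy_methylation = [0.90, 0.86, 0.85, 0.78, 0.74, 0.70]
toy_lung_function = [100, 98, 95, 90, 88, 82]
effects = mediation_effects(toy_pack_years, toy_methylation, toy_lung_function)
```

For least-squares fits the total effect equals the direct effect plus the indirect effect exactly, so the decomposition is purely arithmetic. It reallocates an observed association and cannot show, especially in cross-sectional data with an error-prone exposure measure, that methylation causally transmits the effect of smoking.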