Background
DRD4, which encodes a G-protein-coupled dopamine receptor[1], has been widely implicated in the etiology of neuropsychiatric disease. Genetic associations have been reported between DRD4 and ADHD[2-6], anorexia[7], schizophrenia[8,9], depression[10,11], obesity[12], addiction[13] and personality disorders[14]. Although some of these genetic associations are supported by evidence of altered DRD4 expression in specific diseases[11,15,16], the mechanism by which DRD4 polymorphisms influence behaviour remains unknown. Epigenetic functions may offer some explanation. Epigenetics refers to the reversible regulation of various genomic functions mediated through partially stable modifications of DNA and histone codes, excluding DNA sequence changes. Epigenetic processes, including histone modification and DNA methylation, are intrinsically connected to gene expression, allowing the regulation of gene function through non-mutagenic means[17]. The methylation of CpG dinucleotides, which are overrepresented in the promoter regions of many genes, can obstruct the cell’s transcriptional machinery and silence gene expression. Correct control of DNA methylation is vital to normal cellular function, and DNA methylation dysfunction has been linked to a number of human pathologies[18,19], including complex neuropsychiatric phenotypes such as schizophrenia and bipolar disorder[20]. Though stochastic factors have been implicated[21], there is growing evidence that both environmental and genetic factors influence DNA methylation.
Studies of twins suggest greater variability in the DNA methylation patterns of dizygotic (DZ) twins relative to monozygotic (MZ) twins, and heritability estimates of 0.20-0.97 have been generated for DNA methylation levels within various genomic regions[22,23]. A SNP in MTHFR (the gene encoding 5,10-methylenetetrahydrofolate reductase, which is involved in the maintenance of DNA methylation patterns) has been linked to global DNA methylation levels[24,25]. Furthermore, several studies have demonstrated cis-acting genetic associations with DNA methylation in humans, chimpanzees and mice[20,26-34]. Crucially, as these genetic associations with DNA methylation levels have also been shown to correlate with levels of gene expression[33-36], they could represent the mechanism behind allele-specific gene expression, which has been commonly reported throughout the genome[37-40].
A function for previously unexplained genetic associations may therefore lie in the connection between DNA methylation and DNA sequence. If this is the case, further investigation of such markers might involve assessing their influence over local DNA methylation patterns. Yet several studies of DNA methylation report contradictory findings, indicating that the importance of genetic factors may vary across genomic regions, tissues and environments[23,41,42]. Thorough analysis in relevant tissues is therefore necessary to draw conclusions about any one gene of interest.
In the present study, we assess the potential influence of cis-acting genetic polymorphisms in mediating DNA methylation at 9 CpG sites across the DRD4 promoter region, using lymphoblastoid cell lines from a familial sample, and post-mortem brain tissue from an independent set of individuals.
Discussion
Our investigation uncovered significant SNP associations with DNA methylation levels in the DRD4 promoter, which were replicated at a nominal level of significance (p < 0.05) in an independent sample of post-mortem brain tissue. As with the majority of genetic effects identified in previous studies, these SNP associations occurred in cis[20,26-32]. As cis-acting genetic influence over DNA methylation has been observed throughout the genome, it may account for many previously unexplained genetic associations. Though we did not analyse trans-acting SNPs in the present study, trans genetic effects are also likely to be important[26,35].
The greatest group difference in DNA methylation was 16%, between the two rs3758653 homozygote groups. At this stage in our understanding, the functional relevance of these small changes in DNA methylation is unknown. If one is attempting to find an influential locus of large effect, substantial changes in DNA methylation levels may be expected. However, as most complex phenotypes are now thought to be influenced by a myriad of factors of small effect[53,54], more subtle differences in DNA methylation levels may be important. Our knowledge is still very limited, but as a ~20% difference in DNA methylation has been previously shown to associate with a 2-fold change in gene expression[35,55], or even to bring about the presence or complete absence of gene expression across various tissues[56], it is likely that individual differences in phenotypic outcome will be cumulatively influenced by many small differences in the epigenetic, and consequently the transcriptomic, landscape. Detecting genetic influences over even more modest individual differences in DNA methylation than observed here will require far larger samples.
It is likely that cis-acting DNA effects on DNA methylation are more important in some genomic regions than in others[22,26,33,35], and our findings suggest that DRD4 may represent a region in which cis-acting genetic control commonly occurs. Although previous genomewide association studies of DNA methylation have not reported positive results from the DRD4 region[10,17,19], such investigations may have been limited, by the laboratory platforms used, in the CpGs and SNPs they were able to examine. The DRD4 SNPs tested did not show cis-associations with expression in the SCAN database[51], and although we found DRD4 SNP associations with DNA methylation in our independent replication sample, they did not withstand Bonferroni correction for multiple testing. Furthermore, the findings of a recent twin study of the same DRD4-associated region assayed here are inconsistent with those we have reported. Wong et al.’s investigation of DNA methylation in 46 MZ and 45 DZ twins indicated no heritable element was involved[42]. Differences in the sample and tissue types used across the two studies are likely to explain the disparate findings. Additionally, all investigations of this region to date, including the present study, have involved extremely small sample sizes. Future work will require far larger samples in order to draw firm conclusions.
One limitation of this study was the modest size of both the discovery and replication samples. Our discovery sample had 80% power to detect causal QTLs of 9.6% effect size, or markers in linkage disequilibrium (D’ = 0.8) with causal QTLs of 15.1% effect size. As subjects were drawn from the extensively characterized CEPH sample, we were more likely to have access to the genotypes of causal variants. Furthermore, one might expect cis-acting SNPs to show larger effects over local DNA methylation than over complex disease phenotypes. Nonetheless, as DNA methylation levels are likely to be subject to the effects of multiple environmental, cis and trans genetic, and also stochastic factors, far smaller effect sizes may be involved than the sample was equipped to detect. Though we excluded SNPs with MAFs below 5%, the MAF of one of the SNPs associated with DNA methylation was lower than 10% (rs752306; see Table 1), further stretching the power of our sample.
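The relationship between the two detectable effect sizes quoted above can be illustrated with a short calculation. This is only a sketch under assumptions that the text does not state: that the marker and the causal QTL have matching allele frequencies (so that r² equals D’²), and that the variance explained at a marker is the causal variance multiplied by r². The function names are illustrative.

```python
def r2_from_dprime(d_prime: float) -> float:
    """r^2 between marker and QTL, ASSUMING equal allele frequencies (r^2 = D'^2)."""
    return d_prime ** 2


def causal_effect_needed(detectable_effect: float, r2: float) -> float:
    """Causal QTL effect (proportion of variance) needed so that the
    attenuated signal at the marker still reaches the detectable effect."""
    return detectable_effect / r2


r2 = r2_from_dprime(0.8)                   # D' = 0.8, as quoted in the text
needed = causal_effect_needed(0.096, r2)   # 9.6% detectable effect, from the text
print(f"r^2 = {r2:.2f}; causal QTL effect needed = {needed:.1%}")
```

Under these assumptions the calculation gives roughly 15%, close to the 15.1% quoted; the small gap may simply reflect rounding of the 9.6% figure.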
Our replication sample was also limited in size: even the analyses with the largest N (13) had 80% power only to detect a causal QTL of 53% effect size. Though we did detect nominal SNP associations, we feel our replication sample was too small to support firm conclusions. Unfortunately, the laboratory techniques used to assess DNA methylation are relatively new and still rapidly developing, and both genetically and epigenetically assessing samples involves considerable cost and labour. Consequently, to date the investigations of genetic influences over DNA methylation have all involved similarly small sample sizes[20,22,26,30,33,35,42]. Wisdom gained from studies of other complex phenotypes dictates that future studies should aim to include far larger sample sizes if they hope to detect the expected small effects[53,57,58]. This wisdom can also be applied when interpreting the relatively large effect sizes (8.4-14.8%) we did manage to detect in our small discovery sample, and the even larger effect sizes (39-44%) we observed in our replication sample. Though many significant associations between candidate genes and complex traits have been identified and replicated over the years, the large effect sizes originally reported in discovery samples often fall as sample sizes, and the number of replication studies, increase[3]. We would therefore expect the effects found in any future investigations of larger samples to be smaller than those reported here.
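To give a feel for why smaller effects demand far larger samples, the following sketch approximates the power of a two-sided, 1-df association test. It assumes unrelated individuals and a normal approximation with non-centrality parameter N·R²/(1 − R²); it ignores family structure and multiple testing, so it is not the calculation behind the power figures reported in this study, and the function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def approx_power(n: int, variance_explained: float, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided 1-df test (ASSUMES unrelated individuals)."""
    ncp_root = sqrt(n * variance_explained / (1.0 - variance_explained))
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    # Probability that the test statistic falls in either rejection tail.
    return _STD_NORMAL.cdf(ncp_root - z_crit) + _STD_NORMAL.cdf(-ncp_root - z_crit)


def n_for_power(variance_explained: float, target: float = 0.80,
                alpha: float = 0.05) -> int:
    """Smallest N reaching the target power under the approximation above."""
    n = 3
    while approx_power(n, variance_explained, alpha) < target:
        n += 1
    return n


for effect in (0.10, 0.05, 0.01):
    print(f"effect {effect:.0%}: N of about {n_for_power(effect)}")
```

Under this approximation the required N grows roughly in inverse proportion to the variance explained, so an effect one tenth the size needs about ten times as many individuals.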
Our small sample size also restricted the statistical analyses we were able to perform. Firstly, although we would not expect to find significant population stratification within our CEPH participants, the small sample size did not permit us to test this empirically. Secondly, though we controlled for the effects of genetic relatedness and nuclear family environment in our association analyses, our sample was too small to simultaneously estimate many variance parameters accurately. As a result, in many cases the effect of the family environment could not be reliably distinguished from the effects of genetic relatedness in our sample. Although the influence of the environment over DNA methylation is well documented[59], we predicted that it would be less significant in the transformed lymphoblastoid cell line DNA used here. Indeed, after controlling for genetic relatedness in our analyses, the family environment often showed no additional influence over DNA methylation. Our modest sample size also left us unable to take parent-of-origin effects into account, which recent computational analyses suggest are prevalent across the genome[60]. The stringent Bonferroni method used for multiple-testing correction in our replication sample may also be seen as a limitation, as the 5 tissues tested came from largely the same participants. However, DNA methylation across the 5 brain regions within individuals was uncorrelated, probably due in part to low levels of variation in our sample.
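The difference between nominal significance and significance after Bonferroni correction can be made concrete with a minimal sketch. The p-values below are hypothetical, invented purely for illustration, and the number of tests (one per brain region, 5 in total) is an assumption rather than the number of tests actually corrected for in this study.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def classify(p_values, alpha=0.05):
    """Return (nominally significant, Bonferroni significant) for each p-value."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [(p < alpha, p < threshold) for p in p_values]


# HYPOTHETICAL p-values, one per brain region; not data from this study.
hypothetical_p = [0.030, 0.200, 0.040, 0.500, 0.012]
results = classify(hypothetical_p)

for p, (nominal, corrected) in zip(hypothetical_p, results):
    print(f"p = {p:.3f}  nominal: {nominal}  Bonferroni: {corrected}")
```

In this made-up example three tests are nominally significant at p < 0.05, but none passes the corrected threshold of 0.01. Because Bonferroni correction treats the tests as independent, it tends to be conservative when the tests are correlated.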
The quantitative DNA methylation data generated using the MALDI-TOF-based Sequenom EpiTyper technique were limited in a number of ways. As Additional File 1 demonstrates, due to the position of cut sites, after base-specific RNA cleavage some adjacent CpGs remained on the same fragment[33,34]. As a result, many of the CpG units assessed, including some of those exhibiting significant SNP associations, consisted of average DNA methylation measurements across several CpG sites. Additional File 1 also highlights the exclusion of numerous CpG-containing fragments for a variety of reasons. Since the present study was conducted, R packages such as RSeqMeth[44] and MassArray[61] have been created to assist researchers in designing assays which avoid at least some of this loss of data. Furthermore, an optimal method for examining associations between genetic markers and DNA methylation would examine allele-specific methylation.
The source of the DNA used in our discovery sample represents another possible limitation. The aim of this study was to assess the influence of DNA sequence over DNA methylation in 5 genomic regions. By assessing DNA methylation in the CEPH sample, we had the opportunity to investigate a large number of SNPs at no extra cost to our laboratory. However, comparisons of RNA extracted from transformed lymphoblastoid cell lines to that extracted directly from blood cells have revealed significant differences in gene expression[62]. Moreover, when DNA methylation in lymphoblastoid cell lines from type 1 diabetes patients was compared with that in paired peripheral blood leucocytes, differences were observed in 8% of the genes assessed[63]. The SNP associations identified here may therefore not apply to in vivo DNA methylation levels. Despite this, we nominally replicated the two SNP associations emerging from the analysis of the CEPH sample, in DNA derived from brain tissue. It is also worth noting that Figures 6 and 7 indicate very similar patterns of DRD4 DNA methylation in the cell line and brain-tissue derived DNA analysed here. Furthermore, a 2010 study of the exact DRD4 region studied here found similar levels of methylation in DNA extracted from buccal swabs[42], suggesting that the results of DNA methylation analyses in transformed lymphoblastoid cell lines may be relevant in vivo.
Future investigations will benefit from an approach similar to that used in Schalkwyk et al.’s 2010 study, which assesses the effects of different alleles within the same individual[35,64]. This enables a test of SNP association against a background controlled entirely for all environmental and other genetic factors and, unlike our approach, in heterozygotes at least, it can expressly identify DNA methylation differences across the two separate DNA strands. As none of the SNPs identified in our study are known to affect the expression of proximal genes, it is difficult to draw conclusions regarding the effect of the associations with DNA methylation that we have observed. The interpretation of results from future studies might be aided by an approach which uses resources such as SCAN to select known eQTLs for tests of association with DNA methylation levels[51].
Although the limitations of transformed lymphoblastoid cell line DNA have been discussed above, and though we did not consider disease phenotypes in our analyses, our findings may have implications for research into DRD4 disease associations, especially given the nominally significant associations we found in post-mortem brain tissue. DRD4 has been previously linked to a number of psychiatric and behavioural disorders, most notably ADHD[2]. Much of the emphasis has been upon the exon 3 VNTR[3], yet SNPs in the DRD4 promoter have also shown significant associations with ADHD[4,5], schizophrenia[8,9] and fibromyalgia[65]. Interestingly, two SNPs tagged in this study have emerged previously in the literature (see Figure 3); rs936465 is in LD with rs4331145 (r² of 0.96), which has been implicated in schizophrenia[9], and with rs11246226 (r² of 0.55), which has been implicated in schizophrenia and fibromyalgia[9,65]. DNA methylation has been suggested as a mediator of well-known environmental influences over disease phenotypes such as ADHD[59,66,67]. Our results suggest that epigenetic processes may mediate previously identified, but as yet unexplained, genetic influences too.
Competing interests
The authors declare no competing interests.
Authors’ Contributions
SJD generated DNA methylation data, genotyped the replication sample, ran statistical analyses and drafted the manuscript. OD and CH assisted in the conception of the project and in statistical analyses. RP, JM and UD assisted in the conception of the project and in drafting the manuscript. All authors read and approved the final manuscript.