Background
Epigenetic modifications such as DNA methylation are known to influence gene expression and thus biological function [1], and alterations in epigenetic marks are found in processes ranging from physiological development and cellular differentiation [2] to pathological scenarios, as well as aging [3, 4]. Reflecting the developmental changes of the individual, most epigenetic changes occur during embryogenesis, when waves of extensive epigenetic reprogramming take place [5]. After birth, epigenetic modification continues throughout the human lifespan, at both the DNA methylation and chromatin levels [6], and precise DNA methylation markers of age have recently been developed [7].
It seems, however, that the first years of life constitute the postnatal period with the most substantial epigenetic changes [8, 9]. Moreover, prenatal and early-life epigenetic insults have been linked to numerous health-related consequences in a wide variety of settings [10]. Thus, the characterization of childhood DNA methylation changes could help uncover genomic locations of functional relevance.
While age-related changes have been functionally linked to many biological processes, other alterations are known to arise in a stochastic fashion [11, 12]. As such, longitudinal experimental designs can allow DNA methylation alterations to be identified in a more controlled setting [9, 13-16], and this approach has also been employed to study the influence of clinical parameters on the epigenome, especially during early childhood [17]. However, longitudinal DNA methylation studies that analyze data from more than two time points remain scarce [17-19], and most of the previous literature relies on now-superseded methylation screening technologies such as the Infinium Human Methylation 450K BeadChip, which mainly interrogates CpG-dense genomic regions, even though the functional association between DNA methylation and gene expression is increasingly being described for CpG-sparse locations such as enhancers and gene bodies [20].
This study therefore presents, to the best of our knowledge, the first 3-point longitudinal genome-wide DNA methylation analysis of blood tissue employing the Illumina Infinium MethylationEPIC BeadChip. Using a clinically well-characterized cohort, this design allowed us to characterize two distinct phases of early-life epigenetic change during the first 10 years of life while also covering various types of genomic regions. Furthermore, we integrated our data with external chromatin and enhancer datasets in order to obtain a functional view of the observed age-related DNA methylation changes. Finally, we applied bisulfite pyrosequencing to validate the results technically in the discovery cohort and biologically in independent cohorts.
Discussion
In the present study we examined the genome-wide methylation profile of 33 longitudinal blood samples from 11 children at 0 (newborn), 5 and 10 years of age. Firstly, we found that extensive DNA methylation changes occur during the first 5 years of life, while much less epigenetic remodeling takes place in the following 5 years. As many as 110,726 CpG sites showed altered DNA methylation values when comparing newborn and 5-year-old samples, with a slight tendency toward an increase in methylation with age (54% of dmCpGs), while only 460 CpGs exhibited significant methylation changes between 5 and 10 years of age, 72% of which lost methylation. These findings are in line with the current literature, which describes early-years epigenetic changes as being the most important of the postnatal period [8]. What is more, our three-time-point study design allowed us to directly compare the alterations occurring during two different early-years intervals, demonstrating that after 5 years, epigenetic reshaping is dramatically reduced. In the same vein, Acevedo and colleagues, in their report of longitudinal DNA methylation changes during the first 5 years of life, noted that in the third year these changes already seem to be weakening [19]. Aside from being more numerous, DNA methylation changes were far more pronounced in the first lustrum of life than in the subsequent one, which may be important in terms of the functional influence of these methylation alterations on gene expression. Moreover, this observation did not seem to be related to considerable changes in cell-type composition occurring in the first 5 years as compared to the following 5.
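The directional summaries reported above (the share of dmCpGs gaining or losing methylation in each interval) can be illustrated with a minimal sketch. It assumes that significance testing has already been done upstream and that a per-CpG change in mean beta value is available; the function name, the input format and the toy values are hypothetical and are not taken from the study's analysis pipeline.

```python
from typing import Dict, Sequence


def summarize_direction(delta_betas: Sequence[float]) -> Dict[str, float]:
    """Count methylation gains and losses among already-significant dmCpGs.

    delta_betas: per-CpG change in mean beta value (later age minus earlier age).
    Significance testing is assumed to have happened upstream and is not shown.
    """
    hyper = sum(1 for d in delta_betas if d > 0)  # methylation gain
    hypo = sum(1 for d in delta_betas if d < 0)   # methylation loss
    total = hyper + hypo                          # CpGs with zero change are ignored
    pct_hyper = 100.0 * hyper / total if total else 0.0
    return {"hyper": hyper, "hypo": hypo, "total": total, "pct_hyper": pct_hyper}


# Toy values for illustration only; these are not data from the study.
toy_0_to_5 = [0.21, 0.15, -0.30, 0.08, -0.12]
summary = summarize_direction(toy_0_to_5)
print(summary)
```

Applied separately to the 0 → 5 and 5 → 10 dmCpG sets, a summary of this kind would yield the counts and percentages that are compared between the two intervals.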
We next examined the genomic distribution of the dmCpGs, which revealed that hyper- and hypomethylation changes occur at very different locations. Loss of methylation was observed at CpG-poor regions such as open sea locations, including intronic and intergenic regions, for both 0 → 5 and 5 → 10 changes, while methylation gain occurred at CpG-denser regions such as CpG islands and gene promoters for both of the age intervals examined (although 0 → 5 changes were more similar to the array distribution). These findings, with exceptions [19], are in line with most studies describing age-related DNA methylation changes, both in the early years [8, 16] and in the later years of aging [44, 45]. The fact that the great majority of hypomethylation changes were found at open sea regions underscores the value of screening technologies that examine CpG-sparse regions for accurately characterizing DNA hypomethylation scenarios.
After identifying the genomic distribution of the age-related methylation changes, we sought to characterize the functional contexts associated with these loci. To achieve this, we integrated our DNA methylation data with external datasets describing histone modifications, chromatin states and enhancers for different tissue and cell types. These analyses primarily considered 0 → 5 dmCpGs because the low number of 5 → 10 dmCpGs detected hindered the enrichment analyses. The results revealed a link between DNA hypermethylation and the repressive histone mark H3K27me3, as well as chromatin states related to polycomb repressive domains and bivalent chromatin enhancers, while DNA hypomethylation occurred mainly at H3K4me1 regions and intragenic enhancers, as has been previously shown [46, 47]. Furthermore, when mapping the methylation changes to an enhancer library of 65 different tissues, we found that hypermethylation displayed a much lower overall enrichment at enhancer sites than hypomethylation, and while the former occurred mainly at enhancers in normal and embryonic tissue, the latter was particularly enriched in tissues and cell types related to blood. These observations suggest that the functionality of the enhancer-associated methylation changes that occur during the first 5 years of life differs depending on whether methylation is gained or lost. Moreover, as studies are increasingly describing enhancer-associated DNA methylation as the main expression-correlated methylation phenomenon [20, 48, 49], our results suggest that the enhancer-enriched DNA hypomethylation observed may be of more importance in the control of gene expression than DNA hypermethylation. Indeed, enhancer methylation is in general negatively correlated with enhancer activity [50], and thus it would make sense that the epigenetic changes that occur during the final stages of development are accompanied by the inactivation of general developmental enhancer elements, while tissue-specific (blood) enhancers become activated. In this setting, the early-life DNA methylation loss enriched at tissue-specific enhancer regions in our data could reflect the establishment of epigenetically active patterns defining tissue function during this period of life, while enhancers associated with development are already epigenetically repressed and undergo less substantial (hypermethylation) changes.
Another approach to studying the functional importance of DNA methylation changes is to map dmCpG sites to genes and then analyze the related ontologies. Given that DNA methylation is irregularly distributed throughout the different parts of a gene [1], and because a great proportion of our mapped genes contained both hyper- and hypomethylated dmCpGs, we first performed the ontology analyses only on genes that exclusively either gained or lost methylation. We found that 0 → 5 DNA hypermethylation changes were principally related to developmental functions and cell-to-cell signaling, while hypomethylation was primarily linked to mRNA and protein metabolism, immune response and mitosis. It is worth mentioning that hypermethylation was also associated with terms such as growth, reproduction and response to retinoic acid, a molecule involved in organogenesis [51] as well as hematopoiesis [52]. Notably, when we considered the ontologies of genes containing both hyper- and hypomethylated probes, we found considerable differences, implying that these genes could play different functional roles from those that are exclusively hyper- or hypomethylated. Moreover, when we examined the ontologies without making these distinctions, we found that, while hypermethylation ontologies did not change considerably, hypomethylation terms shifted from mRNA and protein metabolism to cellular localization and activation or GTPase activity (maintaining immunological terms). These latter findings are more in line with those reported in previous works, which did not describe separating exclusively hyper- and hypomethylated genes [8, 14, 16, 19]. However, the fact that we found a change in function suggests that not taking into consideration the presence of concurrently hyper- and hypomethylated genes could affect the conclusions drawn from gene ontology analyses. On the whole, these results suggest that it is not only differentially hyper- or hypomethylated enhancers that have differing roles, but also differentially hyper- or hypomethylated genes, with the former being more related to general developmental functions and the latter being involved in, among other things, immunological functions, thus reflecting the functionality differences observed for enhancer elements. In a final analysis, we further segregated the 0 → 5 dmCpGs into groups depending on their gene location (promoter, gene body, intron, exon) and also found differences between the gene ontologies of genes containing promoter- or exon-associated dmCpGs and those containing intronic- or gene body-associated dmCpGs, both for hyper- and hypomethylated CpGs. These observations imply that different genes, associated with different functions, undergo DNA methylation changes in different regions, which could help explain why DNA methylation changes in different contexts can have different consequences, and also suggest that the response of gene elements to epigenetic changes during early life is gene region-specific.
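The separation of genes into exclusively hypermethylated, exclusively hypomethylated and concurrently hyper- and hypomethylated groups, which precedes the ontology analyses described above, amounts to a set partition. The following is a minimal sketch under stated assumptions: each dmCpG is already mapped to a single gene symbol and carries a signed change in beta value. The function name, the input format, and the probe and gene identifiers are hypothetical and do not come from the study.

```python
from typing import Dict, Iterable, Set, Tuple


def partition_genes(dmcpgs: Iterable[Tuple[str, str, float]]) -> Dict[str, Set[str]]:
    """Split genes by the direction of the dmCpGs mapped to them.

    dmcpgs: (probe_id, gene_symbol, delta_beta) triples, where a positive
    delta_beta means methylation gain and a negative one means loss.
    """
    gained: Set[str] = set()
    lost: Set[str] = set()
    for _probe, gene, delta in dmcpgs:
        if delta > 0:
            gained.add(gene)
        elif delta < 0:
            lost.add(gene)
    return {
        "hyper_only": gained - lost,  # genes with gains only
        "hypo_only": lost - gained,   # genes with losses only
        "mixed": gained & lost,       # genes with both gains and losses
    }


# Hypothetical probe and gene identifiers, for illustration only.
toy = [
    ("cg_a", "GENE1", 0.20),
    ("cg_b", "GENE1", 0.10),
    ("cg_c", "GENE2", -0.30),
    ("cg_d", "GENE3", 0.15),
    ("cg_e", "GENE3", -0.25),
]
groups = partition_genes(toy)
print(groups)
```

Each of the three resulting gene sets could then be passed separately to an ontology enrichment tool, so that the terms obtained for the mixed group can be compared with those of the two exclusive groups.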
Finally, we compared the DNA methylation changes that occur during the first two lustra of life, finding that the majority of the 5 → 10 changes are in fact a continuation of the 0 → 5 changes. By looking at the dmCpGs with the most substantial change in both age intervals, we defined 36 CpG sites with consistent DNA methylation changes during the first 10 years of life. These locations followed one of two trends: (1) strong 0 → 5 changes followed by weak 5 → 10 changes or (2) moderate overall 0 → 5 → 10 changes. Many of these CpGs were located at genes with important functions, such as the development-associated HOXB7 [53], which contained 2 CpGs, and the GATA-interacting ZFPM2 [54]. Interestingly, although both these genes are related to developmental functions, the first was found to be hypermethylated while the second was hypomethylated and, what is more, the dmCpG associated with ZFPM2 was also mapped to an enhancer element (Fig. 4c). Another zinc finger-family gene that we found to be altered, ZNF385D, has been associated with language impairment and reading disability in children [55]. Although this observation could be related to its function during prenatal brain development, the fact that the gene-associated dmCpG (cg02920129) shows a similar trend of methylation gain during both the 0 → 5 and 5 → 10 periods of life (Additional file 12: Figure S4) points towards a possible functional relevance throughout early life. It would thus be of interest to further study the link between the methylation status of ZNF385D and language impairment or reading disability. The CD200R1 gene, which encodes a myeloid- and T-cell distinctive transmembrane receptor [56], was found to be linked to a hypomethylated dmCpG (cg07061387), which was subsequently found to be associated with enhancer elements related to up to 3 different genes (ATG3, NEPRO, GCSAM) when the 0 → 5 → 10 dmCpGs were mapped to enhancers in different blood tissues, indicating that at least some of the DNA methylation changes found could exert an influence on genes other than those on which the dmCpGs are located.
Lastly, we performed technical and biological validations of the methylation status of 3 dmCpGs located in the HOXB7, SOCS3 and ZFPM2 genes using bisulfite pyrosequencing. As expected, we observed a high correlation between the results of the Infinium MethylationEPIC BeadChip analysis and the pyrosequencing, although the DNA methylation values detected by the array were slightly higher, a fact which has been noted before [57]. Subsequently, we examined the methylation of these CpGs in two independent longitudinal cohorts spanning the first and second lustra of life, and found that the DNA methylation changes identified in the previous experiments were robust and reproducible in independent subjects.
Authors’ contributions
RFP, PS, JRT, RGU, AFF and MFF carried out the methylation studies and participated in drafting the manuscript. RFP and JRT performed the statistical analyses. PS and RFP performed the pyrosequencing analyses. EL and MFF conceived of the study and participated in its design and coordination of the manuscript. PR performed critical revision of the manuscript. JA participated in collecting and qualifying mothers and children. All authors read and approved the final manuscript.