The hypothesis that a genetic mutation present in only a proportion of the neuronal cells can cause a neurological disease was formulated quite early in the history of AOND genetics [
139,
180]. To date, most studies still focus on germline mutations present in all cells, analyzed in DNA isolated from a large number of blood cells. However, recent improvements in sequencing technologies have enabled the accurate identification of post-zygotic mutations, including late-somatic ones, present in subsets of cells or even in single cells [
3]. In several neurodevelopmental disorders, studies combining different technologies across different tissues have provided evidence that post-zygotic or late-somatic mutations can cause disease [
70,
141]. Some of these mutations were detected in blood samples, indicating that they occurred early during development. One can assume that post-zygotic mutations, if detected in multiple tissues or with high allelic ratios in blood, might be present in a significant proportion of brain cells. Hence, such post-zygotic functional variants should carry a level of causality comparable to that of germline variants. It is much more complex to detect brain-specific somatic mutations. Autopsies of certain AOND cases sometimes reveal widespread neuropathological lesions, which would be more in line with germline causes of disease. A focal onset of disease, as seen in some types of primary progressive aphasia in the FTLD spectrum, on the other hand, suggests a role for late-somatic mutations. Overall, in neurodegenerative disorders eventually affecting a large part of the brain, one could assume that the causative mutations must be present in a high proportion of brain cells (neurons and/or glial cells). However, most AOND share mechanisms called seeding and spreading. These features, also referred to as prion-like properties, are conferred by proteins that can transfer their pathogenic state to wild-type, normally folded proteins (seeding) and then spread throughout the brain following neuronal connections (for review, see [
126]). This phenomenon was first studied for the prion protein itself. However, such properties are now being characterized for Tau, Aβ, TDP-43, α-synuclein, and even the Huntingtin protein, although these proteins are not associated with a spontaneous infectious propensity. By analogy with the external focal injection of pathological proteins in animal models, one can hypothesize that a small proportion of cells carrying a somatic variant resulting in the production of a pathogenic misfolded protein could be the source of a cerebral neurodegenerative disorder. Low-level mosaics should therefore also be considered.
Lessons from control brains and clues for the interpretation of somatic mutations in AOND
Recently, novel sequencing approaches provided critical knowledge on post-zygotic variation in healthy control tissues. The human genome is clearly not stable throughout life and post-zygotic variants may occur in any cell at any time (Fig.
1b). Replicating cells are particularly prone to somatic mutations, with highly replicating tissues such as the skin or hematopoietic tissue showing the highest somatic mutation burdens. External factors may favor the occurrence of mutations during the replication phase of the DNA, including mutagenic agents such as radiation or toxic agents. Aging seems to be the strongest risk factor influencing the accumulation of somatic mutations during clonal hematopoiesis [
2]. Below, we summarize the main points that we consider most important for the analysis and interpretation of somatic variants, drawing on studies of normal and diseased brains.
1. Post-mitotic neurons exhibit an unexpectedly high burden of post-zygotic mutations. The use of single-cell genomic approaches, including WGS, in neurons from non-diseased brains unveiled an unsuspected number of post-zygotic mutations, including about 1500 somatic SNVs per neuron [
98]. Some of these occurred during fetal development [
15,
75], but a higher burden was detected in post-mortem adults [
98]. Although the number of single neurons that have been sequenced remains limited (dozens or hundreds), the elevated rate of post-zygotic SNVs was quite unexpected, as neurons are long-lived post-mitotic cells and hence are not subject to DNA replication errors beyond those putatively acquired during the divisions of their progenitors. Single-neuron somatic SNVs were mostly associated with transcriptional activity, i.e., neuronal activity, in contrast to the variants shared by multiple cells in the brain or found in tissues with a high replication rate [
98]. In addition, neurons may accumulate late-somatic mutations during aging [
97]. Whatever the associated mechanisms, some of these events could result in the production of an abnormal/misfolded protein, which could represent a source of seeding and spreading in the brain, causing a neurodegenerative disease.
2. Bulk brain tissue is a combination of replicating and non-replicating, post-mitotic cells; sequencing genomic DNA isolated from bulk brain tissue does not allow the distinction between the cell types [
64]. The identification of very low-level mosaics from bulk brain tissue does not imply that different cell types carry the mutation of interest, nor that the mutated gene is expressed in the mutated cells and thus leads to the production of an abnormal protein.
3. Although access to brain tissue is mandatory to assess the somatic variant hypothesis thoroughly, studying other tissues may also be of interest. It has been shown that mutations present in more than 5–10% of brain cells were generally also detected outside the brain, in tissues derived from all three embryonic layers, including the ectoderm, suggesting that these mutations occurred during early phases of embryonic development [
98]. Although this finding still requires replication, it is a strong argument for assessing other tissues, including ectodermal tissues and blood, when cerebral tissue is not available. The fact that, in one individual, some neuronal cells shared more common cellular ancestors with cells from organs other than the brain was also a surprising but promising finding for deep-sequencing studies of non-brain tissues in AOND. However, any study performed on non-CNS tissue will face two limitations: the allelic ratios identified may not be representative of those in the brain, and there is no evidence that a putatively causal mutation with a low allelic ratio is actually present in the affected neuronal cells. Importantly, recent results of single-neuron WGS also raise the question of thresholds: if a putatively pathogenic mutation is found in a brain with an AOND, how many neurons should carry it, among the tens of billions of neurons in the brain [
14], to be significant enough to cause a widespread neurodegenerative disease?
4. Data from mouse models suggest that seeds from peripheral tissues like blood [
28] or intestine [
82] can spread into the brain and cause a neurodegenerative disorder. The presence of a pathogenic mutation in brain cells would hence not be required to cause such a disease, although one can assume that a significant amount of pathological seeds must be produced to trigger an AOND.
5. The study of somatic aneuploidy and CNVs still requires technological improvements. Aneuploidy has been studied in AD brains for decades (for review see [
9,
131,
145,
146]), mainly thanks to slice-based cytometry and fluorescence in situ hybridization (e.g., [
10,
106,
146]). The fraction of neurons containing extra chromosomes has been reported to be higher in AD brains than in controls [
106]. Conflicting aneuploidy rates, ranging from 1% or less to more than 50%, have been reported, as recently reviewed together with other inconsistencies (see [
145,
146]). Since the introduction of single-cell NGS, the fraction of aneuploid neurons has been re-estimated at 0 to 3% [
30,
80,
103,
172], and the proportion of neurons carrying post-zygotic CNVs has been estimated at several tens of percent. Given how challenging the assessment of germline CNVs by NGS is in general, CNVs called from single-cell WGS should also be interpreted with caution, and improved techniques are required, as proposed recently [
144]. In addition, it has been shown that DNA isolation protocols may significantly influence CNV detection [
109]. Similar to CNVs, a high burden of post-zygotic L1 insertions that can disrupt genes or deregulate their expression has been reported, although opinions differ on the number of post-zygotic L1 insertions in single neurons [
50,
171]; some results may require technical validation.
6. The study of brains from AOND cases implies the analysis of patients with advanced disease, which can be associated with secondary DNA damage. Increased aneuploidy rates and/or DNA content in AD neuronal cells could be related to errors during mitosis of neuronal progenitors. More likely, though, they are caused by re-entry into the cell cycle as a result of AD pathophysiological processes. This has been suggested by the identification of neurons with 4n DNA content and positive staining for Cyclin B1 [
106]. It is likely that, in the context of advanced neurotoxicity, such an observation is simply a consequence of neurodegeneration, not specific to a given AOND, rather than a causative mechanism [
9]. In theory, the study of so-called preclinical AD brains could help tackle this issue. However, a proportion of neurons in preclinical AD brains might already be at a final stage of the pathophysiology. Studies of presymptomatic patients carrying
APP,
PSEN1, or
PSEN2 mutations showed evidence of neuronal damage years before the first clinical signs, similar to what has been observed in animal models [
21,
79,
131]. Although a few studies have had access to samples from patients with preclinical or early-stage AD, most research is performed on end-stage AD, as the disease was the main condition leading to the patient’s death. While many neurons have already died, many others have undergone stress and toxicity for years, starting before the first symptoms appeared. Some of them have accumulated DNA damage, including somatic SNVs [
97]. Oxidative stress and microtubule dysfunction in AD neurons are among the causes of secondary DNA damage. In FTLD and ALS, it has been shown that secondary DNA damage can participate in the pathophysiological processes in
C9ORF72 and
FUS mutation carriers [
51,
110]. Genetic results from AOND brains should therefore be interpreted with great caution. Of note, similar caution applies to the interpretation of somatic mitochondrial DNA mutations, which face the same issue of possible secondary mutations induced by neurodegeneration. In addition, caution is required when interpreting negative findings in sequencing studies performed on brain tissue: the mutated cells may also be the most vulnerable ones, so mutations may be undetectable because the cells carrying them have died.
Taken together, somatic mutations in the brain may result from (1) early embryonic events, (2) replication errors in neuronal progenitors during neurodevelopment, (3) replication errors in replicating brain cells at any stage of life, (4) transcriptional activity in post-mitotic neurons, and (5) DNA damage in the context of cellular stress. Although advances in genomic and single-cell technologies have provided novel information and promising hypotheses, interpreting sequencing data obtained from brains with AOND will, in the near future, be an even bigger challenge than the technology itself.
Somatic mutations in patients with AOND
Given the role of germline duplications of
APP and
SNCA, respectively, in autosomal-dominant EOAD and Parkinson’s disease, CNV studies have focused on these loci as well as on chromosomal abnormalities. There is still debate on whether AD brains are enriched in neurons carrying extra copies of chromosome 21 containing the
APP gene. Interestingly, Bushman et al. [
29] recently reported increased copy numbers of the
APP locus itself in AD brains. So far, however, this exciting result has not been replicated. Even more recently, a study of nigral dopaminergic neurons, the degeneration of which causes Parkinson’s disease, revealed an average proportion of dopaminergic neurons with gains of
SNCA copies in each nigra of 0.78% in Parkinson’s disease patients versus 0.45% in controls [
105]. Such enrichment was not found in non-dopaminergic neurons. Overall, among the 40 patients, 31 (77.5%) had at least one dopaminergic neuron showing a gain of
SNCA copies as compared to 10/25 (40%) of the controls. These results suggest that late-somatic copy number gain of
SNCA is not a rare event and that a significant enrichment would be required to trigger the disease. The fact that picomolar concentrations of
SNCA oligomers can induce disease-related pathways in cells in vitro seems to be at odds with the latter study [
68]. If replicated, these results obtained on nigral neurons would call into question the hypothesis that low-level mosaics alone are sufficient to trigger diffuse neurodegenerative disease in vivo. Other brain regions remain to be studied, as do other mutations and their putative functional consequences.
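As a rough illustration of the per-individual comparison quoted above (31 of 40 patients versus 10 of 25 controls with at least one dopaminergic neuron showing a gain), the following sketch recomputes the two percentages and applies a two-sided Fisher's exact test to the resulting 2×2 table. The test is our illustrative choice and is not an analysis reported in the cited study; the function names are hypothetical, and the calculation ignores the number of neurons examined per individual.

```python
from math import comb


def carrier_percentage(carriers, total):
    """Percentage of individuals with at least one neuron showing a gain."""
    return 100.0 * carriers / total


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance so that ties with the observed table count.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Counts quoted in the text: carriers / non-carriers among patients and controls.
patients_with_gain, patients_total = 31, 40
controls_with_gain, controls_total = 10, 25

print(f"patients: {carrier_percentage(patients_with_gain, patients_total):.1f}%")
print(f"controls: {carrier_percentage(controls_with_gain, controls_total):.1f}%")
p_value = fisher_exact_two_sided(
    patients_with_gain, patients_total - patients_with_gain,
    controls_with_gain, controls_total - controls_with_gain,
)
print(f"two-sided Fisher exact p-value: {p_value:.4f}")
```

Such a test only addresses whether carrying at least one neuron with a gain is more frequent among patients; it says nothing about whether the small fraction of affected neurons is sufficient to trigger disease, which is the question raised in the text.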
In addition to CNVs affecting known disease genes, post-zygotic CNVs may also affect novel Parkinson’s disease genes. In monozygotic twin pairs discordant for Parkinson’s disease, a few novel post-zygotic CNVs have been identified in the affected twins [
27]. Further research is needed to confidently link these genes to Parkinson’s disease.
The presence of brain-specific single nucleotide mutations was assessed as early as 1988, before the identification of the first causative genes of autosomal-dominant EOAD. In an exploratory study, the sequence encoding the Aβ peptide was analyzed in cDNA isolated from three brains with sporadic AD but no mutation was found [
180]. The hypothesis that post-zygotic mutations could explain sporadic AD was reassessed after identification of
APP,
PSEN1, and
PSEN2 germline pathogenic mutations in autosomal-dominant families. The analysis of DNA isolated from bulk brain pieces of 99 patients with sporadic AD revealed a
PSEN1 mutation that was eventually confirmed to be present in the germline [
139]. This hypothesis was also assessed in Parkinson’s disease and ALS before the era of NGS, with negative results [
121,
135]. Recently, WES was performed in hundreds of brains from patients with different types of AOND. While pathogenic mutations were detected in autosomal-dominant genes, parental DNA was not available for testing and the average depth of sequence coverage did not allow for the detection of low-level somatic mutations [
76].
Deep-sequencing or blood–brain duo strategies have been applied only recently in sporadic AD. In a first study, a targeted deep-sequencing approach was used to analyze the genomic loci of
APP,
PSEN1,
PSEN2 and
MAPT in DNA isolated from the entorhinal cortex of 72 patients with sporadic AD and 58 controls [
152]. Custom capture and deep sequencing of the genomic regions of these four genes revealed 107 candidate post-zygotic mutations but only 3 could be confirmed by amplicon-based deep sequencing: two novel
MAPT missense mutations of unknown significance in sporadic AD patients (variant allele frequencies of 1.0% and 1.1%) and one known
PSEN2 likely benign missense mutation (variant allele frequency of 1.6–5.7%) in a control. Of note, among the 41 patients with an available age at onset, the median age at onset was 78 years (range: 46–92) and among the 31 other patients, the median age at death was 79 years (range: 57–96; the youngest carried a pathogenic
PSEN1 germline variant p.H163R), suggesting that the majority had a late disease onset. Another recent study focused on more technical aspects in the context of AD, but did not provide results directly relevant to AD itself [
57]. In a study including 17 sporadic AD patients, 2 controls, and 2 patients with vascular dementia, WES (mean depth of coverage: 60.8x) was performed on DNA isolated from blood as well as the hippocampus [
122]. This strategy did not allow the identification of low-level mosaics and no putatively pathogenic brain-specific mutation was identified. The average age at death was 86.8 years (range 73–94), suggesting that most patients, if not all, had late-onset AD.
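To illustrate why sequencing at a mean depth of about 60x is poorly suited to low-level mosaics, the following minimal sketch models the number of reads carrying a variant as a binomial draw. It assumes independent reads and no sequencing errors, and it uses a hypothetical calling rule of at least three variant reads; the function name, the threshold, and the deeper coverage values are illustrative assumptions and are not taken from the cited studies.

```python
from math import comb


def detection_probability(depth, allelic_ratio, min_alt_reads=3):
    """Probability of observing at least `min_alt_reads` variant reads.

    Assumes each of the `depth` reads independently carries the variant
    with probability `allelic_ratio` (simple binomial sampling).
    """
    p_too_few = sum(
        comb(depth, k) * allelic_ratio**k * (1 - allelic_ratio) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_too_few


# Depth 61 approximates the 60.8x mean coverage quoted in the text;
# 1000 and 5000 are hypothetical deep-sequencing depths for comparison.
for depth in (61, 1000, 5000):
    for allelic_ratio in (0.01, 0.05):
        p = detection_probability(depth, allelic_ratio)
        print(f"depth {depth:>4}, allelic ratio {allelic_ratio:.0%}: "
              f"P(detect) = {p:.3f}")
```

Under these simplifying assumptions, a variant with a 1% allelic ratio would rarely reach three supporting reads at about 60x. Real pipelines must additionally separate true variants from sequencing and amplification errors, so actual sensitivity is lower than this idealized figure.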
Hypothesizing that, similar to germline DNM, post-zygotic mutations (including late-somatic ones) causing sporadic AOND may be associated with early-onset forms, we recently performed a targeted deep-sequencing screen of 11 genes in 445 sporadic AD patients (355 blood samples, 100 brain samples), >80% of whom had an early onset [
112]. We used single molecule Molecular Inversion Probes (smMIPs) capture followed by deep sequencing and validation with independent ultra-deep sequencing, allowing for very high sensitivity and specificity. We identified nine post-zygotic mutations with allelic ratios ranging from 0.2% to 10.8%. Two of these mutations were predicted to alter the function of SORL1, which is currently considered a strong risk factor for EOAD. However, no predicted pathogenic post-zygotic mutations in known autosomal-dominant genes could be identified in this large sample.
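An allelic ratio can be translated into an approximate proportion of mutated cells. The sketch below assumes a heterozygous mutation (one mutant copy per mutated cell) in a diploid, copy-number-neutral region of an autosome, in which case the fraction of mutated cells is about twice the allelic ratio. The function name and its parameters are hypothetical and serve only as an illustration.

```python
def mutated_cell_fraction(allelic_ratio, mutant_copies=1, ploidy=2):
    """Approximate fraction of cells carrying a mosaic mutation.

    Assumes every cell has `ploidy` copies of the locus and every mutated
    cell carries `mutant_copies` mutant copies, so that
    allelic_ratio = cell_fraction * mutant_copies / ploidy.
    """
    if not 0.0 <= allelic_ratio <= 1.0:
        raise ValueError("allelic_ratio must be between 0 and 1")
    # Cap at 1.0 because sampling noise can push the estimate above 100%.
    return min(allelic_ratio * ploidy / mutant_copies, 1.0)


# Lowest and highest allelic ratios quoted in the text.
for allelic_ratio in (0.002, 0.108):
    fraction = mutated_cell_fraction(allelic_ratio)
    print(f"allelic ratio {allelic_ratio:.1%} -> about {fraction:.1%} of cells")
```

Under this assumption, the reported allelic ratios of 0.2% and 10.8% would correspond to roughly 0.4% and 21.6% of the cells in the sampled tissue. The estimate applies only to the tissue that was sequenced and does not hold for loci affected by copy number changes.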
Even more recently, 102 genes were screened by targeted capture followed by deep sequencing in 173 samples from 54 human brains [
75]. Post-zygotic variants were validated using a technology based on unique molecular identifiers (UMI). Of the individuals studied, 20 presented with AD and 20 had Parkinson’s disease or dementia with Lewy bodies. Despite the detection of 62 post-zygotic variants, no putatively pathogenic variant was identified.
The preliminary results of the above-mentioned studies do not immediately point to a significant role for post-zygotic mutations in known disease genes in sporadic AOND. This may change with improvements in capture and sequencing technologies (including single-cell analysis, a tremendously promising field that is only just starting), the analysis of many more brain samples, especially from patients with an early onset of disease, and the more accurate detection of mosaic structural variations. Taken together, the positive and negative results obtained from control and diseased brains show that the assessment of the somatic variant hypothesis in AOND has opened many novel questions. Among them are the minimum amount of pathological seeds, the timing of their occurrence, and the regions where these seeds must appear to be sufficient to trigger neurodegenerative diseases. Experiments in animal models may help researchers to answer some of these questions, combined with the application of ultra-sensitive sequencing of multiple brain regions in AOND patients and controls.