Background
Autism spectrum disorder (ASD) is an early-onset neurodevelopmental condition with an estimated prevalence of ~1 in 68 [1]. ASD is expressed across a spectrum of severity in two core phenotypic domains: persistent deficits in social interaction and communication, and restricted, repetitive behaviors and interests. ASD is highly heritable, with an estimated heritability of 64–91% based on a recent meta-analysis [2]. The genetic basis of ASD, however, is complicated by locus heterogeneity for both common allele and rare variant effects. Although common variants in aggregate contribute a larger proportion (~50%) of liability [3], genome-wide association studies (GWAS) with thousands of subjects have not found consistent, strongly associated individual common variants [4–9]. Rare de novo variants, including both copy number variants (CNVs) and single nucleotide variants (SNVs), play a significant role in ASD liability [10]. To date, dozens of genes harboring de novo CNVs and SNVs meeting genome-wide significance have been identified, and corresponding functional pathways and biological processes have emerged from analysis of these variants [10]. Despite advances in identifying ASD risk loci, major hurdles remain, since rare variants account for only a minority of cases, and effect sizes for common variants necessitate GWAS sample sizes many times those currently available. Data indicate that a thousand or more genes may contribute to ASD liability [11].
In addition to larger samples, another strategy to tackle heterogeneity leverages meaningful endophenotypes or biomarkers that demonstrate heritability [12, 13]. The hypothesis that endophenotypes reflect variation in a subset of the broader set of disease risk genes leads to the notion that the subgroup of cases sharing the endophenotype is more genetically homogeneous. Thus, gene discovery in such a subgroup affords greater power than in a similarly sized group drawn from the overall disease population and, in the case of molecular traits, may provide a more direct path to functional mechanisms. Biomarkers and endophenotypes in ASD have drawn interest given their potential to facilitate earlier diagnosis and better prediction of prognosis or treatment response [14].
Hyperserotonemia, or elevated serotonin (5-hydroxytryptamine or 5-HT) in whole blood, has been one of the most consistent quantitative traits and biomarkers in ASD since its identification in 1961 [15–18]. In particular, these studies reported a significantly higher blood 5-HT level in about one third of ASD subjects, compared with typically developing controls. The elevated 5-HT level is observed in ASD but not in subjects with unspecified intellectual disability [19]. Whole blood 5-HT levels show intermediate elevation in first-degree relatives of hyperserotonemic probands [20–23]. Hyperserotonemia in ASD shows evidence for heritability, and whole blood 5-HT also exhibited high narrow-sense and broad-sense heritability (0.51 and 1.0, respectively) in a Hutterite population sample [24]. Although the mechanism underlying the elevation of serotonin levels in ASD remains unclear, several lines of investigation point to a role for serotonin in ASD etiology [25–29]. In blood, more than 99% of serotonin is stored in platelets; after synthesis in enterochromaffin cells of the gut, it is taken up from the enterohepatic circulation by the serotonin transporter (SERT), encoded by SLC6A4. Linkage studies in ASD have implicated the 17q11.2 region harboring SLC6A4 [30–32]. Hypothesizing rare variants in the absence of significant allelic association at SLC6A4 led to the discovery of multiple functional coding variants [30]. In particular, the association of SERT Ala56 was supported by evidence that mice carrying the variant displayed alterations in social function, communication, and repetitive behavior, as well as elevated whole blood 5-HT [33]. These findings collectively support hyperserotonemia as a powerful endophenotype for dissecting the genetic etiology of ASD.
In this study, we carried out whole exome sequencing (WES) in a collection of ASD parent-proband trios with 5-HT measurements collected through an Autism Center of Excellence (ACE) study to search for genetic variants implicated in ASD using serotonin as an endophenotype. Given that elevated serotonin was observed in ASD probands compared with their parents, we hypothesized that de novo variants (DNVs) and recessively acting variants (RAVs) play a key role in predisposition to hyperserotonemia and ASD. DNVs observed in probands, but not parents, which disrupt genes involved in 5-HT and ASD, should affect 5-HT levels only in probands; similarly, rare risk alleles acting in a recessive manner (i.e., RAVs) that are transmitted from parents to probands may lead to elevated 5-HT levels in those probands. Both DNVs and RAVs have been implicated in ASD [34]; however, the allelic architecture of hyperserotonemia in autism is unknown. We thus aimed to use this unique endophenotype in ACE trios to identify genes involved in both traits, an approach that we hypothesize effectively reduces genetic heterogeneity. Moreover, we predict that positive findings will shed light on which serotonin-related functional pathways are involved in ASD. Corresponding genes and proteins may offer insights into dysregulated CNS development and point to therapeutic strategies for ASD symptoms.
Discussion
ASD is a genetically heterogeneous disorder with estimates of 1000 or more genes involved in disease etiology. This heterogeneity poses great challenges for identifying individually significant risk loci. The challenge is particularly pronounced for DNVs, as mutation rates are too low for independent de novo mutations in the same gene to be readily observed in a given cohort. LoF DNVs in ASD probands, although rare, are likely to have large effects on ASD risk and are therefore more likely to point to risk genes. Accordingly, when recurrent LoF DNVs in the same gene are seen in a cohort of probands, this is a strong indicator of that gene’s contribution to ASD risk. In this study, we had a very limited sample size compared with consortium-level datasets and, unsurprisingly, did not observe recurrent (independent) DNVs in any gene within our data. Instead, we combined our DNVs with those from previous studies for recurrence analysis. We identified one new recurrent gene, USP15, as a novel ASD candidate, and provided further supporting evidence for two other known recurrent DNV genes (FOXP1 and KDM5B). FOXP1 has been linked to several cognitive disorders, and its deletion causes autism-like behaviors in mice [72]. KDM5B harbored LoF DNVs in each of two other study cohorts (two in SSC, one in ASC), and probands with KDM5B LoF DNVs were shown to have lower non-verbal IQ [64]. We note that LoF DNVs in KDM5B were also observed in two unaffected (unrelated) siblings in the SSC, suggesting incomplete penetrance. USP15 acts as a deubiquitinating enzyme on transforming growth factor-beta (TGF-β) and bone morphogenetic protein (BMP) stimulated R-SMADs (receptor-regulated intracellular proteins that transduce extracellular signals). We note that both TGF-β and BMP signaling are involved in differentiation of serotonergic neurons [73], but the role of USP15 in ASD is unclear. With accumulating ASD exome and whole genome sequencing data being made public, leveraging previously reported DNVs is an effective strategy for clearly establishing the role of novel risk genes in ASD.
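The rarity of within-cohort recurrence can be illustrated with a simple Poisson calculation. The sketch below is not the recurrence analysis used in this study; it assumes a hypothetical per-gene, per-haplotype LoF mutation rate (`mu_per_gene`) and hypothetical cohort sizes, and only shows how the chance of seeing two or more independent de novo hits in one gene scales with the number of trios.

```python
import math


def prob_recurrent_dnv(n_trios: int, mu_per_gene: float, min_hits: int = 2) -> float:
    """Probability of at least `min_hits` independent de novo hits in one gene.

    Assumes hits follow a Poisson distribution with mean 2 * n_trios * mu_per_gene
    (two parental haplotypes per proband). Both inputs are illustrative.
    """
    lam = 2 * n_trios * mu_per_gene
    # Sum P(X = k) for k below the threshold, then take the complement.
    p_fewer = sum(
        math.exp(-lam) * lam ** k / math.factorial(k) for k in range(min_hits)
    )
    return 1.0 - p_fewer


# Hypothetical values: a small cohort versus a consortium-scale cohort,
# with an assumed LoF mutation rate of 2e-6 per gene per haplotype.
small = prob_recurrent_dnv(n_trios=100, mu_per_gene=2e-6)
large = prob_recurrent_dnv(n_trios=5000, mu_per_gene=2e-6)
print(f"P(>=2 hits), 100 trios:  {small:.3g}")
print(f"P(>=2 hits), 5000 trios: {large:.3g}")
```

Under these assumed inputs the probability grows roughly with the square of the cohort size, which is why pooling DNVs across published cohorts makes recurrence detectable when a single small cohort does not.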
In this study, we implemented several approaches to tackle heterogeneity. First, we separated established ASD genes into network modules that likely represent more homogeneous functions. Second, we leveraged 5-HT as an endophenotype. Genetic variants implicated in both hyperserotonemia and ASD are expected to be enriched in the subset of probands with hyperserotonemia, giving us increased power to detect ASD genes that function through regulating serotonin levels. Using this strategy in our data, we were able to identify novel candidate ASD genes not identified in previous large-scale studies, which may poorly represent “hyperserotonemic ASD” risk factors. Although the significance is not striking in NGSEA, the non-overlapping LoF and Mis-D2 genes in the High-5HT group show enrichment in the same module (DAWN-1). In contrast, the genes harboring functional DNVs in the Normal-5HT group did not uncover new ASD genes in NGSEA, probably because the majority of ASD patients have normal 5-HT, so that genes identified in previous large-scale studies are already more likely to represent the genes identified in the Normal-5HT probands studied here.
In this study, we focused on DNVs and RAVs, two mechanisms that we hypothesized are involved in serotonin-related ASD genetic etiology. Signals due to DNVs are noticeably stronger than those due to RAVs, presumably due in part to the larger effect sizes for DNVs and in part to the need to restrict analysis of RAVs to those from European subjects. For RAVs we observed significant enrichment with the TADA-1 list, which was derived by TADA’s joint modeling of both de novo and inherited variants in a previous study [11]. We analyzed homozygotes and CHs separately, and it is the CHs, not homozygotes, that showed significant enrichment. In particular, all of the RAV genes overlapping with the TADA list are CHs, among which two genes (ETFB and RELN) are in the High-5HT group and two (CACNA2D3, a calcium channel gene, and KMT2C, a lysine methyltransferase gene) are in the Normal-5HT group.
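As an illustration of how overlap between a set of observed genes and a reference list can be tested, the sketch below uses a one-sided hypergeometric test. This is a generic approach and not necessarily the enrichment test applied in this study; the function name and all counts (background size, reference list size, number of observed genes, overlap) are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(
    n_background: int, n_reference: int, n_observed: int, n_overlap: int
) -> float:
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    X is the number of reference-list genes among `n_observed` genes drawn
    without replacement from `n_background` genes, of which `n_reference`
    belong to the reference list.
    """
    max_k = min(n_reference, n_observed)
    total = comb(n_background, n_observed)
    tail = sum(
        comb(n_reference, k) * comb(n_background - n_reference, n_observed - k)
        for k in range(n_overlap, max_k + 1)
    )
    return tail / total


# Hypothetical counts: 18,000 background genes, a 100-gene reference list,
# 40 observed genes, 4 of which fall in the reference list.
p_value = hypergeom_enrichment_p(18000, 100, 40, 4)
print(f"nominal enrichment p = {p_value:.3g}")
```

A simple test of this kind treats all genes as equally likely to carry a qualifying variant; in practice, gene length and mutability differ, so a permutation or mutation-rate-aware model may be preferable.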
We note that the sample size of our study is small given the extensive genetic heterogeneity of ASD. For all association and enrichment analyses, we reported nominal p values without correcting for multiple testing. Varying degrees of dependency across tests make adjustment for multiple comparisons a challenging problem, even when simulations are used to estimate significance empirically. Given the overall enrichment patterns in biologically relevant gene sets and pathways, our analyses provide promising candidates for further validation in large-scale studies.
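For readers who wish to see how nominal p values would change under standard corrections, the following sketch implements Bonferroni and Benjamini-Hochberg adjustment. It was not applied in this study, the example p values are hypothetical, and the Benjamini-Hochberg procedure assumes independent or positively dependent tests, an assumption that may not hold for the correlated analyses described above.

```python
from typing import List, Sequence


def bonferroni(pvals: Sequence[float]) -> List[float]:
    """Bonferroni-adjusted p values: multiply by the number of tests, cap at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical nominal p values from four enrichment tests.
nominal = [0.01, 0.04, 0.03, 0.20]
bonf = bonferroni(nominal)
bh = benjamini_hochberg(nominal)
print("Bonferroni:", bonf)
print("Benjamini-Hochberg:", bh)
```

Bonferroni controls the family-wise error rate and is valid under arbitrary dependence but conservative; Benjamini-Hochberg controls the false discovery rate and is less stringent.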
Acknowledgements
Expert technical assistance was provided by Kathleen Hennessy, Kelley Moore, and Zengping Hao. We would like to thank the individuals with ASD and their family members for their participation.