Background
Rare genetic variants can contribute to the etiology of common neurodevelopmental disorders (NDDs) such as autism spectrum disorder (ASD) [
1‐
7], attention-deficit/hyperactivity disorder (ADHD) [
8,
9], intellectual disability (ID) [
10‐
12], and schizophrenia [
13‐
17]. Marked genetic heterogeneity contributed to the recommendation that genome-wide chromosomal microarray (CMA) be a first-tier genetic test for individuals with selected NDDs [
10,
14]. CMA, along with whole-genome sequencing (WGS) technology, has allowed increasing detection of genomic alterations such as copy number variations (CNVs) affecting important developmental genes [
10,
18‐
20]. These variants are difficult to adjudicate through parental testing alone, as even established “genomic disorders” can demonstrate highly variable neurodevelopmental and neuropsychiatric expression [
21,
22]. Many are inherited from a putatively unaffected parent, who may have sub-clinical traits or a history of symptoms suggestive of an undiagnosed psychiatric condition [
23]. WGS represents a comprehensive platform for detection of coding and non-coding sequence-level, CNV, structural, and mitochondrial variation [
1,
24‐
26].
There are two main contemporary strategies for discovering NDD genes based on CNVs [
18,
27]. One begins with a unique variant in a proband and involves a traditional family-based study design with deep phenotyping and curation of all genomic variants. The other involves searching for genes or loci that are overrepresented in large datasets of rare genetic variation assembled through clinical testing or research consortia. In this study, we combined these “depth” and “breadth” approaches to characterize a small non-recurrent CNV of uncertain clinical significance. We sequenced the genomes of six individuals from a single multiplex NDD family and propose that a novel 15q21 microdeletion is the likely genetic lesion involved, because of haploinsufficiency of a synaptic scaffolding protein encoded by
DMXL2.
Discussion
We employed WGS and a multifaceted strategy to characterize a CNV of uncertain clinical significance. Diverse and converging lines of evidence suggest that haploinsufficiency of
DMXL2 is a risk factor for NDDs. The available data indicate variable expressivity and possibly incomplete penetrance (or at least age-related penetrance). Families exhibiting an apparently heritable but broad phenotype of NDD symptoms, including members with and without a clinical diagnosis of ASD, can expand our knowledge of the genetically related spectrum of disease. One of the most pressing challenges in the field is to understand the typically high degree of variable neuropsychiatric expression associated with risk variants [
61,
62]. The range of symptomatology observed in this family is reminiscent of what is often observed with genomic disorders and non-recurrent large CNVs, and it further highlights the impact of rare inherited genetic variants [
18,
22,
63,
64]. In some cases, there is emerging evidence for additional deleterious variants elsewhere in the genome that may act as modifiers [
65,
66]. We also identified de novo and inherited sequence variants of potential relevance in this family, including a rare missense change in
GRIK5. The latter was not as compelling a candidate variant as the microdeletion overlapping
DMXL2 because of (i) the predicted less-severe nature of the genetic lesion, (ii) the current absence of overt functional or model organism data at the gene level, and (iii) our inability to adequately replicate the finding in the population-scale NDD cohorts. Nonetheless, it could still be a contributor to the risk for NDD in this family. A limitation of our study was that the relatively small nuclear families precluded the identification of an individual with one, but not both, of the variants.
Our findings provide a clinical impetus to now develop and test functional hypotheses regarding disease mechanism(s). The
Dmxl2 gene is best studied in the gonadotropin-releasing hormone neurons [
42,
67], where low expression in mice impedes normal dendritic development [
67]. However, the phenotypes observed in knockout mice are not entirely attributable to deficient
Dmxl2 in that neuronal cell population [
67].
Dmxl2+/− mice also demonstrate neuroanatomical differences in the corpus callosum [
68]. At least four of ten experimentally proven protein interactors of DMXL2 [
69] are encoded by established or candidate genes for NDDs:
CYFIP2 [
70,
71],
DYNC1H1 [
72,
73],
MATR3 [
74], and
NCKAP1 [
5,
75,
76]. Both CYFIP2 and NCKAP1 shape the formation of dendritic spines via the WAVE actin-remodeling complex [
77,
78]. DMXL2 protein dosage insufficiency may therefore result in structural (e.g., abnormal dendritic spine morphology) and/or functional (e.g., impaired transmission) synaptic consequences in humans, as have been observed in other genetic forms of ID, ASD, and schizophrenia [
79,
80].
It remains difficult in clinical practice to interpret novel inherited CNVs, which are often labeled variants of uncertain clinical significance [
49]. Incomplete penetrance and epistasis are likely underappreciated. Most approaches to CNV adjudication focus on the overlapped genes and disregard the remainder of the genome [
81], in part because CMA is incapable of identifying smaller and balanced genetic variants. In contrast, WGS is more comprehensive in detection [
1,
24‐
26], but it sometimes also reveals complicating data for clinical interpretation, as was the case with the
GRIK5 variant. Updated guidelines that take sequence variants into account are required for interpreting CNVs [
81]. Specific consideration will need to be given to how to determine whether two or more variants are major contributors to disease within a specific individual. The risk profile from common variants with a polygenic effect may also become clinically relevant in time, even in those individuals with a monogenic diagnosis or genomic disorder [
82,
83].
Variant inheritance patterns in multigenerational pedigrees can be highly informative, with the caveat that customary assumptions (e.g., de novo suggests pathogenic, and inherited from a putatively unaffected parent suggests benign) are imperfect [
84‐
86]. The availability of samples and detailed phenotype data from three generations of a family, including offspring and parents of an individual with ASD or other NDD phenotypes, is uncommon [
23]. There is substantial evidence that different rare disease-causing variants can segregate within the same family [
1,
3]. Our approach here of examining multiple rare penetrant variants across generations might reveal new genes for NDD. This may especially be the case for genes involved in higher-functioning forms of ASD, perhaps including
DMXL2. Our study design also allowed for the observation that the proband’s de novo variant in
ZC3H14 was not inherited by her daughter with NDD. In contrast, both the 15q21 deletion and the
GRIK5 variant did segregate with the apparent NDD phenotypes, albeit in a simple pedigree where the a priori probability was
p = 0.125 for any single variant [
87].
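As a rough illustration of how a chance co-segregation probability of this size can arise, the following sketch assumes that a heterozygous autosomal variant must pass through n independent parent-to-child transmissions, each with Mendelian probability 1/2, to be present in all carriers. The value n = 3 is an assumption chosen here to match the reported figure; it is not stated in the text, and the exact derivation for this pedigree follows the cited method.

```latex
% Assumption: n independent Mendelian transmissions, each with probability 1/2.
% n = 3 is illustrative and chosen to reproduce the reported value.
\[
  p \;=\; \left(\tfrac{1}{2}\right)^{n}
    \;=\; \left(\tfrac{1}{2}\right)^{3}
    \;=\; 0.125
\]
```

Under this assumption, one in eight rare variants carried by the transmitting ancestor would be expected to track with the phenotype by chance alone, which is why segregation in a small pedigree provides only limited support for any single variant.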
Acknowledgements
The authors thank the family for their participation, and staff and trainees at The Centre for Applied Genomics. The authors wish to acknowledge the resources of MSSNG (www.mss.ng), Autism Speaks and The Centre for Applied Genomics at The Hospital for Sick Children, Toronto, Canada. They thank the participating families for their time and contributions to this database, as well as the generosity of the donors who supported this program. This study also makes use of data generated by the DECIPHER community. A full list of centres who contributed to the generation of the data is available from
http://decipher.sanger.ac.uk and via email from
decipher@sanger.ac.uk. Funding for the project was provided by the Wellcome Trust.
Additional acknowledgements regarding the control dataset are as follows:
(1) Funding support for the Study of Addiction: Genetics and Environment (SAGE) was provided through the NIH Genes, Environment and Health Initiative [GEI] (U01 HG004422). SAGE is one of the genome-wide association studies funded as part of the Gene Environment Association Studies (GENEVA) under GEI. Assistance with phenotype harmonization and genotype cleaning, as well as with general study coordination, was provided by the GENEVA Coordinating Center (U01 HG004446). Assistance with data cleaning was provided by the National Center for Biotechnology Information. Support for collection of datasets and samples was provided by the Collaborative Study on the Genetics of Alcoholism (COGA; U10 AA008401), the Collaborative Genetic Study of Nicotine Dependence (COGEND; P01 CA089392), and the Family Study of Cocaine Dependence (FSCD; R01 DA013423). Funding support for genotyping, which was performed at the Johns Hopkins University Center for Inherited Disease Research, was provided by the NIH GEI (U01HG004438), the National Institute on Alcohol Abuse and Alcoholism, the National Institute on Drug Abuse, and the NIH contract “High throughput genotyping for studying the genetic contributions to human disease” (HHSN268200782096C). The datasets used for the analyses described in this manuscript were obtained from dbGaP at
https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000092.v1.p1 through dbGaP accession number phs000092.v1.p1.
(2) The authors acknowledge the contribution of data from Genetic Architecture of Smoking and Smoking Cessation accessed through dbGaP. Funding support for genotyping, which was performed at the Center for Inherited Disease Research (CIDR), was provided by 1 X01 HG005274-01. CIDR is fully funded through a federal contract from the National Institutes of Health to The Johns Hopkins University, contract number HHSN268200782096C. Assistance with genotype cleaning, as well as with general study coordination, was provided by the Gene Environment Association Studies (GENEVA) Coordinating Center (U01 HG004446). Funding support for collection of datasets and samples was provided by the Collaborative Genetic Study of Nicotine Dependence (COGEND; P01 CA089392) and the University of Wisconsin Transdisciplinary Tobacco Use Research Center (P50 DA019706, P50 CA084724). The datasets used for the analyses described in this manuscript were obtained from dbGaP at
https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000404.v1.p1 through dbGaP accession number phs000404.v1.p1.
(3) A dataset used for the analyses described in this manuscript was obtained from the NEI Refractive Error Collaboration (NEIREC). Funding support for NEIREC was provided by the National Eye Institute. We would like to thank NEIREC participants and the NEIREC Research Group for their valuable contribution to this research. The datasets used for the analyses described in this manuscript were obtained from dbGaP at
https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000303.v1.p1 through dbGaP accession number phs000303.v1.p1.
(4) Funding support for the “CIDR Visceral Adiposity Study” was provided through the Division of Aging Biology and the Division of Geriatrics and Clinical Gerontology, NIA. The CIDR Visceral Adiposity Study includes a genome-wide association study funded as part of the Division of Aging Biology and the Division of Geriatrics and Clinical Gerontology, NIA. Assistance with phenotype harmonization and genotype cleaning, as well as with general study coordination, was provided by Health ABC Study Investigators. The datasets used for the analyses described in this manuscript were obtained from dbGaP at
https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000169.v1.p1 through dbGaP accession number phs000169.v1.p1.