Introduction

Autism spectrum disorders (ASDs) are a group of early-onset complex pervasive developmental disorders (PDDs) characterized by impairments in communication, social interaction, and creative or imaginative play. They are associated with several co-morbidities including intellectual disability (ID; IQ<70; 65% of cases),1 epilepsy (30% of cases),2 and physical anomalies representing syndromic or complex forms of ASD (20% of cases).3 Complex ASDs can be the result of microdeletions or microduplications of contiguous genes for almost every chromosome.4 Recurrent changes present the best opportunity for mapping autism susceptibility loci. For example, approximately 1–4% of individuals with autism have a duplication of chromosome 15q11–q13.4, 5, 6 In other cases, both chromosome rearrangements and point mutations in one of the genes in the rearrangement lead to the same neurological disorder. For example, Smith–Magenis syndrome, a disorder associated with autism,7 is caused by a 3.7 Mb deletion in chromosome 17p11.2 in 90% of cases, but can also arise through mutations in the retinoic acid-induced 1 gene (RAI1; OMIM: 607642), located within the commonly deleted region.8

Our research aims to identify novel candidate gene regions for ASDs and determine which of the genes located within those regions contribute to autistic symptoms. We recently described two unrelated individuals with overlapping de novo interstitial deletions at 2p15–p16.1.9 Both presented with autism, moderate to severe ID, microcephaly, cortical migration defects, renal anomalies, digital camptodactyly, visual and communication deficits, and a distinctive pattern of craniofacial features. Four more subjects were recently reported to have a microdeletion in the 2p15–p16.1 region,10, 11, 12, 13 and similar phenotypic features; one of them had documented autism.13

We propose that any chromosomal region that has two or more cases with rare copy number variants (CNVs) that are found in association with ASD in at least a portion of cases, and not in controls, harbors a gene or genes that, when function is compromised, lead to autistic behaviors. Our approach is to screen for additional subjects with similar deletions or duplications and then select the best candidate gene(s) in the region, and test for association in large cohorts of individuals with ASDs. In this study, we used real-time quantitative PCR (qPCR) to refine the deletion breakpoints and screen a large number of individuals for additional deletion cases. We chose two candidate genes from the region to test for association in a large cohort of ASD families. The exportin 1 gene (XPO1) maps to the minimal critical deletion region and encodes the karyopherin exportin 1, also known as the chromosome region maintenance 1, or CRM1 protein. CRM1 is involved in nuclear export signal (NES) -dependent nuclear export and has been implicated in mitosis.14 It is primarily involved in the export of ribosomal RNAs, spliceosomal U small nuclear RNAs (U snRNAs), signal recognition particles (SRPs) and several specific mRNAs.14 The second gene is the nearby orthodenticle homolog 1 gene (OTX1), a homeobox gene. Homeobox genes are transcription factors that are expressed during specific time-periods during ontogeny. OTX1 is strongly expressed in the proliferative layers of the neocortex in the forebrain15 and the sense organs.16 Although it is only deleted in one of the individuals with the 2p15–p16.1 deletion, the phenotype fits with the roles of OTX1, and thus, there could be a position effect on the expression of this gene because of the nearby breakpoints. Our association findings point to both XPO1 and OTX1 as contributing to susceptibility to autistic behaviors.

Materials and methods

Subjects with 2p15–p16.1 deletions

The probands, Subjects 1 and 2, are from simplex families, originally identified and described by Rajcan-Separovic et al.9

Subjects screened for 2p15–p16.1 deletion

A total of 798 individuals with ASD from 509 families were screened for additional cases of the 2p15–p16.1 deletion, using microarray comparative genomic hybridization (array CGH) and real-time qPCR. Of 265 multiplex (MPX) families, 37 were reported by Robinson et al17 and 139 were obtained through the Autism Genetic Resource Exchange (AGRE, Los Angeles, CA, USA).18 The remaining 89 MPX families and 244 simplex (SPX) families were recruited by the Autism Spectrum Disorders – Canadian American Research Consortium (ASD–CARC: www.asdcarc.com; www.autismresearch.com). Data from the Autism Diagnostic Interview – Revised (ADI–R)19 and/or PDD Behavior Inventory (PDDBI), which has been shown to have excellent reliability with the ADI–R,20, 21 were available for all cases. All individuals met the clinical diagnostic criteria for an ASD.

ASD family cohorts and comparison group for candidate gene association studies

Four independently recruited cohorts of ASD families were included in the association studies. The ASD–CARC cohort consists of 164 MPX families and 238 SPX families, mainly from Canada, with some families from the United States. The New York cohort was recruited by a research team at New York State Institute for Basic Research in Developmental Disabilities (IL Cohen), and consists of 13 MPX families and 66 SPX families.22 The AGRE samples were purchased from AGRE and were collected from 139 MPX families. The SIRFA (Società Italiana per la Ricerca e la Formazione sull’Autismo) cohort consists of 16 MPX and 294 SPX families recruited in Italy (AM Persico). All affected individuals met the clinical diagnostic criteria for an ASD. For the population-based association studies of XPO1 and OTX1, cases from North American cohorts were compared with a comparison group, which consisted of bloodspots from 760 anonymous neonates originally obtained for phenylketonuria testing by the Ontario Ministry of Health, following ethics approval at Queen's University (Kingston, Ontario, Canada). Although information regarding psychiatric and behavioral disorders was not available for the comparison group, the prevalence of ASDs is unlikely to be greater than that in the general population.

Ethics approval for research involving human subjects was obtained through the Health Sciences Research Ethics Board of Queen's University (Kingston, Ontario, Canada), Clinical Research Ethics Board of the University of British Columbia (Vancouver, British Columbia, Canada), the Health Research Ethics Board at the Faculty of Medicine, University of Manitoba (Winnipeg, Manitoba, Canada), the Institutional Review Board of the Institute for Basic Research (Staten Island, NY, USA), and University Campus Bio-Medico (Rome, Italy). Written informed consent was obtained from all participating family members or by AGRE.18

Real-time quantitative PCR

Real-time qPCR was used to determine copy numbers of markers within the 2p15–p16.1 region, both for breakpoint refinement and for screening 798 individuals with ASDs for additional deletions. Primer design and real-time qPCR, using SYBR Green (Applied Biosystems, Foster City, CA, USA), were carried out as previously described.23 Primer sequences for screening are presented in Supplementary Table 1. DNA samples from Subjects 1 and 2 were used as positive controls.
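As an illustration of how a copy number can be derived from real-time qPCR data, the following is a minimal sketch of the comparative Ct (2^-ΔΔCt) calculation. It is not necessarily the procedure of reference 23: the function name, the use of a two-copy reference amplicon, the assumption of 100% amplification efficiency, and the Ct values are all hypothetical.

```python
def relative_copy_number(ct_target_sample, ct_ref_sample,
                         ct_target_control, ct_ref_control,
                         control_copies=2):
    """Estimate copy number of a target marker by the comparative Ct method.

    Assumes (hypothetically) a reference amplicon present in two copies in
    every sample and a doubling of product per cycle (100% efficiency).
    """
    # Normalize the target to the reference amplicon within each DNA sample.
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_control = ct_target_control - ct_ref_control
    # Compare the test sample with a normal (two-copy) control sample.
    dd_ct = d_ct_sample - d_ct_control
    return control_copies * 2 ** (-dd_ct)


# Hypothetical Ct values: the target crosses threshold one cycle later in the
# test sample than in the control, as expected for a heterozygous deletion.
print(relative_copy_number(26.0, 24.0, 25.0, 24.0))  # 1.0
```

Under these assumptions, a value near 1 would be consistent with a heterozygous deletion, near 2 with normal copy number, and near 3 with a duplication.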

Array CGH

Of 798 individuals with ASDs, 115 were also examined using 1 Mb resolution array CGH, and 40 were tested using the 105K oligo array (Agilent, Santa Clara, CA, USA), performed as previously described.24, 25

Tag SNP selection and genotyping

Eight Tag SNPs for XPO1 and three Tag SNPs in OTX1 and its flanking region (Supplementary Table 2) were selected based on Hapmap information, using the Applied Biosystems’ SNP Browser (www.appliedbiosystems.com) and criteria as previously described.26 Genotyping of the Tag SNPs was carried out using validated TaqMan SNP assays (Applied Biosystems) as previously described.26 Duplicate samples and negative controls were included in each plate to check the accuracy of genotyping. Genotypes were automatically scored with the SDS 2.2.2 software (Applied Biosystems) using standard parameters.

Statistical analyses

Genotypes were checked for Mendelian errors with the family-based association tests (FBAT) program (v2.0.3).27 Families with Mendelian inconsistencies were either corrected following re-genotyping or omitted from data analyses. Before association analyses, each SNP was tested for deviation from Hardy–Weinberg equilibrium in the subjects, their parents, and the controls, using the HWE program.28 Family-based analyses, including single-marker FBAT, quantitative transmission disequilibrium testing (QTDT), and haplotype TDT, were carried out with FBAT (v2.0.3).27 Both single-marker and haplotype testing for affection status and quantitative ADI–R data were carried out using an additive model. Haplotype testing was performed under the ‘biallelic’ mode. Allele and genotype frequencies in affected individuals were compared with those of the comparison group, using χ2 statistics with SPSS v14.0 (Chicago, IL, USA). Haplotype frequency predictions for controls, as well as the affected individuals and parents, were calculated using the HelixTree program (www.GoldenHelix.com). One affected child per family was randomly chosen for case-control studies.
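The two basic ideas behind these tests can be sketched in a few lines. The code below is a simplified illustration only: it uses a 1-df chi-square goodness-of-fit test for Hardy–Weinberg equilibrium (the HWE program may use a different procedure) and the classic biallelic TDT statistic, which FBAT generalizes. Function names and genotype counts are hypothetical.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square test for deviation from Hardy-Weinberg equilibrium.

    Takes genotype counts for a biallelic SNP; assumes both alleles are
    observed, so that no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, chi2_sf_1df(stat)


def tdt_chi2(transmitted, untransmitted):
    """Classic TDT: counts of one allele transmitted vs not transmitted
    from heterozygous parents to affected children."""
    stat = (transmitted - untransmitted) ** 2 / (transmitted + untransmitted)
    return stat, chi2_sf_1df(stat)


# Hypothetical counts, for illustration only.
print(hwe_chi2(25, 50, 25))   # genotypes in exact HWE proportions: statistic 0.0
print(tdt_chi2(60, 40))       # statistic 4.0, P about 0.0455
```

The TDT conditions on parental genotypes, which is why family-based tests of this kind are robust to population stratification in a way that case-control comparisons are not.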

Corrections for multiple comparisons

To account for multiple testing in genetic association studies, we used the Benjamini and Hochberg false discovery rate (FDR) method,29 which has been shown to be appropriate for candidate gene association studies using multiple SNPs. FDR corrections were performed separately for single-marker and haplotype FBAT, using affection status as a qualitative trait, and QTDT using ADI–R scores as quantitative traits. FDR corrected P-values (PFDR) that were <0.050 were considered significant.
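For readers unfamiliar with the procedure, the Benjamini and Hochberg adjustment can be sketched as follows. This is a generic implementation of the step-up method, not the software used in the study, and the P-values in the example are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg FDR-adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P-values for four SNPs.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [p < 0.050 for p in adj]
print(adj, significant)
```

Each raw P-value is multiplied by the number of tests divided by its rank, and the adjusted values are then forced to be non-decreasing with rank; tests whose adjusted value falls below 0.050 would be declared significant under the threshold used here.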

In silico analysis

The Human Genome Segmental Duplication Database (http://projects.tcag.ca/humandup/) was used to examine the regions around the 2p15–p16.1 deletion breakpoints for the presence of segmental duplications, and RepeatMasker (http://repeatmasker.org/) was used to identify and locate interspersed repeats and low complexity DNA sequences. Breakpoint regions were also compared with each other for similarities, using the online version of William Pearson's LALIGN program (http://www.ch.embnet.org/software/LALIGN_form.html).

Results

Refinement of the deletion breakpoints in Subjects 1 and 2

The minimal proximal and distal breakpoints in Subject 1 were located at about 62 885 kb and 56 773 kb, based on human build 37.1 (Bethesda, MD, USA) (Figure 1). The deletion is >6.1 Mb and contains 38 genes. In Subject 2, the minimal proximal breakpoint was located at 63 372 kb, and the minimal distal breakpoint at 55 481 kb, making the deletion >7.9 Mb in size and spanning 52 genes.
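The minimal deletion sizes follow directly from the breakpoint coordinates given above. The short sketch below reproduces the arithmetic and also reports the interval shared by the two deletions; the function and variable names are illustrative only.

```python
# Minimal breakpoints in kb (distal, proximal), as stated in the text.
breakpoints_kb = {
    "Subject 1": (56_773, 62_885),
    "Subject 2": (55_481, 63_372),
}


def minimal_deletion_size_kb(distal_kb, proximal_kb):
    """Size of the interval between the minimal distal and proximal breakpoints."""
    return proximal_kb - distal_kb


def shared_overlap_kb(intervals):
    """Interval common to all deletions, or None if they do not overlap."""
    start = max(distal for distal, _ in intervals)
    end = min(proximal for _, proximal in intervals)
    return (start, end) if start < end else None


for subject, (distal, proximal) in breakpoints_kb.items():
    size_kb = minimal_deletion_size_kb(distal, proximal)
    print(f"{subject}: {size_kb} kb ({size_kb / 1000:.1f} Mb)")

print(shared_overlap_kb(list(breakpoints_kb.values())))
```

With these coordinates the minimal intervals are 6112 kb and 7891 kb (about 6.1 and 7.9 Mb), and the minimal deletion in Subject 1 lies entirely within that of Subject 2.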

Figure 1

Breakpoints of the 2p15–p16.1 deletions and locations of candidate genes in the region.

Screening for additional cases of 2p15–p16.1 deletions

We screened samples from 798 individuals with an ASD, using real-time qPCR at six non-polymorphic markers (Supplementary Table 1) for the presence of similar chromosome abnormalities. Of these, 155 subjects with complex ASD were also tested using array CGH. No additional cases of 2p15–p16.1 deletions or duplications were found.

In silico analysis of breakpoint regions

In silico analysis using public databases revealed no flanking low-copy repeat (LCR) sequences, but several segments near the breakpoints were recognized by RepeatMasker as being fragments of known interspersed or simple repeats, including SINEs and LINEs (short and long interspersed elements, respectively), long terminal repeats (LTRs) of retrovirus-like sequences, and DNA transposable elements. We examined these for stretches that were present at both the proximal and distal breakpoints and shared greater than 60% sequence homology. Although none were found for Subject 1, a 2960 bp LINE-1 repeat with 87.9% identity was found at positions 63 386 kb and 55 575 kb for Subject 2's deletion.
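To make the identity threshold concrete, the sketch below shows how percent identity can be computed once two breakpoint sequences have been aligned. It does not perform the local alignment itself (LALIGN was used for that), and it assumes one common convention, identical columns divided by alignment length including gap columns; the function name and sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two sequences from a pairwise alignment.

    Both inputs must be the gapped, equal-length rows of the alignment.
    Gap columns count toward the alignment length but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignment: 8 identical columns out of 10.
identity = percent_identity("ACGT-ACGTA", "ACGTTACCTA")
print(identity, identity > 60.0)  # 80.0 True
```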

Association of autism with XPO1 and OTX1

XPO1 gene

Although no association with ASD susceptibility was found for the seven common XPO1 SNPs (rs4430924, rs11125883, rs766448, rs766447, rs7563678, rs3732171, and rs1562308) in family-based tests and all eight SNPs in case-control comparisons (data not shown), the less common rs6735330 SNP was shown to be significantly associated with autism susceptibility in the family-based tests in all four cohorts (P<0.05), with the association being highly significant in ASD–CARC cohorts (PFDR=1.29 × 10−5), the AGRE cohort (PFDR=0.0011), and the combined family sets (PFDR=2.34 × 10−9) (Table 1). QTDT demonstrated an association of the rare rs6735330 G allele with higher ADI–R subdomain scores in the affected children, or more severe deficits in social interaction, verbal communication, and repetitive behaviors (PFDR=3.01 × 10−7, 1.48 × 10−6, and 3.01 × 10−7, respectively) (Table 3).

Table 1 FBAT results for SNP rs6735330 in XPO1, and SNPs rs2018650 and rs13000344 in OTX1 in ASD families

OTX1 gene

The allele, genotype, and haplotype frequencies in cases and controls did not differ significantly for any of the markers tested (data not shown).

No association with autism was found for the common rs2073575 SNP in OTX1. However, single-marker FBAT revealed that the more common rs2018650 and rs13000344 OTX1 alleles were significantly associated with autism in ASD–CARC cohorts (PFDR=8.65 × 10−7 and 6.07 × 10−5, respectively), the AGRE cohort (PFDR=0.0034 and 0.015, respectively), and the combined family sets (PFDR=2.34 × 10−9 and 0.00017, respectively). Associations were marginal or not significant in the New York and SIRFA cohorts, although preferential allelic transmissions were remarkably consistent among all cohorts (Table 1). Significant association (PFDR=2.63 × 10−11) with ASD was found in the haplotype transmission test (Table 2).

Table 2 Haplotype FBAT for SNPs rs2018650 and rs13000344 in OTX1 in ASD families

The two common alleles, rs2018650C and rs13000344G, either individually or in combination, were associated with higher ADI–R domain scores (more severe problems) for social interaction, nonverbal communication, and stereotyped behaviors (Tables 3 and 4). There was insufficient data to test for association with verbal communication.

Table 3 QTDT results for SNP rs6735330 in XPO1, and SNPs rs2018650 and rs13000344 in OTX1 under an additive model
Table 4 Haplotype QTDT of the two rare SNPs, rs2018650 and rs13000344 in OTX1

Discussion

We have described the molecular characterization of two 2p15–p16.1 deletions identified in unrelated individuals with autism.9

In silico analysis using public databases revealed 108 repetitive elements, including SINEs and LINEs, in the regions of the breakpoints. A comparison of the repeats at the ends of each deletion revealed no regions of sequence homology equal to or greater than 60% for any stretch of sequence longer than 200 bp in Subject 1's deletion. For Subject 2's deletion, there was a 2960 nucleotide region, belonging to the LINE-1 repeat family, with 87.9% homology at the proximal and distal breakpoints. LINE-1 retrotransposons have been implicated in genomic rearrangements, as they are significantly enriched at the breakpoints of CNVs compared with the genomic background.30 We suggest that the LINE-1 elements present at both breakpoints of the deletion in Subject 2 predispose to genomic instability through non-allelic homologous recombination (NAHR), and that this mode of deletion generation raises the possibility of recurrence of deletions in this region.

Since our initial report,9 other smaller overlapping rearrangements involving the described 2p15–p16.1 region have been reported in controls, including a recurrent 2.9 Mb duplication in neurotypical individuals (58.2–61.1 Mb; http://projects.tcag.ca/variation/). A de novo duplication of band 2p16.1 (60.5–61.4 Mb) was found in one individual who did not physically resemble Subjects 1 and 2 and did not have an ASD (http://decipher.sanger.ac.uk/; patient Id 1570). Finally, recurring de novo microdeletions of various sizes (0.578 Mb) have been detected within the described 2p15–p16.1 region,10, 11, 12, 13 with one of four having a confirmed ASD. All subjects presented with several phenotypic traits overlapping those in our patients, including ID and facial dysmorphologies such as high palate, smooth upper vermilion border, everted lower lip, broad and high nasal root, and telecanthus.

The above findings strongly suggest that the 2p15–p16.1 region is unstable. To determine the frequency of 2p15–p16.1 deletions among individuals with ASDs, we tested 798 individuals with an ASD for the presence of microdeletions similar to those described in Subjects 1 and 2; none were found. Furthermore, no similar cases have been reported from five other published studies,31, 32, 33, 34, 35 with at least 3410 subjects with an ASD evaluated for CNVs. However, Liang et al13 reported a case with the 2p15–p16.1 deletion with autism. Thus, the 2p15–p16.1 microdeletion has a prevalence of less than 0.1% among individuals with an ASD.

We selected XPO1 and OTX1 as being of greatest interest for candidate gene testing. We found that SNP rs6735330 in XPO1 was significantly associated with autism in the family-based studies in all four cohorts even after FDR adjustment, with association being highly significant in ASD–CARC cohorts (PFDR=1.29 × 10−5), the AGRE cohort (PFDR=0.001), and the combined family sets (PFDR=2.34 × 10−9) (Table 1). The common allele was significantly associated with more severe deficits in social interaction, verbal communication, and repetitive behaviors (all PFDR values <0.01), suggesting that it is in high LD with a functional polymorphism or mutation in this gene, which could be a risk factor for autism in the families studied. Given the role of CRM1 in nuclear export of cargo RNAs with a nuclear export signal, including ribosomal RNAs, U snRNAs, SRPs, and specific mRNAs, it is possible that impaired transport of any of these molecules could affect brain development and lead to behavioral features characteristic of persons with ASDs.

OTX1, although deleted in only 1/3 of subjects with autism (and 1/6 subjects with the deletion reported so far), is in the vicinity of the deletion breakpoints and has a function in forebrain and sensory organ development.36 We reasoned that there could be a position effect on the expression of this gene as has been shown for other genes near chromosome breakpoints.37, 38, 39, 40, 41, 42 The role of OTX1 in sensory organ development is interesting, considering that sensory impairments are common in individuals with autism. Leekam et al43 tested individuals with autism using the Diagnostic Interview for Social and Communication Disorders (DISCO),44 which uses 21 items grouped in seven domains (auditory, visual, touch, smell/taste, etc.), and found that they had a higher mean score than the control groups, with more than 90% presenting with symptoms in more than one sensory domain. In addition, the cortical migration defects detected by MR neuroimaging in both of our 2p15–p16.1 deleted subjects, despite only one having a deletion of this gene, made OTX1 a good target for candidate gene screening, assuming that its expression is affected by position effect variegation.

Although case-control studies did not provide evidence for an association between OTX1 and autism, the family-based studies found a very strong association between two of the three OTX1 SNPs (rs2018650 and rs13000344) and risk for ASD (Table 1). A highly significant association with ASD was also found for the rs2018650G–rs13000344C haplotype (PFDR=2.63 × 10−11). QTDT revealed that at least one copy of the common alleles of these SNPs was associated with more severe ASD symptoms, as determined by the ADI–R, providing evidence for an association between OTX1 and ASD. We also note that OTX1 is expressed in the cerebellum within the external germinal layer, a proliferative zone close to the pia, which is responsible for the production of granule cells. These cells differentiate into the granule cells of the cerebellar cortex, which are reduced in the brains of individuals with autism.45

The exception in our results for both OTX1 and XPO1 is with families from New York and the SIRFA families. It is of interest that many of the New York families are of Italian descent. The lack of association in the SIRFA sample may also be due to patient selection criteria, which excluded any patient displaying major dysmorphic features or CNS anomalies as determined by cranial MRI.

The lack of positive findings for both XPO1 and OTX1 in genome-wide association studies (GWAS) by others46, 47, 48, 49, 50, 51, 52, 53, 54 may reflect the greater genetic heterogeneity in these larger data sets, as we found a clear association with specific core ASD phenotypes. Re-analysis of GWAS separating those with more severe and milder ADI–R symptom scores should be done to test this possibility.

The absence of additional cases of 2p15–p16.1 deletion in our own ASD study cohort (n=798), and at least 2586 cases from other published studies31, 32, 33, 34, 35 suggests that the involvement of 2p15–p16.1 with autism is a rare event. Given that the presence of dysmorphic features represents an exclusion criterion for several ASD studies, this may reduce the frequency of detection of 2p15–p16.1 deletions in other screening studies.

Our findings suggest that the newly described 2p15–p16.1 microdeletion syndrome represents a somatic and neurodevelopmental phenotype inclusive of autism, moderate to severe ID, CNS dysmorphology, and distinctive craniofacial dysmorphology. In addition, functional variants in the XPO1 and/or OTX1 genes in LD with the three SNPs showing strong association with autism, may contribute to autistic behaviors in not only the two 2p15–p16.1 deletion subjects described herein, but also in a larger proportion of individuals with ASD with more severe core ASD symptoms mapping to this region of chromosome 2.