Subjects and assessments
This study was approved by the Institutional Review Board for the Protection of Human Subjects (IRB) of the State University of New York (SUNY) at Upstate Medical University in Syracuse, New York. Subjects were recruited from the greater Syracuse area through the SUNY Upstate Pediatric and Adolescent Center and the SUNY Upstate Center for Development, Behavior, and Genetics. Exclusion criteria for both control and ASD subjects included an age less than 4 years or greater than 14 years, confounding neurological (e.g. cerebral palsy, epilepsy) or sensory (e.g. auditory or visual impairment) disorders, or acute illness. Wards of the state, subjects with mental retardation, and subjects with a history of pre-term birth (less than 32 weeks gestation) or a birth weight below the 10th percentile for gestational age were also excluded from participation. Subjects with a diagnosis of intellectual disability, ASD, or a family history of ASD were excluded from the control group. ASD subjects with a known syndromic phenotype (e.g. Rett Syndrome, Tuberous Sclerosis, Angelman Syndrome, Fragile X) were also excluded. Given the established comorbidity of psychiatric symptoms in children with ASD, subjects with attention deficit hyperactivity disorder (ADHD) or anxiety were not excluded.
Informed written parental consent and informed written subject assent (when possible) were obtained for a total of 45 subjects who were recruited for the study, including 24 subjects with a current diagnosis of ASD and 21 non-ASD control subjects (Table 1). ASD subjects were diagnosed according to DSM-5 (American Psychiatric Association, 2013) criteria and were evaluated with an age-appropriate module of the Autism Diagnostic Observation Schedule (ADOS), the Childhood Autism Rating Scale (CARS), and/or the Krug Asperger Index. The Vineland Adaptive Behavior Scales 2nd edition was administered to all children by a physician through parental interview to evaluate functional neurodevelopmental indices of communication, social interaction, and activities of daily living. Medical history, birth history, family history, surgical history, current medications, medical allergies, immunization status, and dietary modifications were obtained. A brief physical exam was performed to screen for neurologic deficits, visual/hearing impairment, or syndromic physical features.
Table 1
Subject characteristics
Control | | | | | | | | | | |
Mean | 9.2 | 16M, 5F | | 110.1 | 104.4 | 100.4 | 105.3 | 39.2 | 78.2 | 68.4 |
StDev | 2.5 | | | 10.0 | 15.7 | 11.0 | 12.7 | 1.3 | 16.5 | 20.6 |
Range | 4–13 | | | 88–127 | 81–146 | 85–124 | 87–132 | 36–42 | 50–100 | 33–97 |
ASD | | | | | | | | | | |
Mean | 9.1 | 19M, 5F | 10.6 | 76.0 | 77.8 | 73.6 | 70.7 | 38.3 | 64.7 | 59.5 |
StDev | 2.4 | | 4.1 | 15.3 | 14.3 | 10.9 | 10.2 | 2.5 | 29.6 | 25.7 |
Range | 5–13 | | 3–16 | 49–113 | 47–108 | 52–95 | 48–90 | 31–41 | 5–99 | 10–99 |
p value | 0.182 | 0.816 | | 0.001 | 0.001 | 0.000 | 0.000 | 0.294 | 0.915 | 0.848 |
There were no significant differences between groups in age (p = 0.18), sex (p = 0.82), weight (p = 0.91), height (p = 0.85), or birth age (p = 0.29). The mean age of the ASD subjects was 9.2 ± 2.5 years and the mean birth weight was 3.2 ± 0.64 kg. The ASD subjects had a mean ADOS score of 10.6 ± 4.1, consistent with DSM-5 criteria for mild to moderate ASD. Compared with control subjects, they displayed significantly decreased levels of Communication (p < 0.001), Social Interaction (p = 0.001), and Activities of Daily Living (p < 0.001) as assessed by the Vineland Adaptive Behavior Scales (Table 1).
Overall, the ASD group of 24 children included several with comorbid diagnoses: ADHD (n = 15), anxiety disorder (n = 8), learning disability or developmental delay (n = 5), asthma (n = 3), allergies (n = 2), obsessive-compulsive disorder (n = 2), and depression (n = 1). Reported medications in this group included: methylphenidate stimulants (n = 8), selective serotonin reuptake inhibitors (SSRIs; n = 7), guanfacine (n = 5), atypical antipsychotics (n = 5), clonidine (n = 1), bronchodilators (n = 3), antihistamines (n = 3), multivitamins (n = 8), and omega-3 supplements (n = 4). Three of the probands were eating a modified gluten-free diet and no ASD subjects had any dental caries or periodontal disease. Five ASD subjects had a history of birth complications requiring neonatal intensive care, although none required care beyond 11 days. Most (n = 17) of the ASD subjects had a current or past history of educational intervention (speech therapy, physical therapy, occupational therapy). There were also several probands with positive family histories of neuropsychiatric and neurodevelopmental disorders (limited to 1st and 2nd degree relatives and 1st cousins): learning disability (n = 10), depression (n = 8), anxiety disorder (n = 7), ADHD (n = 6), ASD (n = 4), and bipolar disorder (n = 3).
The control group of 21 typically developing children also included several with comorbid diagnoses: ADHD or ADD (n = 5), asthma (n = 6), eczema (n = 4), and allergies (n = 2). Reported medications in the control group included: methylphenidate (n = 3), bronchodilators (n = 6), and antihistamines (n = 5). None of the control children were eating a modified or gluten-free diet and one subject had dental caries. One control subject had a history of birth complication (RSV infection) that required a brief period of neonatal care. Three of the control subjects had a current or past history of educational intervention (speech therapy, physical therapy, occupational therapy). Positive family histories among 1st and 2nd degree relatives and 1st cousins were identified for learning disability (n = 2), depression (n = 1), ADHD (n = 1), and bipolar disorder (n = 1).
Statistical analysis
Analysis of the combined medical, demographic, and neuropsychological data was performed to identify significant group differences between ASD and control subjects. Individual miRNAs were used for comparisons between groups only if they were detected in at least half of the samples, regardless of diagnosis. A total of 246 miRNAs were tested. Because the RNA-Seq data were not normally distributed, group differences in miRNA levels were examined using a non-parametric Wilcoxon Mann-Whitney U test with Benjamini-Hochberg False Discovery Rate (FDR) correction for multiple comparisons. The miRNAs with FDR values <0.15 were initially used in individual logistic regression analyses to assess discriminative power in an idealized “best-fit” approach. The rationale for doing so was that logistic regression makes no assumption about the distribution of the original RNA-Seq data and is highly effective at iteratively determining an optimal model for the data using the logistic function $Y = 1/(1 + e^{-(a + b_1X_1 + b_2X_2 + \dots + b_nX_n)})$, which describes the dependency of the outcome (diagnosis, coded as 0 or 1) on the full set of 14 independent variables. This best fitting is accomplished by adjusting the partial regression coefficients for each miRNA variable until an optimal solution is obtained using the Maximum Likelihood criterion. During this process, each subject sample is assigned a specific likelihood of falling in one of the diagnostic classes based on the model, and the total likelihood (L) for the set of subjects is derived from the running product of the likelihood scores for all of the subjects.
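The two computational steps in the preceding paragraph, FDR screening and likelihood evaluation for a logistic model, can be illustrated with a minimal Python sketch. It is not the software used in the study. The function names, the example p values, and the toy coefficients are all hypothetical, and the raw p values are assumed to come from a Mann-Whitney U test run elsewhere. The log of the total likelihood is summed, which is equivalent to taking the running product of per-subject likelihoods but is numerically safer.

```python
from math import exp, log


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values, in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def logistic(a, b, x):
    """Y = 1 / (1 + e^-(a + b1*x1 + ... + bn*xn))."""
    z = a + sum(bj * xj for bj, xj in zip(b, x))
    return 1.0 / (1.0 + exp(-z))


def log_likelihood(a, b, X, y):
    """Sum of per-subject log likelihoods (log of the running product L)."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = logistic(a, b, xi)
        total += log(p) if yi == 1 else log(1.0 - p)
    return total


# Hypothetical raw p values for five miRNAs (illustrative, not study data).
raw_p = [0.001, 0.02, 0.03, 0.2, 0.5]
q_values = benjamini_hochberg(raw_p)
# Keep the miRNAs that pass the FDR < 0.15 screen described in the text.
selected = [i for i, q in enumerate(q_values) if q < 0.15]

# Toy model with one predictor and two subjects (ASD = 1, control = 0).
ll = log_likelihood(0.0, [0.0], [[1.0], [2.0]], [1, 0])
```

A maximum likelihood fit would search over `a` and `b` for the values that maximize `log_likelihood`; that optimization step is left out of this sketch.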
Since a prediction is made for each subject, the results of the logistic regression analysis are then used to produce a 2 × 2 classification table from which we can determine the Sensitivity or True Positive Rate (i.e., the fraction of ASD subjects who were correctly predicted to be ASD based on the model) and the Specificity or True Negative Rate (i.e., the fraction of Control subjects who were correctly predicted to be Controls). The cutoff point for the classification was set by default to Y = 0.5 (halfway between the diagnostic category coding of 0 and 1). By varying the cutoff point across the full range of cutoff values and recalculating the Sensitivity and Specificity at each point, it is then possible to construct a Receiver Operating Characteristic (ROC) curve, which provides an unbiased assessment of the overall model performance.
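The construction of the classification table and ROC curve described above can be sketched as follows. This is an illustrative example only. The labels and predicted probabilities are hypothetical, and the function names are not taken from any package used in the study. The area under the curve is computed with the trapezoidal rule, which is one common choice.

```python
def sensitivity_specificity(labels, scores, cutoff=0.5):
    """Call score >= cutoff ASD (1); return (sensitivity, specificity)."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cutoff)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < cutoff)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < cutoff)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def roc_points(labels, scores):
    """Return (1 - specificity, sensitivity) pairs over all observed cutoffs."""
    # Start at the (0, 0) corner, where the cutoff exceeds every score.
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        sens, spec = sensitivity_specificity(labels, scores, cutoff)
        points.append((1.0 - spec, sens))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical model outputs for 4 ASD and 4 control subjects.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

sens, spec = sensitivity_specificity(labels, scores)  # default cutoff Y = 0.5
area = auc(roc_points(labels, scores))
```

With these toy values, three of four subjects in each group fall on the correct side of the 0.5 cutoff, so sensitivity and specificity are both 0.75.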
To facilitate comparisons with other data sets, mean differences in abundance seen in ASD subjects were reported as normalized Z score differences relative to controls as well as standardized Cohen’s d values, which incorporate the variability within each subject group. We also reported the Wald statistics with resulting p values for each of the individual regression results. Comparisons of miRNA levels to various medical, demographic and neuropsychological measures were performed using Spearman’s rank correlation.
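As a minimal illustration of the effect size measures, the sketch below computes Cohen's d with a pooled standard deviation and a Z score difference. The exact normalization used for the Z score difference is not specified here, so the sketch assumes the ASD mean is expressed in units of the control standard deviation. The function names and the example values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, variance


def cohens_d(group_a, group_b):
    """Mean difference divided by the pooled sample standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def z_difference(group, reference):
    """Assumed definition: group mean in units of the reference group's SD."""
    return (mean(group) - mean(reference)) / stdev(reference)


# Hypothetical normalized abundances for one miRNA (not study data).
asd = [2.0, 4.0, 6.0]
control = [1.0, 2.0, 3.0]

d = cohens_d(asd, control)
z = z_difference(asd, control)
```

Cohen's d uses the variability of both groups, whereas the Z score difference as sketched here depends only on the spread of the control group.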
One of the limitations of any regression modeling approach is the possibility that the “best fit” only accurately predicts outcomes in the initial (discovery) data set. To more stringently evaluate the empirical validity of the 14 miRNAs, we performed classification testing and ROC curve analysis based on the results of 100-fold Monte-Carlo Cross Validation (MCCV) with balanced subsampling. In each iteration, two-thirds of the samples were used to evaluate the miRNA feature importance. Next, the 2, 3, 5, 7, 10 and 14 most important classifying miRNA features were used to build classification models, which were cross-validated on the remaining one-third of the samples that were left out. This was repeated 100 times to determine the performance and confidence interval of each model. To further complement the logistic modeling performed in the discovery phase, this MCCV analysis used the multivariate linear regression approach of Partial Least Squares Discriminant Analysis (PLS-DA). This method extracts multidimensional linear combinations of the 14 miRNA features that best predict the class membership or diagnosis (Y). These analyses were performed using the MetaboAnalyst 3.0 server, which implements the plsr function provided by the R pls package, with classification and cross-validation performed using the caret package. We also used this tool to rank the variables by their relative importance, as determined by the sum of regression coefficients in the different simulated models, and to generate individual boxplots for the 4 most robust differentially expressed miRNAs.
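The resampling scheme described above can be sketched in a few lines of Python. This is only an illustration of the MCCV loop. It swaps in a nearest-centroid classifier and a mean-difference feature ranking as simple stand-ins for PLS-DA and for the importance measure of the analysis server, and it assumes that balanced subsampling means drawing the same number of training samples from each class. All names and the synthetic data are hypothetical.

```python
import random
from statistics import mean


def balanced_split(labels, rng):
    """Draw equal numbers of training samples per class; hold out the rest."""
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    # Two-thirds of the smaller class, taken from every class.
    n_train = (2 * min(len(idx) for idx in by_class.values())) // 3
    train = []
    for idx in by_class.values():
        train.extend(rng.sample(idx, n_train))
    chosen = set(train)
    test = [i for i in range(len(labels)) if i not in chosen]
    return train, test


def rank_features(X, y, train):
    """Rank features by absolute class-mean difference (stand-in measure)."""
    scores = []
    for j in range(len(X[0])):
        a = mean(X[i][j] for i in train if y[i] == 1)
        b = mean(X[i][j] for i in train if y[i] == 0)
        scores.append(abs(a - b))
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def centroid_predict(X, y, train, test, features):
    """Assign each held-out sample to the nearest class centroid."""
    cents = {c: [mean(X[i][j] for i in train if y[i] == c) for j in features]
             for c in (0, 1)}
    preds = []
    for i in test:
        dist = {c: sum((X[i][j] - m) ** 2 for j, m in zip(features, cents[c]))
                for c in (0, 1)}
        preds.append(0 if dist[0] <= dist[1] else 1)
    return preds


def mccv_accuracy(X, y, n_top, n_iter=100, seed=0):
    """Mean held-out accuracy over n_iter random balanced splits."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(n_iter):
        train, test = balanced_split(y, rng)
        top = rank_features(X, y, train)[:n_top]  # ranked on training data only
        preds = centroid_predict(X, y, train, test, top)
        accuracies.append(mean(int(p == y[i]) for p, i in zip(preds, test)))
    return mean(accuracies)


# Synthetic data: feature 0 separates the classes, features 1 and 2 are noise.
data_rng = random.Random(1)
y = [1] * 12 + [0] * 9
X = [[(5.0 if label else 0.0) + data_rng.random(),
      data_rng.random(), data_rng.random()] for label in y]
acc = mccv_accuracy(X, y, n_top=1)
```

The key design point is that feature ranking happens inside each iteration, using training samples only, so the held-out third never influences which features enter the model.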
To visualize the expression patterns and general separation power of the set of significantly changed miRNAs, we then used hierarchical clustering with a Euclidean distance metric to group miRNAs with similar patterns together, and visualized the subjects in the three eigenvector dimensions created from the PLS-DA analysis of the 14 miRNAs.
Systems-level analysis of the miRNA data was performed using the miRNA Data Base (miRDB) online resource to provide the predicted targets for each of the mature sequences that we identified (according to miRBase v21 August 2014 annotation). This database version identifies 2588 human miRNAs and 947,941 target interactions. The interactions that were revealed were then filtered based on the predicted strength of the miRNA-mRNA interaction, as reflected in the miRDB output, to include only the top 20% of predicted targets for each miRNA. These specific mRNAs were then examined for evidence of functional enrichment using the online Functional Annotation Clustering tool from the Database for Annotation, Visualization, and Integrated Discovery (DAVID, version 6.7) at the National Institute of Allergy and Infectious Diseases (NIAID). Because of the large number of genes being examined, we increased the EASE score threshold to 2.0, and set the Multiple Linkage Threshold to 0.7, the Similarity Threshold to 0.45, and the Final Group Membership size to 4. Only the top three Annotation Clusters were reported in table form. In addition to the DAVID functional clusters, we also compared the list of the top 20% of predicted targets for the combined set of 14 miRNAs to the 740 ASD-associated genes catalogued in the Simons Foundation Autism Database (AutDB) and tested for possible enrichment using a Fisher’s Exact test and Odds Ratio calculation.
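The enrichment test against the ASD gene list can be illustrated with a small sketch of Fisher's Exact test on a 2 × 2 table. The sidedness of the test is not stated here, so the sketch assumes a one-sided test for over-representation. The function name and the counts are hypothetical and are not the study's data.

```python
from math import comb


def fisher_enrichment(a, b, c, d):
    """One-sided Fisher's exact test for over-representation, plus odds ratio.

    a: predicted targets that are ASD-associated genes
    b: predicted targets that are not ASD-associated
    c: non-target genes that are ASD-associated
    d: non-target genes that are not ASD-associated
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    # Hypergeometric probability of seeing a or more overlapping genes.
    p = sum(comb(col1, k) * comb(n - col1, row1 - k)
            for k in range(a, min(row1, col1) + 1)) / comb(n, row1)
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, p


# Hypothetical counts (illustrative only).
odds_ratio, p_value = fisher_enrichment(3, 1, 1, 3)
```

An odds ratio above 1 with a small p value would indicate that ASD-associated genes are over-represented among the predicted targets relative to the background gene set.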
Brain and tissue-specific expression patterns for differentially expressed miRNAs were identified by review of a survey of differentially expressed miRNAs across the developing and adult human brain [14, 15]. We also used the brain data to note whether miRNAs that were highly expressed in brain were also detected in the saliva, regardless of whether they were altered in ASD.