This study examined whether different aspects of functional MRI connectivity described in the literature represent distinct symptoms of autism or cohorts of individuals with autism, and the extent to which these functional connectivity methods are reproducible across individuals and datasets. For all functional connectivity methods tested, results showed poor generalizability across sites and participants rather than clear reproducibility, with no method demonstrating highly reproducible results when compared to the entire multi-site ABIDE sample (see Fig. 13 for a summary figure with p < .05 (uncorrected) results; q [FDR] < .05 corrected results can be found in Additional file 1: Figure S8). When functional connectivity features were compared to behavioral symptoms, distinct types of analysis, such as corticostriatal, homotopic, or default mode connectivity, did not correlate with different types of autism symptoms. Rather, individual connections or features that tracked with different symptoms of autism were distributed across methodological approaches without any clear pattern.
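To illustrate the distinction between uncorrected p < .05 and q [FDR] < .05 thresholds, the following is a minimal sketch of a false discovery rate adjustment. It assumes the Benjamini-Hochberg procedure, which is a common choice but is not confirmed here as the procedure used in this study; the function name and the p-values are hypothetical and are not study results.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values, in the input order.

    Assumption: BH is used for illustration only; the study's exact
    FDR procedure is not specified in this section.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotonic.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Made-up p-values for five hypothetical connectivity features.
p = [0.001, 0.008, 0.039, 0.045, 0.20]
q = benjamini_hochberg(p)

uncorrected_hits = [pv < 0.05 for pv in p]
fdr_hits = [qv < 0.05 for qv in q]
```

In this toy case, more features pass the uncorrected threshold than pass the FDR threshold, which is the pattern that separates the two summary figures.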
Literature comparisons
In the current study, few sites demonstrated significant between-group differences in positive vs. negative functional connectivity assessed using bins of connections with similar connectivity in the independent Human Connectome Project sample. Short-range and long-range connectivity results were also inconsistent across sites, consistent with a recent analysis demonstrating only region-specific local overconnectivity using a regional homogeneity approach, with different subgroups of subjects demonstrating variably higher or lower long-distance connectivity [62]. Variability in local connectivity has also been demonstrated with cohorts differing in fMRI acquisitions with eyes open vs. eyes closed [63]. Theoretical proposals of long-range under-connectivity and short-range over-connectivity [1, 18‐20] have been variably defined in terms of distance ranging from cortical columns to many centimeters, and studies examining distant connections have produced variable over- and under-connectivity. The analysis in the current sample may be limited by the use of an independent dataset (Human Connectome Project) not matched for sex and age to define distances.
Encouragingly, the current study found decreased homotopic connectivity, which has been reported and replicated in the literature [28, 29]. Though some sites demonstrated increased homotopic connectivity, findings surviving multiple comparison correction were nearly all decreases in individuals with autism compared to controls.
Consistent with the literature, this study also found both hypo- and hyper-connectivity in corticostriatal connections [3‐7]; however, the direction of these findings was not consistent between research sites, and corticostriatal connectivity appears to be predominantly decreased in individuals with autism compared to controls when larger sample sizes are assessed. Similar incongruities were found with thalamocortical connectivity, which also varies in the literature with respect to directionality [4, 9].
In the current study, idiosyncrasy was estimated by calculating the variance between an individual’s time series data for each ROI and an averaged value based on an independent dataset. Two sites showed decreased idiosyncrasy in individuals with autism compared to controls. This finding is in contrast to outcomes in the literature that report increased idiosyncrasy in individuals with autism compared to controls [26, 27]. This inconsistency is likely attributable to methodological differences, as the majority of studies in the literature utilize machine learning techniques to establish idiosyncratic values.
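One possible reading of the idiosyncrasy estimate described above can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: it assumes each ROI is summarized by a list of values for one participant and by a matching list of averages from an independent reference sample, and it scores idiosyncrasy as the mean squared deviation between the two. The function name, data layout, and numbers are hypothetical.

```python
def idiosyncrasy_per_roi(individual, reference_mean):
    """Mean squared deviation from the reference average, one score per ROI.

    individual: list of per-ROI value lists for one participant
        (hypothetical layout, assumed for illustration).
    reference_mean: same shape, averaged over an independent sample.
    """
    scores = []
    for roi_values, ref_values in zip(individual, reference_mean):
        squared_dev = [(v - r) ** 2 for v, r in zip(roi_values, ref_values)]
        scores.append(sum(squared_dev) / len(squared_dev))
    return scores


# Made-up values for two ROIs; not taken from the study.
reference = [[0.5, 0.2, 0.1], [0.3, 0.3, 0.0]]
participant = [[0.5, 0.2, 0.1], [0.5, 0.1, 0.2]]

scores = idiosyncrasy_per_roi(participant, reference)
```

Under this reading, a participant who matches the reference average exactly in an ROI scores zero there, and larger scores indicate greater departure from the reference sample.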
Similarly, the current study found no between-group differences in modularity for any research site or combined dataset; however, increased global efficiency was found in individuals with autism compared to controls for two research sites. Both increased and decreased global efficiency in individuals with autism compared to controls have been demonstrated in the literature [21, 22]. With regard to connectivity within and between the default mode and salience networks, the findings in the current study closely mirror those from the literature [3, 10‐15]; however, it is important to note that few research sites demonstrated multiple comparison corrected findings (see Additional file 1). Indeed, widespread decreased connectivity meeting FDR correction was only evident in the larger combined ABIDE dataset with respect to inter-default mode network connectivity.
Overall, we found poor generalizability across sites when testing which functional connectivity features predict autism, with consistent results only for samples of hundreds of participants. Furthermore, different types of functional connectivity features (homotopic, thalamocortical, corticostriatal, specific networks) do not seem to consistently predict different features of autism. Rather, specific features that predict autism symptoms seem to be distributed across feature types. Interestingly, there is a web of interrelationships between which features are high in which participants, with only lagged connectivity showing no correlation with other feature types across individuals with autism. As more features are added together, consistent results are obtained regardless of which feature types are added. It may be that these findings reflect global connectivity, which predicts ADOS and ADI scores but not SRS scores. Indeed, measures of global connectivity have been found to decrease in individuals with autism compared to controls [34, 64]. Even when using modern acquisition strategies (multi-band data, 30-min acquisition times per participant), heterogeneity and modest prediction rates for autism are seen, although findings were very consistent with those obtained from the entire ABIDE sample, suggesting that long-duration, high-temporal resolution acquisitions may improve replicability of results. Holiga and colleagues used data from four separate datasets, including ABIDE I and ABIDE II, and examined reproducibility of degree centrality as a metric distinguishing autism from control individuals [8]. While effect sizes were large in the EU-AIMS and InFoR datasets (Cohen’s d > .8 for some measures), effect sizes were much smaller in ABIDE datasets (d = .2), possibly indicating that technical parameters of acquisition may contribute to reproducibility of the results, given that EU-AIMS data were acquired with more volumes and using a multi-echo technique. Similarly, a recent report imaging participants with autism and low cognitive and verbal performance identified, within a single site’s data, connectivity differences similar to those of the entire ABIDE sample in this report [65]. While none of the individual features tested show promise in this analysis as sensitive and specific biomarkers, consistent with recent reviews [66, 67], the individual features demonstrated a rich web of interrelationships across subjects as well as differences across subjects that may inform efforts to identify clinical subtypes [68] or use multi-parametric deep-learning approaches to arrive at more sensitive and specific imaging markers [69].
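For readers comparing the effect sizes quoted above, the following is a minimal sketch of Cohen's d computed with a pooled standard deviation. The pooled-variance form is the standard textbook definition and is assumed here; the cited work may have used a variant. The function name and the group values are hypothetical toy numbers, not data from any of the datasets discussed.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # statistics.variance is the sample variance (n - 1 denominator).
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Toy values chosen for easy arithmetic; not study data.
toy_group_a = [2.0, 4.0, 6.0]
toy_group_b = [1.0, 3.0, 5.0]

d = cohens_d(toy_group_a, toy_group_b)
```

By the conventional benchmarks, d near .2 is a small effect and d above .8 is a large one, which is the contrast drawn between the ABIDE and EU-AIMS results.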
Limitations
A number of limitations should be considered. First, though we consider the processing of all participant data with the same parameters a strength of this study, certain aspects of that model could act as a confound. For example, differences in acquisition parameters, volume numbers, and spatial scale may benefit from preprocessing pipelines more suited to the nuances of each study site within the ABIDE dataset. Second, while efforts were made to minimize variance due to differences in the research site, statistical controls are likely not able to account for more nuanced between-site variance. Third, while we attempted to replicate methods identified in the literature, all method tests were based on a common parcellation scheme that was created using imaging data from adult participants. Many of the participants included in the ABIDE dataset are children or adolescents. Thus, it may be that the lack of reproducibility across methods reported in this study is tied to differences in cortical parcellations, nuanced atlas registration, or changes across development, although many of the methods tested do not require extremely precise cerebral region assignment (long-range vs. short-range, positive vs. negative, etc.), and age was included as a covariate in all analyses. Finally, it cannot be overstated that strategic choices in image postprocessing have a clear effect on functional connectivity results [70], and different choices in postprocessing may have resulted in improved or poorer reproducibility.