We examined autism-related sex differences for intrinsic functional brain organization across multiple R-fMRI metrics in a large discovery sample of males and females with autism relative to age-group matched NT selected from the ABIDE repositories [
22,
23]. Analyses revealed significant main effects of sex and diagnosis across intrinsic functional connectivity (iFC) of the posterior cingulate cortex, regional homogeneity and voxel-mirrored homotopic connectivity (VMHC) in several cortical regions. Notably, main effects converged along the midline of the default network. In contrast, sex-by-diagnosis interactions were limited to VMHC in the superior lateral occipital cortex. Placed in the context of sex and diagnostic main effects on interhemispheric homotopic connectivity in cortical regions, this result suggests that atypical interhemispheric interactions are pervasive in autism but reflect a combination of sex-independent (i.e., main effect of diagnosis common across sexes) and sex-dependent (i.e., sex-by-diagnosis interaction) effects, each specific to a different functional cortical system. This sex-by-diagnosis interaction effect was robust to distinct pre-processing strategies, as were the main effects. Further, despite the lack of a priori harmonization for data acquisition among the three samples, this finding was replicable in the larger of the two independent samples (i.e., EU-AIMS LEAP). On the one hand, this, together with the largely replicable main effects of sex and the variable replicability of main diagnostic effects by sample, suggests that inter-sample replicability of R-fMRI can be feasible in autism when sources of variability in diagnostic groups are accounted for in samples sized properly to address such variability. On the other hand, our results highlight the urgent need to obtain multiple harmonized datasets properly powered to systematically address and understand sources of heterogeneity, including and beyond the role of biological sex.
Sex-dependent and sex-independent atypical interhemispheric interactions in autism
VMHC reflects inter-hemispheric homotopic relations. Its strength has been suggested to index coordinated cross-hemispheric processing: stronger VMHC indexes weaker hemispheric specialization and vice versa [
33,
70]. Several lines of evidence support the notion that the neurobiology of autism is related to atypical hemispheric interactions, including homotopic connectivity and hemispheric lateralization [
35,
71‐
80]. VMHC and functional hemispheric lateralization have also been shown to be sex-differential in NT [
33,
81,
82]. The dorsolateral occipital association cortex identified in our discovery analyses is known to serve hemispherically specialized processes, such as visuospatial coordination [
83]. Thus, our finding that VMHC in dorsolateral occipital cortex was lower in NT males than in NT females is consistent with the notion of increased hemispheric lateralization in this cortical region in NT males relative to NT females. In our data, females with autism instead showed even lower VMHC than NT males, while males with autism showed slightly higher VMHC than NT males. This pattern is indicative of ‘gender incoherence’ [
20], as males and females with autism display the pattern opposite to that expected in NT individuals of the same biological sex. Findings of ‘gender incoherence’ have been reported in earlier neuroimaging studies of autism using different modalities [
3,
84,
85]. Among them, several R-fMRI studies explicitly focusing on detecting sex-by-diagnosis interactions (i.e., the regression model included a sex-by-diagnosis interaction term) [
3,
11] yielded a pattern of results consistent with ‘gender incoherence.’ In contrast, other studies [
12‐
14] reported a pattern consistent with the ‘extreme male brain’ model [
19], i.e., a shift towards maleness in both females and males with autism. While the seemingly diverging conclusions of these two sets of studies may be attributed to methodological differences, such as the extent of brain networks explored and the statistical modelling employed, findings from our prior work suggest that shifts towards maleness and towards femaleness co-occur in the intrinsic brain function of males with autism, in a network-specific manner [
2]. However, that prior work did not include female data. Thus, because sex-by-diagnosis interactions were not directly assessed, unlike in the present study, those results could not distinguish patterns reflecting diagnostic differences between the sexes from those that are common to autism across sexes [
4]. This is relevant for efforts focusing on identifying underlying mechanisms. Findings resulting from sex-by-diagnosis interactions may shed light on sex-differential mechanisms that are atypical in autism and may reflect sex-specific susceptibility mechanisms. On the other hand, atypicalities common for both sexes may reflect factors central to the emergence of autism, regardless of whether they overlap with patterns known to be differential between sexes [
86]. Interestingly, a recent study based on a sample selected from GENDAAR [
16] revealed that the iFC between the nucleus accumbens (selected a priori) and a region of the dorsolateral occipital cortex partially overlapping with that identified by our VMHC analyses, was differentially modulated by the aggregate number of oxytocin receptor risk alleles in females with autism versus NT females and versus males with autism. Although VMHC was not directly tested in the said study [
16], its result in dorsolateral occipital cortex is consistent with our observation of atypical sexual differentiation of this visual network region; together, these findings suggest the need for future whole-brain studies of oxytocin effects in autism.
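To make the metric discussed in this section concrete, the following is a minimal Python sketch of a voxel-mirrored homotopic connectivity computation. It is an illustration under stated assumptions, not the pipeline used in this study: it assumes the 4D data are already preprocessed and registered to a left-right symmetric template, that the first array axis is the left-right axis, and that VMHC is taken as the Fisher z-transformed Pearson correlation between each voxel's time series and that of its mirrored counterpart. The function name vmhc_map and the array layout are hypothetical.

```python
import numpy as np


def vmhc_map(data, eps=1e-12):
    """Illustrative VMHC: correlate each voxel with its left-right mirror.

    data: 4D array (x, y, z, time). ASSUMPTION: the volume is in a
    left-right symmetric space and axis 0 is the left-right axis.
    Returns a 3D array of Fisher z-transformed correlations.
    """
    # Mirror the volume across the midline by reversing the x axis.
    mirrored = data[::-1, :, :, :]

    # Demean each time series so the dot product below is a covariance.
    a = data - data.mean(axis=-1, keepdims=True)
    b = mirrored - mirrored.mean(axis=-1, keepdims=True)

    num = (a * b).sum(axis=-1)
    den = np.sqrt((a ** 2).sum(axis=-1) * (b ** 2).sum(axis=-1))

    # Voxels with (near-)constant signal get r = 0 instead of NaN.
    r = np.where(den > eps, num / np.maximum(den, eps), 0.0)

    # Clip so the Fisher transform stays finite when r is exactly +/-1.
    r = np.clip(r, -0.999999, 0.999999)
    return np.arctanh(r)
```

In this sketch, higher values indicate stronger coupling between homotopic voxels, which the text above interprets as weaker hemispheric specialization.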
Along with the sex-dependent autism patterns, our analyses found statistically significant main effects of diagnosis in inter-hemispheric interactions indexed by VMHC in distinct cortical circuits. These were localized along the midline of the DN (paracingulate/frontal cortex consistently and PCC/precuneus) where main effects of PCC-iFC and ReHo also converged. Our results are consistent with prior reports of atypical intrinsic organization of the DN in autism [
12,
23,
26,
87‐
89]. Together they support a common, sex-independent role of DN in autism. This is also highlighted by a recent autism neurosubtyping study that identified three latent iFC factors, all sharing DN atypicalities along with their neurosubtype-specific patterns [
90]. Building on this evidence to disentangle the specific role of each of the factors affecting autism in sex-independent and sex-dependent ways, a necessary next step is to engage in novel large-scale data collection efforts including more female data.
Robustness, replicability and sources of variability
The growing awareness of the replication crisis in neuroscience [
91‐
93] motivated our analyses examining robustness and replicability of findings. While a comprehensive and systematic reproducibility assessment is beyond the scope of the present study, here we focused on examining whether the findings observed in the discovery analyses were also seen after using different preprocessing pipelines (robustness), as well as in fully independent samples of convenience that were not harmonized a priori with each other (replicability). To this end, given the lack of consensus on quantitative metrics of replicability, we opted to use measures of effect size. These are considered complementary to null hypothesis significance testing [
94]. In the context of this study, given the use of convenience samples of different sizes, effect sizes were considered an advantageous and practical means to provide information on the magnitude of group differences in diagnosis and sex, as well as their interaction. Here, we considered findings to be robust and/or replicable when effects were non-negligible (i.e., partial eta squared, ηp² ≥ 0.01 [
95]). We reasoned that given their distributed and heterogeneous nature [
6], atypicalities in the autism connectome can stem from a combination of differently sized non-negligible effects, as shown for autism in other biological domains such as genetics [
96,
97].
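The effect-size criterion described above can be sketched in Python as follows. This is a minimal illustration, not the code used in this study. The conversion from an F statistic to partial eta squared is the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error); the function names, and the idea of summarizing replicability as the fraction of discovery clusters whose effect in an independent sample meets the 0.01 threshold, are assumptions made for the example (for instance, the sketch does not check the direction of the effect).

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and degrees of freedom positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def is_non_negligible(eta_p2, threshold=0.01):
    """True when the effect size meets the non-negligible threshold from the text."""
    return eta_p2 >= threshold


def replicability_rate(effect_sizes, threshold=0.01):
    """Fraction of clusters with a non-negligible effect in an independent sample.

    ASSUMPTION: one partial eta squared value per discovery cluster; this
    summary is hypothetical and ignores the sign of the group difference.
    """
    if not effect_sizes:
        raise ValueError("at least one effect size is required")
    hits = sum(1 for eta_p2 in effect_sizes if is_non_negligible(eta_p2, threshold))
    return hits / len(effect_sizes)
```

For example, under these assumptions an interaction term with F = 4.0 on 1 and 396 degrees of freedom corresponds to ηp² = 4 / (4 + 396) = 0.01, which sits exactly at the threshold.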
With this in mind, across the two preprocessing methods examined here, the patterns of findings were consistent with those observed in discovery analyses across all R-fMRI metrics and effects. These robust results are consistent with a prior study by He and colleagues [
46] reporting that differences in a wider range of pre-processing pipelines have marginal effects on variation in diagnostic group average comparisons. Our study confirms and builds on this earlier report by extending findings of robustness to sex group mean differences and their interactions with diagnosis.
A more nuanced picture emerged from the inter-sample analyses, as replicability varied by sample, across the effects and R-fMRI metrics examined. Specifically, while inter-sample main effects of sex were moderately to largely replicable across R-fMRI metrics in both independent samples (~50 to 80% of the clusters in GENDAAR and EU-AIMS LEAP, respectively), replicability of diagnostic effects significantly varied by sample (86 to 29%) across R-fMRI metrics. This is at least in part consistent with findings by King et al. [
50] who showed that, depending on the R-fMRI feature examined, diagnostic group differences varied across samples. Even in this scenario, King et al. [
50] also reported that findings of decreased homotopic connectivity in autism were relatively more stable than those for other R-fMRI metrics. This observation, combined with the replicability of our VMHC sex-by-diagnosis interaction findings in the larger of the two independent samples (EU-AIMS LEAP), suggests that measures of homotopic connectivity may have specific biological relevance for autism. It is also possible that, given its moderate to high test–retest reliability, VMHC is more suitable for efforts assessing replicability [
98,
99].
The striking clinical and biological heterogeneity in autism should be considered a major contributor to discrepancies among the findings of studies focusing on diagnostic group mean contrasts and interactions [
100‐
103]. Against this background, we interpret our replicability findings on diagnostic effects and, in turn, diagnosis-by-sex interactions. Inter-sample differences may have contributed to the more variable replicability of the diagnosis main effects. These may include autism symptom level, age, and IQ, although secondary analyses suggested that the examined IQ range did not substantially affect the pattern of discovery results. For example, the EU-AIMS LEAP sample was on average older, had lower VIQ and, most notably, lower symptom severity across all subscales of the ADOS and ADI-R than the ABIDE sample. On the other hand, the GENDAAR sample (which had a greater number of replicable diagnostic mean group patterns) did not differ from ABIDE in these variables, except for mean age. Furthermore, an often neglected fact is that the NT groups may also present with considerable sample heterogeneity between studies [
100,
104]. For instance, our NT controls in the EU-AIMS LEAP sample had lower VIQ than both ABIDE and GENDAAR NT controls. This may have contributed to the low replicability of diagnosis main effects in EU-AIMS LEAP.
In contrast, the sex-by-diagnosis interaction effect on VMHC in the dorsolateral occipital cortex was replicable in the larger sample, EU-AIMS LEAP, but not in GENDAAR. Small samples introduce larger epistemic variability (i.e., greater variation related to known and unknown confounds) [
105]. Increasing the number of subjects or the amount of data helps mitigate epistemic variability and, thus, capture the underlying variability of interest. Accordingly, although the rate of replicability for the main effect of diagnosis in EU-AIMS LEAP was limited, accounting for biological sex, a known key source of variability in autism, may have allowed a replicable sex-by-diagnosis pattern to emerge in this larger sample. In line with sample size concerns, using four datasets sized between 36 and 44 individuals selected from the ABIDE repository, He et al. [
46] found low similarity rates of diagnostic group-level differences on the strength of iFC edges in contrast with the largely similar pattern of results across pipelines. Of note, unlike prior efforts [
46,
50], we controlled for site effects within each of the samples (i.e., ABIDE, GENDAAR and EU-AIMS LEAP), using ComBat. Future large-scale harmonized data collections are needed to control and assess the impact of inter-sample variability. Taken together, these findings highlight that sample differences can impact replicability.
Beyond clinical and biological sources of variation, samples may differ in MRI acquisition methods, as well as in approaches used to mitigate head motion during data collection and its impact on findings [
106]. Adequately controlling for head motion remains a key challenge for future studies assessing inter-sample replicability. For the present study, we excluded individuals with high motion, retained relatively large samples with low group-average motion (mean ± standard deviation of mFD range = 0.09–0.16 ± 0.06–0.10 mm), and included mFD as a nuisance covariate in the second-level analyses. Overall, the extent to which each sample-related factor affects replicability needs to be systematically examined in future well-powered studies. Only studies of this type will allow emerging subtyping approaches to dissect heterogeneity by brain imaging features using a range of data-driven methods [
107,
108], including normative modelling [
72,
109,
110].
Inter-sample differences and methodological differences, beyond nuisance regression, may have contributed to some differences in findings between the present and earlier studies, conducted with independent or partially overlapping samples [
11,
23,
41]. For example, Alaerts et al. [
11] also examined sex-by-diagnosis interaction in PCC-iFC in a dataset selected from ABIDE I only. Although their pattern of results was consistent with the ‘gender incoherence’ model, the resulting circuit(s) did not involve the dorsolateral occipital cortex as identified with VMHC in the present study. Along with differences in samples selected from the same data repositories, other methodological choices may also affect results. For example, prior studies differed from the present one in the inclusion of sex-by-diagnosis interaction [
17], the extent of the whole-brain voxel-based analyses [
15,
16], or the statistical threshold utilized [
23]. Nevertheless, it is remarkable that even in light of these differences, consistent results have emerged including the overarching atypical inter-hemispheric interactions in autism, and sex-dependent and sex-independent atypical intrinsic brain function across distinct functional networks.