Introduction

Expectations shape our experience of reality, and are essential for adaptive behaviour in any complex environment. In the face of danger, negative expectations are formed and dynamically updated by experience, involving associative learning and memory processes. As a key emotional response during the expectation of threat, conditioned fear is essential to trigger adaptive escape or avoidance responses. Learned fear can however also turn maladaptive and contribute to pathology, as underscored by knowledge from fear conditioning accomplished in the context of anxiety, psychological trauma, and stress-related disorders1. More recently, the scope has been broadened to acute and chronic pain2, embedded within the fear-avoidance model3. In keeping with its biological salience, pain is a ubiquitous and fundamentally threatening experience. Interoceptive pain arising from visceral organs appears to be particularly threatening and fear-inducing4,5, resulting in much suffering in highly prevalent disorders of the gut–brain axis like the irritable bowel syndrome (IBS)6. Interoceptive, visceral pain is highly modifiable by cognitions and emotions6,7, including negative expectations during the anticipation of pain as key mediators of nocebo effects8,9,10,11,12. Despite broad clinical implications of nocebo effects reaching far beyond chronic pain13,14,15, the formation and persistence of negative pain-related expectations by classical conditioning remain incompletely understood, especially with respect to neurobiological mechanisms and their possible specificity to threat modality.

Human fear conditioning studies with experimental pain as unconditioned stimuli (US) have implemented exteroceptive, somatic16,17,18,19 or interoceptive, visceral pain as salient threats20,21,22,23, but knowledge about common and distinct threat-specific neural mechanisms remains limited, especially regarding the extinction and retrieval of pain-related fear memories24,25. Combining interoceptive and exteroceptive threats from different sensory modalities, as accomplished herein, constitutes a unique opportunity to elucidate specificity to threat modality in a clinically-relevant context4,26, and is timely given recent conceptual advances regarding interoception27,28 and interoceptive psychopathology29. The experience of multiple threats from different sensory modalities closely mimics the clinical reality of patients with diverse symptoms, especially those with complex comorbidities as they often characterise patients with chronic pain. In fact, any normal environment presents multiple salient threats, which may impact learning and memory processes relevant to nocebo effects that remain incompletely understood even in healthy individuals. Existing studies with multiple threats support the role of the insula, a key region of the salience network, in sensory modality-specific effects underlying aversive expectancy26,30,31,32. The engagement of the salience network, together with regions of the fear and extinction networks, remains to be tested not only for the formation but especially for the extinction of conditioned responses to multiple threats. Conditioned negative expectations may be markedly resistant to extinction, as suggested by studies involving somatic pain stimuli33,34. This may be particularly the case for interoceptive memory traces, as suggested by early classical interoceptive conditioning studies carried out by soviet psychologists35, complemented by modern approaches on fear learning of interoceptive and exteroceptive cues36 and on the partially distinct neural representation of aversive visceral signals37. Impaired extinction efficacy and other phenomena related to memory processes can reportedly facilitate the return of fear and increase the risk of relapse38, with broad implications for the chronicity and treatment of pain and fear-related disorders39.

We herein elucidated the behavioural and neural mechanisms involved in the acquisition and extinction of negative expectations towards different types of interoceptive and exteroceptive threats across sensory modalities. To this end, we analysed data from two independent differential fear conditioning studies with methodological and conceptual overlap, allowing to assess reproducibility, and offering converging insight into conditioned anticipatory responses to threats from different sensory modalities. In both functional magnetic resonance imaging (fMRI) studies, visceral pain induced by rectal distension was implemented as clinically-relevant interoceptive US together with an exteroceptive US, which was either an equally painful thermal stimulus (study 1), or a non-nociceptive, yet equally aversive tone (study 2). In addition to threat modality-specific predictive cues (conditioned stimuli, CS), unpaired safety cues in both studies allowed us to compare conditioned differential responses to interoceptive versus exteroceptive threat predictors for different phases of conditioning. For the acquisition, we tested the general hypothesis that in the face of multiple threats, predictive learning is shaped by the salience of the US, as suggested by preparedness theory40 and the evolutionary significance of the interoceptive modality, as illustrated by one-trial learning phenomena like conditioned nausea and taste aversion41. Given initial evidence that pain-modality shapes not only the perception and processing of stimuli4,5,42, but also anticipatory responses including conditioned fear26,36, we expected greater differential conditioned responses involving regions of the fear and salience networks to cues predicting interoceptive threat. We further tested whether conditioned responses to interoceptive threat predictors are more resistant to effective extinction, involving regions of the extinction network. The return of conditioned responses induced by reinstatement, i.e. unexpected re-exposure to the US, constitutes a promising translational tool to assess extinction efficacy38 that has rarely been applied in brain imaging studies on pain-related fear20. To this end, our paradigms incorporated different reinstatement procedures following extinction phases, allowing us to test in reinstatement-test phases if the interoceptive CS–US association is more susceptible to reinstatement effects.

In sum, conditioned responses to interoceptive threat predictors were enhanced in both studies after the acquisition, consistently involving the insula and cingulate cortex as key regions of the salience network. Our results supported that unexpected exposure to interoceptive threats had a greater impact on extinction efficacy, resulting in disruption of ongoing extinction (study 1), and selective resurgence of interoceptive CS–US associations after complete extinction (study 2). Together, our findings are an important step towards unravelling how negative expectations are shaped by associative learning and memory processes in the face of multiple threats. A more refined understanding of conditioned nocebo effects in the context of clinically-relevant interoceptive and exteroceptive threats may ultimately contribute to an improved consideration of expectancy effects to the benefit of patients, broaden the rapidly evolving scope of the gut–brain axis in health neuroscience and disease43,44,45, and fits into a framework of the cognitive neurosciences interfacing between mind and body27.

Results

Participants

Out of a total of N = 77 healthy adults who participated, N = 12 were excluded due to technical difficulties with MRI data acquisition (N = 6), movement artefacts (N = 4), or failure to reach visceral pain threshold within predetermined maximal distension pressure (N = 2). As a result, we herein report on data from N = 42 volunteers for study 1 (all female, age 34.5 ± 2.0 years; BMI 22.7 ± 0.4 kg/m2), and N = 23 volunteers for study 2 (10 female, age 26.7 ± 1.0 years; BMI 22.4 ± 0.7 kg/m2). Consistent with stringent and highly-parallelised exclusion criteria, in both samples mean gastrointestinal symptom scores (study 1: 3.73 ± 0.4; study 2: 1.65 ± 0.4), and HADS scores for anxiety (study 1: 4.24 ± 0.4; study 2: 3.00 ± 0.4) and depression (study 1: 2.61 ± 0.3; study 2: 1.04 ± 0.3) were low. Chronic stress scores were well-within the normal range (study 1: 36.93 ± 1.5; study 2: 33.57 ± 1.5). Pain thresholds for visceral pain (study 1: 40.0 ± 1.7 mmHg; study 2: 40.2 ± 1.6 mmHg) and for thermal heat pain (study 1: 44.9 ± 0.5 °C) were well-within the range of findings in our previous work4,21,46. Unpleasantness thresholds for auditory stimuli (93.96 ± 1.52 dB) SPL (range: 75–108 dB SPL) were within expected ranges and comparable to previous studies from our own group47,48.

Unconditioned stimuli: acquisition phase

Negative valence was consistently greater for interoceptive compared to exteroceptive threats implemented as US in both studies, despite careful matching to US pain intensity (study 1) and US unpleasantness (study 2) prior to acquisition. This was indicated by significantly greater post-acquisition USVISC unpleasantness ratings compared to USSOM in study 1 (USVISC: 66.14 ± 3.8 mm, USSOM: 24.62 ± 5.1 mm; t(41) = 6.62; P < 0.001; d = 1.43), as well as to USAUD in study 2 (USVISC: 79.30 ± 3.2 mm, USAUD: 62.78 ± 5.3 mm; t(22) = 3.42; P = 0.005; d = 0.75). Note that in study 1, post-acquisition USVISC were also perceived as more intense (USVISC: 77.3 ± 1.5 mm, USSOM: 66.2 ± 2.4 mm; t(41) = 4.23; P < 0.001; d = 0.83), which was highly intercorrelated with US unpleasantness within both modalities (USVISC: r = 0.71; P ≤ 0.001; USSOM: r = 0.65; P ≤ 0.001). At the neural level, USVISC induced enhanced activation when compared to USSOM (Fig. 1a) as well as compared to USAUD (Fig. 1b), involving aINS and dACC in both studies, and additionally the amygdala in study 1 (Table 1). Further, interoceptive USVISC consistently induced lower neural activation compared to both exteroceptive USSOM and USAUD within pINS (Table 1). Of note, these differences between US modalities remained largely unchanged when considering post-acquisition differences in US unpleasantness (study 1, 2) or US intensity ratings (study 1 only) as covariates of no interest (Supplementary Tables S1 and S2). Moreover, conjunction analyses (against global null) revealed shared neural activation in study 1 (USVISC ∩ USSOM) in the aINS, whereas shared activation was detectable in study 2 only in uncorrected whole-brain analyses but not in FWE-corrected ROI analyses (USVISC ∩ USAUD, see Supplementary Table S3).

Fig. 1: Differences in neural activation induced by interoceptive versus exteroceptive threats implemented as unconditioned stimuli (US) during acquisition in studies 1 and 2.
figure 1

Interoceptive threat (USVISC) induced significantly greater neural activation in dACC and aINS compared to both exteroceptive threats (a study 1, compared to USSOM; b study 2, compared to USAUD; all PFWE < 0.05). Moreover, interoceptive and exteroceptive threats induced shared neural activation in different brain regions across studies (full results in Supplementary Table S3). Neural activations in regions of interest were superimposed on a structural T1-image and thresholded at P < 0.001 uncorrected for visualisation purposes; colour bars indicate t-scores. For details, see Table 1. For whole-brain results on differential activation, see Supplementary Fig. S4. aINS anterior insula, AUD auditory, dACC dorsal anterior cingulate cortex, FWE family-wise error, SOM somatic, US unconditioned stimuli, VISC visceral.

Table 1 Differences in neural activation induced by interoceptive versus exteroceptive threats as unconditioned stimuli (US) during acquisition.

Conditioned stimuli: acquisition phase

Repeated CS+-US pairings during acquisition resulted in the conditioned negative valence of all threat predictors in both studies, as evidenced by significant rmANOVA time effects (Supplementary Tables S4 and S5). Interestingly, this increase in negative valence was consistently enhanced for interoceptive (ΔCS+VISC) versus exteroceptive threat predictors (ΔCS+SOM and ΔCS+AUD, respectively), despite comparable contingency awareness between modalities (Table 2). This indication of enhanced conditioned behavioural responses to interoceptive threat predictors was supported by significant time × modality rmANOVA interaction effects in both studies (study 1: F(1,41)=16.71; P < 0.001; ηp2 = 0.29, Supplementary Table S4; study 2: F(1,22) = 8.19; P = 0.009; ηp2 = 0.27, Supplementary Table S5), and between-modality comparisons revealing significantly enhanced conditioned negative valence post-acquisition for ΔCS+VISC (versus ΔCS+SOM: t(41) = 4.19; P < 0.001; d = 0.59; Fig. 2a and Supplementary Table S6; versus ΔCS+AUD: t(22) = 3.28; P = 0.007; d = 0.37; Fig. 2b and Supplementary Table S7; as well as by enhanced SCR for CS+VISC in a subset of participants in study 2, see Supplementary Fig. S1). At the neural level, threat predictors induced shared differential activation in highly overlapping brain regions across studies, including aINS, hippocampus, MCC, dACC as revealed by conjunction analysis (in study 1: ΔCS+VISC ∩ ΔCS+SOM; in study 2: ΔCS+VISC ∩ ΔCS+AUD; full results in Supplementary Table S8). Interestingly, differential neural responses in pINS and MCC were enhanced for ΔCS+VISC compared to both ΔCS+SOM (Fig. 2c), as well as compared to ΔCS+AUD (Fig. 2d), a finding that was strikingly similar between both studies (PFWE < 0.05, Table 3). Of note, these findings were not appreciably altered when considering post-acquisition differences in US ratings as covariates of no interest (Supplementary Tables S9 and S10). Moreover, exploratory correlational analyses revealed significant correlations between CS and US valence, as well as between differential CS- and US-induced activations in peak activations in the insula and cingulate cortices (Supplementary analyses).

Table 2 Contingency awareness across learning phases.
Fig. 2: Behavioural and neural responses to cues (CS) predicting interoceptive versus exteroceptive threats during acquisition in studies 1 and 2.
figure 2

Conditioned predictors of interoceptive threat (ΔCS+VISC) acquired significantly greater negative valence than conditioned predictors of exteroceptive threats after acquisition a study 1, compared to ΔCS+SOM: ***P < 0.001; b study 2, compared to ΔCS+AUD: **P < 0.01; results of Bonferroni-corrected paired t-tests between modalities, full details in Supplementary Tables S4S7. Individual delta (Δ) scores were computed for differential CS valence of each CS+ relative to the CS. Data are presented as individual data points, boxplots, and densities (raincloud plots117). At the neural level, interoceptive threat predictors (ΔCS+VISC) induced enhanced differential neural responses in MCC and pINS compared to both exteroceptive threat predictors c study 1, ΔCS+VISC > ΔCS+SOM; d study 2, ΔCS+VISC > ΔCS+AUD; all PFWE < 0.05; details in Table 3, but threat predictors also induced shared differential activation in overlapping brain regions across studies (full results in Supplementary Table S8). For whole-brain results, see Supplementary Fig. S5. Neural activations in regions of interest were superimposed on a structural T1-image and thresholded at P < 0.001 uncorrected for visualisation purposes; colour bars indicate t-scores. AUD auditory, CS conditioned stimuli, FWE family-wise error, MCC midcingulate cortex, pINS posterior insula, SOM somatic, VAS visual analogue scale, VISC visceral.

Table 3 Differences in neural activation induced by cues (CS) predicting interoceptive versus exteroceptive threats during acquisition.

Conditioned stimuli: extinction phase

Repeated presentations of all CS without any US-induced extinction of conditioned negative valence of threat predictors (for significant rmANOVA time effects observed in both studies, see Supplementary Tables S4 and S5). However, this was shaped by CS type, as supported by significant time × modality interaction effects (study 1: F(1,41) = 5.51; P = 0.024; ηp2 = 0.12; study 2: F(1,22) = 6.36; P = 0.019; ηp2 = 0.22). In study 1 with an immediate extinction and a small number of extinction trials, post-extinction negative valence remained significantly greater for interoceptive compared to exteroceptive threat predictors (ΔCS+VISC: 35.38 ± 7.2 mm, ΔCS+SOM: 9.38 ± 7.0 mm; t(41) = 3.42; P = 0.003; d = 0.57). Contingency ratings indicated no differences between modalities, but an overestimation of true contingencies (CS+VISC: 24.2 ± 5.0%, CS+SOM: 25.0 ± 5.1%, with de facto 0% during extinction; Table 2). On the other hand, in study 2 with extinction accomplished 24 h after acquisition and a larger number of extinction trials, no mid- or post-extinction difference in negative valence between interoceptive and exteroceptive cues was detectable (ΔCS+VISC: 5.00 ± 4.0 mm, ΔCS+AUD: 3.00 ± 2.8 mm; P > 0.99; d = 0.12; for mid-extinction results, see Supplementary Table S7), and contingency ratings were widely accurate, with no differences between modalities (Table 2). At the neural level, we observed shared differential neural activation induced by interoceptive and exteroceptive threat predictors (i.e. ΔCS+VISC ∩ ΔCS+SOM and ΔCS+VISC ∩ ΔCS+AUD, respectively; full results in Supplementary Table S11) within several ROIs, including hippocampus in both studies, as well as vmPFC, pINS, and amygdala (study 1, Supplementary Fig. S2a) and aINS, and dACC (study 2, Supplementary Fig. S2b). No differences between modalities in differential CS-induced neural activation were found in any ROI in either study. However, whole-brain analyses revealed differences in the temporal gyrus (Supplementary Table S12).

Conditioned stimuli: reinstatement-test phase

In the reinstatement-test phase, unpaired CS presentations were implemented immediately after single threat reinstatement in study 1 [i.e. unexpected exposure to only USVISC in one subgroup (N = 22), and to only USSOM in another subgroup (N = 20)], or after multiple threat reinstatement (i.e. unexpected exposure to both USVISC and USSOM in all participants) in study 2. After single threat reinstatement (USVISC (N = 22); USSOM (N = 20)) in study 1, a significant time × modality × group interaction was observed for negative CS valence (F(1,1,40) = 5.56; P = 0.023; η2 = 0.12, see Supplementary Table S4). After multiple threat reinstatement with both USVISC and USSOM in study 2, the interaction effect was not significant (P = 0.055; ηp2 = 0.16, full results in Supplementary Table S5). Planned between-modality comparisons for the POST RST-TEST time point (Supplementary Tables S6 and S7) revealed greater differential negative valence of interoceptive compared to exteroceptive threat predictors in those groups involving USVISC exposure during reinstatement (study 1, USVISC-subgroup, ΔCS+VISC versus ΔCS+SOM, t(21) = 2.73; P = 0.025; d = 0.58; Fig. 3a; study 2, ΔCS+VISC versus ΔCS+AUD, t(22) = 2.49; P = 0.042; d = 0.73; Fig. 3c), whereas no difference was observed in the USSOM-subgroup (study 1, ΔCS+VISC versus ΔCS+SOM, P = 0.355; d = 0.21; Fig. 3b). Exploratory within-group comparisons (PRE-POST) for study 1 showed no significant changes for ΔCS+VISC and ΔCS+SOM (both P > 0.05) in the USVISC-subgroup, whereas in the USSOM-subgroup ΔCS+VISC significantly decreased (P = 0.005 uncorrected, Supplementary Table S6). The same comparisons for study 2 (PRE-POST) revealed a significant increase for ΔCS+VISC (P = 0.038 uncorrected), and no change for ΔCS+AUD (P > 0.05) (Supplementary Table S7).

Fig. 3: Behavioural and neural responses to cues (CS) predicting interoceptive versus exteroceptive threats during reinstatement-test in studies 1 and 2.
figure 3

After reinstatement with unexpected US, negative valence was greater for interoceptive compared to exteroceptive threat predictors in reinstatement groups involving USVISC a study 1, USVISC-subgroup, ΔCS+VISC versus ΔCS+SOM; c study 2, ΔCS+VISC versus ΔCS+AUD; *both P < 0.05, results of Bonferroni-corrected paired t-tests between modalities, full results in Supplementary Tables S4S7, whereas no difference was observed in the USSOM-subgroup b study 1, ΔCS+VISC versus ΔCS+SOM). Individual delta (Δ) scores were computed for differential CS valence of each CS+ relative to the CS. Data are presented as individual data points, boxplots, and densities (raincloud plots117). At the neural level, differential activation induced by interoceptive cues was enhanced in the USVISC-subgroup within dACC and pINS during reinstatement-test d study 1, ΔCS+VISC compared to ΔCS+SOM, all PFWE < 0.05). While no differential activation was observed for ΔCS+VISC compared to ΔCS+SOM in the USSOM-subgroup in study 1 or compared to ΔCS+AUD in study 2, shared differential neural activation was induced by both threat predictors in regions of interest, such as in the insula and cingulate cortex e ΔCS+VISC ∩ ΔCS+SOM in study 1, USSOM-subgroup; f ΔCS+VISC ∩ ΔCS+AUD in study 2. For full results, see Tables 46. For whole-brain results on differential activation, see Supplementary Fig. S7. Neural activations in regions of interest were superimposed on a structural T1-image and thresholded at P < 0.01 uncorrected for visualisation purposes; colour bars indicate t-scores. aINS anterior insula, AUD auditory, CS conditioned stimuli, dACC dorsal anterior cingulate cortex, FWE family-wise error, MCC midcingulate cortex, pINS posterior insula, SOM, somatic, US unconditioned stimuli, VAS visual analogue scale, VISC visceral.

At the neural level, threat predictors induced shared differential neural activation within hippocampus and vmPFC in the USVISC-subgroup (Table 4), and in the pINS in the USSOM-subgroup (study 1, ΔCS+VISC ∩ ΔCS+SOM, Fig. 3e, Table 5). After multiple threat reinstatement in study 2, threat predictors induced shared activation in the hippocampus, aINS, pINS, dACC, and MCC (ΔCS+VISC ∩ ΔCS+SOM, Fig. 3f, Table 6). Differential activation induced by interoceptive versus exteroceptive threat predictors was enhanced within pINS (L: x = −38, y = −20, z = 0, t(21) = 5.23, PFWE = 0.004, R: x = 44, y = −4, z = −4, t(21) = 4.91, PFWE = 0.007) and dACC (L: x = −4, y = 16, z = 22, t(21) = 4.55, PFWE = 0.023) only in the USVISC-subgroup (study 1, ΔCS+VISC compared to ΔCS+SOM, Fig. 3d, Table 4). In contrast, other groups revealed no differences between modalities in differential neural activation induced by interoceptive compared to exteroceptive threat predictors at all (Tables 5 and 6).

Table 4 Neural activation induced by cues (CS) predicting interoceptive versus exteroceptive threats during reinstatement-test after visceral single threat reinstatement in study 1.
Table 5 Neural activation induced by cues (CS) predicting interoceptive versus exteroceptive threats during reinstatement-test after somatic single threat reinstatement in study 1.
Table 6 Neural activation induced by cues (CS) predicting interoceptive versus exteroceptive threats during reinstatement-test after multiple threat reinstatement in study 2.

Discussion

Adaptive human behaviour in complex environments with multiple threats is guided by evolutionary-driven survival strategies that are preserved across species. In the face of imminent threat, learning from experience is particularly fundamental to the ability to identify and remember predictors of danger to facilitate avoidance or escape. As a translational model at the interface of psychology and the neurosciences, Pavlovian conditioning has proven valuable to elucidating behavioural and neural mechanisms underlying conditioned fear during the expectation of threat49,50,51, with widely appreciated clinical implications for anxiety and stress-related disorders1. Herein, we broadened the scope to unravel learning and memory processes underlying negative expectations in the face of multiple threats from different sensory modalities, with a particular focus on the pain. Pain is a ubiquitous and highly salient threat and a crucial part of the organism’s survival system that evokes strong adaptive responses, including cognitive and emotional processes orchestrated within the brain. These guide behaviour not only in response to actual pain experience52, but more importantly also during pain expectation8,53,54. Interoceptive, visceral pain appears to be particularly threatening4,5, engages partly distinct neural representations37, and may have a specific functional role in shaping brain dynamics27. Given the evolutionary significance of aversive signals originating from within our bodies, interoceptive conditioning could evoke greater and more persisting conditioned responses relevant to nocebo mechanisms underlying hypervigilance and hyperalgesia.

In two independent fMRI studies, we implemented specific, yet complementary differential conditioning paradigms to elucidate the acquisition and extinction of conditioned responses to predictors of interoceptive and exteroceptive threats. Experimental visceral pain as clinically-relevant interoceptive US was significantly more unpleasant when compared to the exteroceptive US from two sensory modalities, i.e. exteroceptive somatic pain in study 1 and aversive tone in study 2, despite careful matching to intensity and unpleasantness, respectively, supporting possible differences in habituation processes4. In addition to shared neural activation induced by US of different modalities, interoceptive US interestingly evoked greater neural activation within the anterior insula and dorsal anterior cingulate cortex as key regions of the salience network, with well-established roles in the central integration of interoceptive sensory signals with emotional and cognitive facets55,56,57. These findings, which appeared robust even when considering differences in US ratings as nuisance variables, reproduce and complement earlier efforts to elucidate the specificity of interoceptive visceral pain in shaping aversive anticipation and central pain processing, not only in direct comparison to an exteroceptive painful threat4,26,58, but also to a non-nociceptive, yet a priori equally aversive auditory threat. In line with a notable recent publication detailing a multivariate brain measure, the Neurologic Pain Signature (NPS), for visceral and somatic stimulation across different independent datasets37, our results from two independent studies support the unique salience of interoceptive pain as a US, above and beyond specific yet highly intertwined perceptual characteristics of intensity and unpleasantness, and underscore the suitability of this experimental model to elucidating the role of threat modality in associative learning and extinction processes in a clinically-relevant context.

To assess whether US modality distinctly shapes the formation of learned negative expectations, we accomplished analyses of differential conditioned responses to modality-specific threat predictors (CS). Results for the acquisition phases of both studies showed enhanced differential behavioural and neural responses to interoceptive threat predictors, suggesting preferential learning for the visceral modality. This was supported at the behavioural level by greater increases in the negative valence of interoceptive versus exteroceptive predictive cues, which were observed despite comparable contingency awareness, and greater visceral cue-induced SCR responses suggested by exploratory analyses of a subset of data. Within the brain, we documented shared differential activation to all threat predictors compared to safety cues in highly overlapping brain regions across studies, in line with a recent meta-analysis documenting a consistent and robust pattern of an ‘extended fear network’ across diverse fear conditioning paradigms50, as well as a meta-analysis supporting that pain-related and non-pain-related conditioned fear recruits overlapping but distinguishable neural networks24. Importantly, we also consistently demonstrated differences between interoceptive versus exteroceptive threat predictors in both studies. Specifically, conditioned interoceptive threat predictors induced greater differential neural activation in the posterior insula and midcingulate cortex. A recent meta-analysis focusing on pain anticipation supported an interplay of insular and cingulate regions in the representation of the affective qualities of sensory events, particularly applying to interoceptive signals59, in line with our own recent findings documenting the relevance of posterior insula in visceral compared to somatic pain expectation26. Given the well-established role of the posterior insula in restoring and maintaining homoeostasis in the face of imminent danger60, its distinct involvement may serve adaptive modulatory functions during the expectation of interoceptive threat. Our findings extend knowledge from other brain imaging studies on the modality-specific aversive expectancy that have compared predictors of somatic pain with aversive pictures31 or disgusting odours30, and complement our own data on nocebo effects and underlying mechanisms in visceral pain8,9,11,21. These observations suggest a specific relevance of insular together with cingulate regions in the preferential acquisition of the presumably more salient interoceptive CS–US association. In keeping with the notion that expectations dynamically influence perception and learning61,62, the reciprocal impact of interoceptive threats and their predictors is further substantiated by our exploratory correlational results. These not only indicate that affective qualities of interoceptive versus exteroceptive threats shape conditioned negative expectations, but also suggest a tight link between differential neural responses during the expectation and experience of aversive interoceptive signals. Together, our findings regarding the formation of negative interoceptive expectancies by aversive conditioning support our hypothesis that in the face of multiple danger signals indicating bodily harm, visceral pain evokes preferential interoceptive fear learning. These findings could be viewed as a modern replication of classical interoceptive conditioning studies carried out by soviet psychologists35, complemented herein by brain imaging techniques. They are in keeping with preparedness theory40, and support its applicability to pain-related learning in a broader context of the affective neurosciences, with intriguing putative clinical relevance. The role of fear and hypervigilance is increasingly appreciated in the pathophysiology and treatment of multiple complex and overlapping clinical conditions, including anxiety and chronic pain54. Modality-specific conditioning could therefore contribute to unravelling nocebo mechanisms relevant to vulnerability to chronicity and treatment failure, especially when nocebo effects persist rather than extinguish.

Persisting or resurging fear constitutes a core target of cognitive-behavioural treatment approaches like exposure therapy, which is essentially built on the successful and robust extinction of conditioned responses including learned fear. When the threat is no longer present, extinction of conditioned responses to former threat predictors is adaptive, allowing behavioural flexibility in rapidly changing, complex environments. At the same time, the initially acquired memory trace is preserved and can be dynamically reactivated63, which can contribute to impaired extinction efficacy and to relapse in clinical contexts64. This may be particularly relevant for highly salient and fear-evoking threats that are crucial to avoid, like interoceptive pain, as essentially already suggested by the Soviet pioneers of classical conditioning35. To elucidate threat modality-specific extinction processes and their underlying neural mechanisms, we tested whether the visceral CS–US association is more resistant to extinction and more susceptible to memory reactivation or ‘relapse’, induced by unexpected US exposure (i.e. reinstatement). In an effort to model different aspects of extinction learning, including the clinical reality of patients with waxing and waning symptoms, we herein implemented different experimental extinction and reinstatement protocols. Interestingly, when omission of the US occurred directly after acquisition on the same day in study 1, behavioural results indicated persisting conditioned fear in response to visceral but not somatic pain predictors, despite comparable contingency awareness. While these results may indicate a greater resistance to extinction specifically for the interoceptive CS–US association, a cautious interpretation is warranted given the small number of extinction trials and an overall overestimation of reported CS–US contingency awareness. In study 2, with an extinction phase accomplished on a subsequent study day and more extinction trials, conditioned behavioural responses were no longer evident to either predictive cue, not even in a supplementary analysis of a smaller number of extinction trials, and hence do not support a modality-specific resistance to extinction. However, given that our earlier conditioning work repeatedly documented rapid and full extinction in 1-day paradigms with visceral threats only20,65, together the present findings could hence also indicate that full extinction of conditioned emotional responses to multiple threats requires more unreinforced trials, especially for threats of higher salience and immediate extinction learning. In light of increasing knowledge regarding the role of consolidation and reconsolidation in the context of conditioned fear66,67, our divergent findings in studies 1 and 2 call for more mechanistic studies on the temporal dynamics and boundary conditions of pain-related extinction learning in multi-day and multi-threat paradigms, ideally including objective, physiological measures derived from electrodermal activity or pupillometry recordings. Regarding brain imaging results for the extinction phase, no differences were observed in neural responses to threat predictors of different modalities in either study. Instead, shared neural activation induced by both threat predictors during extinction was evident in both studies, involving key areas of the extinction network, particularly the hippocampus, supporting its general role in extinction learning68 irrespective of threat modality, or of the number and timing of extinction trials.

Although extinction efficacy is clearly relevant to chronicity and treatment failure in patients69,70, underlying mechanisms remain incompletely understood even in healthy individuals. Reinstatement constitutes a promising translational tool38, which has not been applied in human brain imaging studies in the context of multiple threats. While we implemented different reinstatement procedures in the two studies, all involved unexpected US exposure followed by unpaired cue presentations. This allowed us to analyse differential conditioned responses during a reinstatement-test phase for different (former) threat predictors as an indicator of extinction efficacy. Our behavioural findings provide at least partial support for the notion that extinction efficacy may be reduced for the interoceptive CS–US association, i.e. that reinstatement with the visceral US had a greater impact on differential responses to CS. After reinstatement involving unexpected exposure to the visceral US, either as a ‘single threat’ (USVISC-subgroup of study 1) or as a ‘multiple threat’ (all participants in study 2), we observed greater negative valence of former interoceptive versus exteroceptive threat predictors. On the other hand, reinstatement with the somatic US alone (USSOM-subgroup of study 1) did not induce differences between CS modalities. Our hypothesis is most clearly supported by the results of study 2, in which reinstatement induced a selective resurgence of the interoceptive CS–US association, consistent with a return of interoceptive fear. Interpretation of findings in study 1 is complicated by the fact that extinction was immediate and shorter, as explained above, and evidently did not lead to a complete resolution of conditioned responses. Herein, reinstatement can rather be conceptualised as a disruption of the ongoing extinction process, which would make a return (i.e. a de novo increase) of conditioned responses difficult to detect. While we did not plan for this, and results may be hampered by limited statistical power given relatively small sample sizes of reinstatement groups, applicability to real-life scenarios is intriguing: Unexpected and unsignaled threats like painful episodes can obviously occur at any time point during an ongoing extinction process. Understanding how easily such a process can be disrupted would hence inform our understanding of adaptive extinction learning with relevance to factors that may interfere with successful exposure-based treatment. Given these considerations, one interpretation of data from study 1 is that unexpected exposure to interoceptive threat disrupted or in fact halted the ongoing extinction of conditioned responses to interoceptive threat predictors, whereas the extinction process for the same predictors effectively continues after unexpected exposure to exteroceptive threat, as supported by within-modality comparisons. Interestingly, residual effects of cue aversiveness have been shown to predict a reinstatement effect in healthy volunteers71, consistent with evidence from a clinical setting72. Hence, residual fear responses after extinction may serve as a predictor of treatment outcome. In patients with persistent fear and/or pain, achieving a robust and sustained extinction of conditioned responses constitutes an important treatment goal, and maybe particularly challenging for interoceptive memory traces based on our data in healthy individuals. In other words, we may be primed to preferentially learn, store, and remember cues that signal internal harm.

Within the brain, after reinstatement with visceral US alone or in combination with the somatic US, shared differential neural responses to both threat-predictive cues were consistently observed in the hippocampus as a core region of the extinction network. This finding is well in line with earlier studies implementing only one threat modality73,74,75, including visceral pain in healthy individuals76 and in patients with chronic visceral pain77,78. In addition to the hippocampus, posterior insula and cingulate regions were differentially activated following reinstatement with visceral threats alone, a neural activation pattern that closely resembled neural responses detected during the acquisition, yet not observed during extinction. These enhanced differential responses within the insula and cingulate cortex during reinstatement-test may reflect the reactivation of the excitatory memory trace, presumably triggering preparatory responses in expectation of the reoccurrence of threat. Notably, single threat visceral reinstatement (study 1), resulted in enhanced differential neural responses for interoceptive compared to the exteroceptive threat predictors, whereas multiple threat reinstatement (study 2) led to shared differential activation of these regions to both interoceptive and exteroceptive threat cues. The latter finding may be explained by generalisation effects induced by the unexpected exposure to multiple threats in close temporal proximity. This may promote a generalisation of threat value from the more salient interoceptive to the less salient exteroceptive threat, ultimately resulting in the reactivation of neural responses to all former threat predictors regardless of their salience. While this is speculative, the inability to adequately differentiate between stimuli of different threat value, particularly when confronted with recent adversity, has previously been discussed as one mechanism contributing to enhanced relapse risk in clinical populations undergoing extinction-based treatment74. Together, these observations underline the critical importance of factors promoting either discrimination or generalisation of conditioned responses in the prediction and prevention of fear relapse79,80, extending the concept that as a result of conditioning, impaired discrimination of conditioned81 and unconditioned responses82 could play a role in the development of chronic pain.

Conditioned interoceptive fear should generally be considered adaptive, and an essential component of evolutionary-driven survival behaviour. However, when contextualised within a broader nocebo framework in which conditioned negative expectations drive maladaptive avoidance, hypervigilance and hyperalgesia8,83,84, putative clinical implications and future directions are noteworthy. Our findings support that interoceptive threat predictors may more readily evoke conditioned fear, which could drive the transition from acute to chronic pain as well as symptom chronicity, especially in vulnerable individuals. Furthermore, the risk for impaired extinction efficacy and relapse phenomena may be more pronounced in the context of aversive interoceptive signals, especially in combination with stress83,85, which demonstrably amplifies visceral nocebo effects86, and may contribute to a negative recall bias about aversive visceral experiences87. Given the evidence supporting altered extinction learning in patients with chronic pain, including IBS77,78, translational research in clinical populations is urgently needed. While our results are limited by the lack of complete SCR data as a biological marker of learning as well as by a more definite exclusion of pain-modality-specific vascular artefacts induced by gasping or other respiratory or movement-related effects that could be more closely inspected whether pulse oximetry or respiration had been measured, they do provide a more refined understanding of conditioned nocebo effects in the context of clinically-relevant interoceptive and exteroceptive threats. Merging our clinically-driven perspective with the rapidly expanding general literature on interoception and predictive processing provides opportunities for the development or refinement of computational models based on the precision of interoceptive versus exteroceptive signals, advancing not only the definition of salience itself but also clarification its mechanistic basis88,89,90,91,92,93. The translation of this knowledge may ultimately help understand and minimise negative expectancy effects in patients with chronic pain94,95, especially in disorders of gut–brain interactions, adds a brain perspective to the eloquent claim that the gut is ‘smart’ due to its capability to learn and remember96, and supports further efforts towards extinction-based treatment approaches for these highly prevalent conditions10,97.

Methods

Participants

For the purposes of this report, we analysed unpublished data from healthy volunteers who were recruited to serve as controls in two conceptually connected fMRI conditioning studies conducted within a collaborative research unit (SFB 1280 ‘Extinction Learning’, funded by the German Research Foundation). We utilised data from healthy volunteers recruited as part of a patient study (study 1), and included data from the placebo arm of a pharmacological study (study 2; German Clinical Trials Register, registration ID: DRKS00016706). Recruitment and screening of all healthy volunteers in both studies followed highly-standardised and established procedures in our line of visceral pain research4,86. An initial structured telephone screening was followed by a personal interview and a medical examination. Interview and examinations were accomplished in a medically-equipped room within a clinical research unit at the University Hospital Essen, Germany. Exclusion criteria common to both studies were <18 or >45 years of age, body mass index (BMI) < 18 or >30, and any known medical condition or regular medication use (except thyroid medication and hormonal contraceptives). The usual exclusion criteria for magnetic resonance imaging (MRI) applied, and structural brain abnormalities were ruled out upon structural MRI. Perianal tissue damage (e.g. haemorrhoids, fissures), which may interfere with rectal balloon distensions were excluded by digital rectal examination. Pregnancy was excluded with a commercially available urinary pregnancy test (Biorepair GmbH, Sinsheim, Germany) on the day of the experiment. Prior participation in any previous or other ongoing studies involving pain-related conditioning was also exclusionary. Standardised questionnaires were used to screen for recent gastrointestinal complaints98, symptoms of depression or anxiety (Hospital Anxiety and Depression Scale, HADS)99, as well as to confirm right-handedness100,101. As part of a comprehensive psychosocial questionnaire battery, chronic perceived stress was also assessed (Trier Inventory of Chronic Stress, TICS)102. All the participants reported normal hearing and normal or corrected-to-normal vision. The work was conducted in accordance with the Declaration of Helsinki, and studies were approved by the ethics committee of the University Hospital Essen (protocol numbers 10–4493 and 16–7237), and followed the relevant ethical guidelines and regulations. All volunteers provided written informed consent and were paid for their participation.

Overview of study designs and procedures

To elucidate the formation and extinction of conditioned responses to threat-predictive cues (conditioned stimuli, CS) in the face of multiple different biologically-salient threats, we implemented two differential delay conditioning studies with visual CS+ predicting interoceptive threat (visceral pain: USVISC) or exteroceptive threat (study 1: somatic thermal pain, USSOM; study 2: aversive auditory stimulus, USAUD, study 2), and unpaired CS. All experimental procedures were conducted in the MRI-suite of the Institute of Diagnostic and Interventional Radiology and Neuroradiology at the University Hospital Essen, Germany. In both studies, perceptual thresholds for each US modality were initially assessed, and individual US stimulus intensities for implementation during conditioning were identified with a calibration and matching procedure (for details, see below). During conditioning, both studies implemented the same sequence of experimental learning phases, namely acquisition, extinction, and reinstatement-test phases (Fig. 4, details below). In study 1, all phases were accomplished consecutively on a single study day, whereas in study 2, extinction and reinstatement-test phases were implemented 24 h after acquisition. During all learning phases, fMRI was applied to assess shared and differential neural activation induced by US and CS. US- and CS-related behavioural measures were acquired for each phase using digitised visual analogue scales (VAS). Moreover, electrodermal activity was continuously recorded aiming for analysis of skin conductance responses (SCR) as a psychophysiological measure of learning using an MRI-compatible system (Biopac Systems, Inc., Goleta, CA, USA; MP100 in study 1, MP160 in study 2), but technical difficulties resulted in incomplete data. Results of exploratory analyses of skin conductance responses to CS for a subset of participants in studies 1 and 2 are provided as Supplementary Material (Supplementary Fig. S1). Note that participants were informed that the study goals were to investigate neural mechanisms underlying visceral pain-related learning and memory processes. Importantly, no detailed information was provided about experimental phases, or about the contingencies between CS and US.

Fig. 4: Schematic overview of study designs.
figure 4

All participants in studies 1 and 2 underwent acquisition (ACQ), extinction (EXT), and reinstatement-test (RST-TEST) phases. As for unconditioned stimuli (US), visceral pain (USVISC) and either equally painful somatic pain (study 1, USSOM), or equally-unpleasant auditory stimuli (study 2, USAUD) were implemented during acquisition (ACQ) and reinstatement (RST). As conditioned stimuli (CS), distinct visual geometrical symbols were paired with US (CS+VISC; CS+SOM; CS+AUD) or were presented without US (CS) during acquisition (differential delay conditioning). All CS were presented without US during EXT and RST-TEST. RST procedures involved unsignalled US from one modality (‘single threat reinstatement’ in study 1: USVISC in one subgroup; USSOM in another subgroup) or from both modalities (multiple threat reinstatement in study 2: USVISC and USAUD in all participants). During all phases, functional magnetic resonance imaging (fMRI) was accomplished to assess shared and differential CS- and US-induced neural activation in regions of interest. Before and after each phase, behavioural measures were acquired with visual analogue scales (VAS).

Unconditioned stimuli (US)

For interoceptive US (USVISC), applied in both studies, pressure-controlled rectal distensions were carried out with a barostat system (modified ISOBAR 3 device, G & J Electronics, Toronto, ON, Canada). Graded distensions of the rectum with an inflatable balloon constitute a well-established experimental model to assess visceroception and visceral pain, especially in the context of IBS103. The distension model allows the controlled and finely-tuned application of distensions inducing mild, intermediate or strong sensations of urgency, discomfort, and pain that closely resemble aversive visceral sensations experienced by patients, which are also commonly but less frequently experienced by healthy persons. Building on our long-standing experimental expertise with different sensory modalities4,47,48,86, for the exteroceptive US, cutaneous thermal stimuli (USSOM) were applied on the left ventral forearm with a thermode (PATHWAY model CHEPS; Medoc Ltd. Advanced Medical Systems, Ramat Yishai, Israel) in study 1. In study 2, an aversive tone (USAUD) with a saw-tooth waveform profile and a frequency of 1 kHz created using Audacity 1.3.10-beta (http://www.audacity.sourceforge.net/) was presented by an MRI-compatible sound system (Amplifier mkll+S/N 2016-2-2-03, MR confon GmbH, Magdeburg, Germany) bi-aurally through headphones. Individual perceptual thresholds for US are reported herein only for the purpose of descriptive characterisation of the study samples, as they primarily served as anchors for US calibration and matching.

In a continuation of our previous work on specificity to pain modality4,26,86, USVISC and USSOM were matched to perceived pain intensity in study 1. In study 2, USVISC were compared to a non-nociceptive, yet equally aversive USAUD by matching to perceived unpleasantness. In both studies, visceral stimuli served as an anchor for calibration and matching of exteroceptive US, aiming to identify individual US stimulation intensities within a predefined perceptual range of 60–80 mm (assessed on 0–100 mm VAS, ends labelled not painful and extremely painful in study 1, and not unpleasant and very unpleasant in study 2). To this end, a distension pressure 5 mmHg below the individual rectal pain threshold was initially chosen and rated on VAS until pressure within the predefined range was identified. This was successfully accomplished in both studies (VAS ratings for USVISC: 70.1 ± 0.9 mm in study 1; 65.8 ± 3.6 mm in study 2). For matching, visceral stimuli were presented with thermal (study 1) or auditory stimuli (study 2), respectively, and participants were prompted to compare the stimuli on a response device with Likert-type response options indicating more, less, or equally painful (study 1) or unpleasant (study 2) stimuli. If the rating showed a deviation, the intensity of exteroceptive stimuli was successively adjusted until ratings indicated equal perception at least twice consecutively. Of note, stimulus durations for interoceptive and exteroceptive US were adjusted for each individual, aiming at matched durations of ascending and plateau phases of US stimulation (20 s study 1; 14 s study 2). For additional details, see Supplementary Methods.

Note that after matching, in study 1 a short adaptation phase was accomplished in the MR scanner to accommodate for possible habituation effects previously observed for thermal pain stimuli4. This involved a short series of unsignaled US presentations (i.e. five visceral and five heat pain in pseudorandomized order), followed by another matching procedure when necessary. Supplementary control analysis of movement data (linear and degree movement) was accomplished for the unsignaled US delivered in the habituation phase conducted prior to the acquisition, in order to explore possible pain-modality-specific movements (e.g. due to gasping) potentially confounding differential neural activation in subsequent experimental phases (Supplementary Fig. S3).

Based on these careful matching procedures, the following stimulus intensities for implementation during acquisition and reinstatement were identified: For USVISC distension pressures, 34.7 ± 1.6 mmHg in study 1, 35.9 ± 1.8 mmHg in study 2; for USSOM thermode temperature, 45.1 ± 0.3 °C; for USAUD loudness, 94.74 ± 1.18 dB SPL (range: 89–108 dB SPL), all in line with our earlier results involving the application of the same pain4,86 or auditory stimuli47,48.

Experimental phases

During acquisition, both studies involved three distinct conditioned stimuli (CS), which were contingently paired with the interoceptive US (CS+VISC) and one exteroceptive US (CS+SOM or CS+AUD, respectively), or remained unpaired (CS). Visual geometric symbols served as CS, and allocation of a specific CS symbol to a specific US (USVISC, USSOM, or USAUD) or designation as CS was counterbalanced across participants. Within each study, participants were pseudo-randomly assigned to a different order sequence of CS–US pairings to avoid potential sequence effects. The programming of pairings aimed for an essentially pseudorandomized order, but avoided more than two successive pairings of one modality, and ensured that sequences alternatingly started with an interoceptive or exteroceptive CS–US pairing. All acquisition sequences were also balanced for the number of CS+ and CS presentations. The number of CS presentations and reinforcement schedules were similar in the two studies (study 1: 10 presentations per CS, 8 CS–US pairings, 80% reinforcement; study 2: 12 presentations per CS, 10 CS–US pairings, 83% reinforcement). All CS+ were presented 6–12 s before US, with CS and US co-terminating. Inter-stimulus intervals consisted of a black screen with a white frame (durations: 5–8 s).

During extinction and reinstatement-test phases, CS was presented without any US. Given earlier evidence of rapid extinction for CS–USVISC associations20,65 and to ensure tolerability of total scanning time for participants in study 1 (all phases accomplished consecutively), the number of extinction and reinstatement-test trials, respectively, was lower (5 presentations per CS) than in study 2 (12 presentations per CS). Subsequent to the extinction phase, reinstatement procedures involving the unsignaled and unexpected re-exposure to the US were accomplished, followed by a reinstatement-test phase, consisting of the same number of CS presentations as during extinction in pseudorandomized order. Given a lack of human reinstatement studies involving the multiple US, and unresolved methodological challenges in the field38, we implemented different reinstatement procedures in an effort to provide procedure-specific, yet complementary data. In study 1, participants were pseudo-randomly assigned to subgroups undergoing a reinstatement procedure with either interoceptive US alone (4 USVISC, N = 22) or the exteroceptive US alone (4 USSOM, N = 20). By doing so, we aimed to test for reinstatement effects after unexpected re-exposure to threat from one modality (single threat reinstatement) within each reinstatement subgroup. In study 2, all participants (N = 23) underwent a reinstatement procedure with both interoceptive and exteroceptive US (3 USVISC, 3 USAUD), aiming to test for reinstatement effects after unexpected re-exposure to threats from multiple modalities (multiple threat reinstatement). Note that the number of unexpected US presentations was chosen based on work in the field38 and our own earlier studies20,77,104. All US intensities and durations implemented as part of reinstatement procedures were identical to those applied during acquisition.

Behavioural measures

For the purposes of concise and parallelised US- and CS-related behavioural and neural analyses across different sensory modalities in two independent studies, we focussed our behavioural data analysis on unpleasantness ratings as a clinically-relevant indicator of emotional valence. Emotional valence is relevant to all types of threat, shapes the perception of aversive stimuli, including pain105, and drives threat-related behaviours like approach and avoidance106. It is highly relevant to the specificity of visceral pain26,42, and sensitive to modulation by placebo/nocebo mechanisms8,10,107,108. Prior pain-related conditioning studies from our own group (reviewed in refs. 8,109) and in the broader fear conditioning literature support the notion that conditioned changes in cue valence constitute a sensitive and relevant behavioural measure capturing the formation, as well as the extinction and return of fear responses in healthy adults110 and clinical populations106.

All ratings were accomplished on digitised VAS in the scanner using an MRI-compatible hand-held fibre optic response system (LUMItouchTM, Photon Control Inc., Burnaby, BC, Canada) before and after learning phases (for specific assessment time points, see Fig. 4; note that in study 2, an additional VAS rating was accomplished mid-extinction (after 6 trials), which we report on in the Supplementary Table S7). In study 1, VAS anchors were labelled ‘very pleasant’ (−100 mm) and ‘very unpleasant’ (+100 mm), with the word ‘neutral’ (0 mm) marked in the middle of the digitised VAS, as accomplished in our previous conditioning work involving painful US20,21,65,77,86,111. In study 2, VAS anchors were labelled ‘not at all unpleasant’ (0 mm) and ‘very unpleasant’ (100 mm), consistent with our prior work across sensory modalities47,48. For the purposes of this report and in light of differing scales, we exclusively analysed differential CS valence, computed as individual delta (Δ) scores for each CS+ relative to the CS for each learning phase and within each study group. This allows phase-specific comparisons of ΔCS+VIS vs. ΔCS+SOM in study 1 and ΔCS+VIS vs. ΔCS+AUD in study 2, in keeping with contrasts computed for brain imaging analyses (see below). Note that we additionally acquired perceived intensity of USVISC and USSOM in study 1; for findings dedicated to elucidating the contributions of intensity versus unpleasantness in the context of visceral pain specificity, see our earlier work4,26 and Supplementary analyses herein using these ratings as a covariate of no interest for fMRI data analyses (Supplementary Tables S12, 9-10-S9).

To elucidate cognitive awareness of the specific CS–US pairings for each phase, we report contingency awareness as a secondary behavioural measure. To this end, at the conclusion of each experimental phase, for each CS a VAS with ends labelled ‘never’ (0 mm) and ‘always’ (100 mm) assessed the perceived probability (0–100%) of a US following a specific CS, as previously described20,77,104. Note that we herein report contingency awareness with a focus on differences between modalities. Statistical analyses of the accuracy of CS–US associations require more complex computations (for an approach, see ref. 112), which is beyond the scope herein.

Statistical analyses and reproducibility of behavioural data

Statistical analyses of behavioural data were accomplished separately for each study using IBM SPSS Statistics for Windows, version 20 (IBM Corp., Armonk, N.Y., USA). For ΔCS valence, 2 × 2 repeated-measures analyses of variance (rmANOVA) with the factors time (pre, post) and modality (interoceptive, exteroceptive) were computed for each experimental phase, applying the Greenhouse-Geisser correction when the assumption of sphericity was violated. Given our hypotheses and to ensure readability, we provide statistical details on time × modality interaction effects in the main manuscript; full rmANOVA results including all main and interaction effects are given in Supplementary Tables (Supplementary Tables S4 and S5). Two-tailed paired t-tests were computed as planned comparisons for two purposes: (1) To test for hypothesis-driven differences between modalities in ΔCS valence, US valence, and contingency awareness at specific time points within studies and experimental groups (all results reported in the main manuscript; further details in Supplementary Tables S6 and S7); and (2) to explore differences in ΔCS valence within modalities across time points (PRE-POST; full results reported in Supplementary Tables S6 and S7). Only Bonferroni-corrected P-values are reported within the main manuscript; full uncorrected results of all paired t-tests are provided in Supplementary Tables S6 and S7. For rmANOVA, effect sizes are reported as partial eta squared (ηp2); for t-tests, effect sizes are provided based on Cohen’s d for correlated designs113. Correlational analyses were accomplished using Pearson’s r. Results are reported as mean ± standard error of the mean (SEM).

All data are expected to be reproducible given the same settings and procedures as described herein.

Brain imaging data acquisition and analyses

All MR images were acquired using a whole-body 3 Tesla scanner (Skyra, Siemens Healthcare, Erlangen, Germany) equipped with a 32-channel head coil. For functional imaging, single-shot echo-planar imaging (EPI) sequences with similar settings were used (identical for both studies: TE 28.0 ms, flip angle 90°, GRAPPA r = 2 with 38 transversal slices angulated in the direction of the corpus callosum, slice thickness of 3 mm, slice gap 0.6 mm, voxel size 2.3 × 2.3 × 3.0 mm; study 1: TR 2300 ms, FOV 220 × 220 mm2, matrix 94 × 94 mm2; study 2: TR 2400 ms, FOV 240 × 240 mm2, matrix 104 × 104 mm2). Structural images were acquired prior to functional imaging using the same T1-weighted 3D-magnetisation prepared rapid gradient echo (MPRAGE) sequence in both studies [repetition time (TR) 1900 ms, echo time (TE) 2.13 ms, flip angle 9°, field of view (FOV) 239 × 239 mm2, 192 slices, slice thickness 0.9 mm, voxel size 0.9 × 0.9 × 0.9 mm3, matrix 256 × 256 mm2, Generalised Partially Parallel Acquisitions (GRAPPA) r = 2].

Functional images were analysed with SPM software (SPM12, Wellcome Trust Centre for Neuroimaging, UCL, London, UK) implemented in Matlab (R2016b, Mathworks Inc., Sherborn, MA, USA). A standard realignment procedure as implemented in SPM12 was performed for the estimation of six parameters for translation (linear: x, y, z (mm)) and for rotation (degree: pitch, roll, yaw (°)) to describe the rigid body transformation between each image and a reference image. Subsequently, functional images were co-registered to individual T1-weighted structural images used as reference images, with the origin set to the anterior commissure. Functional images were normalised to Montreal Neurological Institute (MNI) space using a standardised International Consortium for Brain Mapping (ICBM) template for European brains as implemented in SPM12, and smoothed using an isotropic Gaussian kernel of 8 mm. To correct for low-frequency drifts, a temporal high-pass filter with a cut-off set at 128 s was implemented. Serial autocorrelations were taken into consideration by means of an autoregressive model first-order correction.

First-level analyses were performed using a general linear model applied to the EPI images. The time series of each voxel was fitted with a corresponding task regressor that modelled a box car convolved with a canonical hemodynamic response function (HRF). As regressors, CS type (CS+VISC; CS+SOM/CS+AUD; CS) and US modality (USVISC; USSOM/USAUD, only in analyses of acquisition phases) were included. For analyses of CS-induced activations, durations were used exactly as implemented in the experiments (jittered between 6 and 12 s before US presentation), for analyses of US-induced activations, ascending and plateau phases of US stimulation were included in analyses (20 s in study 1, 14 s in study 2). Six realignment parameters for translation and rotation were additionally implemented as multiple regressors for motion correction. After model estimation, the following first-level contrasts and respective reverse contrasts were computed for analyses of differential CS-related and US-related neural responses separately for each study group: CS+VISC > CS, CS+SOM > CS, USVISC > USSOM for study 1; CS+VISC > CS, CS+AUD > CS, USVISC > USAUD for study 2. CS contrasts were computed for each phase, US contrasts only for the acquisition phase.

On the second level, for analyses of US-induced neural activation, one-sample t-tests based on these differential first-level contrasts and paired t-tests were calculated. For analyses of CS-induced differential neural activation, paired t-tests were computed for each experimental phase to compare ΔCS+VISC versus ΔCS+SOM in study 1 ({CS+VISC < CS} > {CS+SOM < CS}; {CS+VISC > CS} > {CS+SOM > CS}) and ΔCS+VISC versus ΔCS+AUD in study 2 ({CS+VISC < CS} > {CS+AUD < CS}; {CS+VISC > CS} > {CS+AUD > CS}). Additional exploratory analyses included selected covariates of no interest, as indicated in the results. Further, extending our earlier findings revealing not only distinct but also shared neural activations for US4 as well as CS26 across modalities, conjunction analyses using first-level contrasts were carried out to identify joint activations (i.e. CS+VISC > CS ∩ CS+SOM > CS, USVISC ∩ USSOM for study 1; CS+VISC > CS ∩ CS+AUD > CS, USVISC ∩ USAUD for study 2). Conjunction analyses were computed (a) using the minimum statistic to the conjunction null to test for shared activation within all tested subjects, and (b) using the minimum statistics to the global null to test for shared activation within some subjects114,115. For correlational analyses exploring associations between differential CS and differential US activation in specific ROI (provided in Supplementary analyses), parameter estimates were extracted for peak-voxels in significant regions of interest (ROIs) as identified by one-sample t-tests.

All analyses focused on a priori defined ROIs of the salience and extinction networks26,51,55,56,57,66,68, including the insula (anterior, aINS; posterior, pINS), subregions of the cingulate cortex (midcingulate cortex, MCC; dorsal anterior cortex, dACC), amygdala, hippocampus, and ventromedial prefrontal cortex (vmPFC). All ROI analyses were carried out using unilateral anatomical templates constructed from the WFU Pick Atlas (Version 2.5.2), as implemented in SPM12. Segmentation of the insula (aINS, pINS) and cingulate cortex (dACC, MCC) was accomplished with masks based on the previous literature116 within the borders of the Wake Forest University (WFU) Pick Atlas. For all reported ROI analyses, family-wise-error (FWE) correction for multiple testing was used with statistical significance set at PFWE < 0.05, and coordinates refer to the MNI space. Supplementary whole-brain analyses (uncorrected P < 0.001) were additionally carried out (Tables 1, 36; Supplementary Tables S13, S812; Supplementary Figs. S4S7 for visualisation). Note that the results presented within the main manuscript all focus on ROI analyses unless explicitly specified otherwise.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.