Introduction
Decision-making is a cognitive process that consists of choosing one option among several alternatives. It progresses from the exploration of unknown options to the exploitation of preferred ones (de Visser et al. 2011a, b, c). During this process, the decision maker evaluates the value of each option with regard to his/her own preferences and the probability of obtaining it, which leads him/her to choose one strategy over another. Such strategies are featured in the Iowa gambling task (IGT), a decision-making task that mimics real-life situations by reproducing uncertain conditions based on probabilistic rewards or penalties (Bechara et al. 1994). During this task, subjects have to discover implicitly, over time, which options are advantageous in the long term, while also discovering that these options are not available under fixed and predictable contingencies. Two categories of behavior are usually observed: a main one, which consists of choosing the options that are advantageous in the long term, and less frequent ones, which do not (Bechara et al. 1999, 2002). Using a variant of the IGT in a healthy population, Bechara et al. (2001, 2002) showed the existence of extreme strategies and of a Gaussian distribution of performance.
One of these two extreme strategies, observed in a small proportion of healthy subjects, is often reinforced in psychopathological conditions in which alteration of prefrontal networks is a hallmark, such as schizophrenia (Brown et al. 2015), depression (Cella et al. 2010), pathological gambling (Clark et al. 2013), or addiction (Balconi and Finocchiaro 2015). Furthermore, adolescents with disruptive behavior disorders and vulnerability to addiction more frequently show risky decision-making (Schutter et al. 2011), and addicted adult patients are more focused on reward, which changes their internal state and inner sensation (Paulus and Stewart 2014). It has also been shown that anxious subjects are more likely to focus on internal body-centered cues than on environmental cues (Galván and Peris 2014) and are thus less likely to adapt to changing environments (Robinson et al. 2015). Altogether, these findings suggest that inter-individual traits are associated with specific strategies during decision-making tasks, likely mediated by defective prefrontal cortex activation and/or defective monoaminergic innervation.
Decision-making processes require the coordinated activity of multiple brain networks, especially those involving the prefrontal cortex (PFC) (Li et al. 2010). Furthermore, the interaction of a limbic loop (affective/emotional) and a cognitive loop (executive/motor) is necessary for adapted decision-making (de Visser et al. 2011a, b, c; Koot et al. 2013). After a loss following a high-risk choice, healthy subjects exhibit enhanced PFC activation, whereas anxious subjects exhibit enhanced activation of the amygdala and insula (Van den Bos et al. 2013). In addition, prefrontal dopamine levels depend on the emotional content of the decision-making task (Parasuraman et al. 2012), and dopamine transmission modulates the response of the brain regions involved in the anticipation and reception of rewards (Dreher et al. 2009). The COMT (catechol-O-methyltransferase) gene polymorphism that leads to an increased level of endogenous dopamine and the serotonin transporter (5-HTTLPR) polymorphisms have both been associated with decision-making impairments (Heitland et al. 2012; Homberg et al. 2008; Malloy-Diniz et al. 2013). However, the results concerning serotonin (5-HT) are somewhat contradictory (Gendle and Golding 2010; Heitland et al. 2012; Homberg et al. 2008; Koot et al. 2012; Lage et al. 2011; Macoveanu et al. 2013; Pittaras et al. 2013; Stoltenberg et al. 2011; Zeeb et al. 2009).
Several authors adapted the IGT in rodents (van den Bos et al. 2014) to study sex differences (van den Bos et al. 2012), neurobiological substrates (de Visser et al. 2011a, b; Fitoussi et al. 2014; Homberg et al. 2008; Koot et al. 2012; Pais-Vieira et al. 2009; Peña-Oliver et al. 2014; Pittaras et al. 2013; Rivalan et al. 2013; Van Enkhuizen et al. 2013; Zeeb et al. 2009; Zeeb and Winstanley 2011) and environmental (Koot et al. 2013; Van Hasselt et al. 2012; Zeeb et al. 2013) or physiological features (de Visser et al. 2011a; Koot et al. 2012; Pais-Vieira et al. 2009) of decision-making processes. So far, the existence of inter-individual differences in decision-making has been linked to specific behaviors (Rivalan et al. 2009, 2013) and differential neuronal activation (Fitoussi et al. 2014; Rivalan et al. 2009).
As C57BL/6J mice are widely used in neurobehavioral studies worldwide, studying various features of their inter-individual variability could bring novel insight into their cognitive performance in general. These mice are genetically homogeneous, so finding neurobiological markers that match individual profiles is expected to provide a robust basis for explaining the emergence of different strategies during decision-making and, eventually, for understanding which regional neurochemical levers could act on these individual traits of behavioral maladjustment. Moreover, we provide here, for the first time, another way of considering individual strategies during decision-making.
Discussion
We demonstrated here inter-individual differences among healthy inbred mice during a decision-making task, as already shown with a variant of the IGT in humans (Bechara et al. 2002) and with the rat gambling task (Rivalan et al. 2009). We confirm and extend our previous report (Pittaras et al. 2013) that healthy C57BL/6J mice behave differently in a mouse gambling task (MGT) and that these behavioral differences rely on neurochemical and brain activation specificities. Solving the MGT first requires an exploration phase, in which mice acquire information about each option, and then an exploitation phase, in which mice use their knowledge about the putative value and risk associated with each option (de Visser et al. 2011c). This knowledge necessarily remains imperfect, as the response-outcome association is probabilistic. In the exploration phase, mice did not differ from each other. Inter-individual differences emerged only during the exploitation phase. At the end of the MGT, the 54 mice, as well as the 24 mice used for immunochemistry, exhibited the same global evolution and inter-individual differences as reported previously (Pittaras et al. 2013). Furthermore, the percentage of advantageous choices made by the mice followed a Gaussian-type distribution (Fig. S2B), similar to what was observed in a healthy human population during a variant of the IGT (Bechara et al. 2002). As in humans and rats, the largest subgroup of mice (44 %, “average”) preferred advantageous options without neglecting alternative, potentially riskier, choices. Although we cannot rule out the hypothesis that these mice would improve their performance if given a couple more training sessions, we have evidence that their strategy differed from that exhibited by the other subgroups in the fifth session. We have unpublished data showing that two more sessions of the MGT did not change average preferences. A smaller subgroup of mice (29 %, “safe”) preferred long-term advantageous choices and progressively avoided exploring other options by developing rigid behavior, making few switches and choosing arms associated with fewer quinine pellets (even if mice did not eat them). Another smaller subgroup (27 %, “risky”) continued to explore all available options throughout the experiment despite a low probability of getting a reward. Therefore, the MGT allows us to characterize three subgroups of animals according to their decision-making strategies.
In the elevated plus maze (EPM), risky mice presented the same profile as during the MGT, i.e., explorative and non-anxious behavior. This increased exploration of risky or ambiguous options was not associated with a general increase in locomotion or novelty exploration, or with a deficit in working memory (Fig. S3). Furthermore, their performance in the MGT was not due to an inability to distinguish large from small rewards, because risky mice performed normally during the delay-reward task (Fig. 3). In addition, the expected sucrose preference (Ping et al. 2012) was observed only in the safe and average groups, not in the risky group. This apparently surprising result could explain why risky mice were more attracted by novelty exploration than by food reward and thus, when subjected to the MGT, continued to visit various arms, including those likely to contain quinine. Altogether, these findings suggest that risky mice make choices independently of the probability of obtaining quinine or reward. In this regard, it is noteworthy that they did not show more activity in the insular cortex, which is associated with disgust (Chapman and Anderson 2012). Since food reinforcement is associated with decreased DA and 5-HT in the hippocampus and prefrontal cortex (González-Burgos and Feria-Velasco 2008), the high basal levels of monoamines in the hippocampus of risky mice (Figs. 5d, h, S6D) may prevent them from establishing an appropriate action-outcome relationship. In addition, as DA and 5-HT in the hippocampus are necessary for learning and memory (González-Burgos and Feria-Velasco 2008), risky mice may be more prone to explore and learn spatial cues and hence to rely on external information by maintaining the exploration phase.
It has been shown that 5-HT plays a key role in the top-down control of decision-making (Van den Bos et al. 2013). However, some authors found that a low level of extracellular 5-HT is linked with poor performance during decision-making (Heitland et al. 2012; Homberg et al. 2008; Koot et al. 2012; Pittaras et al. 2013; Zeeb et al. 2009), while others did not (Gendle and Golding 2010; Homberg et al. 2008; Lage et al. 2011; Macoveanu et al. 2013; Stoltenberg et al. 2011). Here, we observed that risky mice had a high level of 5-HT in the prelimbic (PrL) and insular (CIns) cortices and a low level of 5-HT in the orbitofrontal cortex (OFC). We suggest that unbalanced 5-HT levels between the different prefrontal areas, specifically between the OFC and the PrL, lead to more exploratory behavior despite potential risks.
Altogether, these data show that, in a healthy mouse population, some mice maintained exploration of available options even when these were associated with uncertain outcomes. A high level of 5-HT, DA and NA in the hippocampus and a low level of 5-HT in the OFC are expected to be markers of this extreme pattern of choices. It has been shown that sensation-seeking, risk-taking and high reactivity to novelty predict a propensity to initiate cocaine self-administration (Belin et al. 2008, 2011). In addition, the level of 5-HT in the OFC plays a key role in the top-down control of decision-making (Van den Bos et al. 2013). In light of these data, risky mice could be good models of vulnerability to addiction or pathological gambling.
Safe mice strongly preferred advantageous options during the MGT. However, they did not systematically choose the arm associated with the larger reward and did not earn more pellets than average mice (Fig. 2b): their apparently more efficient strategy, which drives them away from exploration and penalty (quinine pellets), is in fact accompanied by rigid behaviors.
It has been shown that lesions of the OFC or PrL lead to maladaptive decision-making (Granon et al. 1994; Rivalan et al. 2011). In addition, it was proposed that the exploration phase requires the activation of the limbic loop and the exploitation phase the activation of the cognitive loop, at the expense of the limbic loop (de Visser et al. 2011a; Koot et al. 2013). This is indeed what we observed, as safe mice exhibited hypoactivation of the OFC and of the NAcc at the end of the task (Fig. 4a), two brain areas that are part of the limbic loop. Notably, safe mice also exhibited reduced activation of the cognitive loop, specifically the PrL area, compared with the other subgroups. Hypoactivation, in safe mice, of brain regions involved in the integration of both limbic and cognitive information could explain their high rigidity score at the end of the task. Indeed, the OFC, NAcc and PrL are known to be necessary for flexible behaviors (Boulougouris et al. 2007; Floresco et al. 2009; Mihindou et al. 2013; Young and Shapiro 2009). Moreover, c-fos protein activity in the PrL was negatively correlated with the animal’s performance and rigidity score; this reinforces the view that low PrL activity is expected to be a marker of rigid behavior (Floresco et al. 2009). Since safe mice appropriately evaluated reward value in the sucrose preference test (Fig. 3a) as well as in the delay-reward task (Fig. 3d), their choices in the MGT are likely to be guided by penalty avoidance, to the detriment of exploration and flexibility. The low level of risk-taking of safe mice in the EPM supports this hypothesis. The monoamine pattern of safe mice is congruent with results obtained in monkeys showing that inflexible behaviors are associated with the regional balance of DA and 5-HT (Groman et al. 2012).
Altogether, these data show that, in a healthy mouse population, some mice favor safe strategies to avoid risk and penalty. Hypoactivation of brain areas involved in both the limbic and cognitive loops, together with a high level of 5-HT in the OFC and a low DA level in the CPu, is expected to be a marker of rigid but safe behavior. It has been shown that anxious subjects performing a risky decision-making task exhibited hypoactivation of the PFC in the loss condition (Galván and Peris 2014). Moreover, anxiety disorders during adolescence confer an increased risk of depression during adulthood (Galván and Peris 2014; Kendall et al. 2004; Pine et al. 1998). Although our safe mice did not show a generally higher level of anxiety under our current experimental conditions, their propensity to prefer conservative and rigid choices could be a trait marking vulnerability to anxiety. This prediction remains to be investigated.
The results of the current study indicate that inter-individual differences exist within inbred healthy mice and can be explained by specific network activity or regional neurochemical markers. For a social group, having different behavioral profiles could be an advantage if individuals share outcomes. At the individual level, we characterized three different profiles: mice mostly driven by risk avoidance and internal cues; mice that preferred exploring novel options, even those associated with putative risks (these mice were mostly driven by environmental cues); and a third, larger subgroup of mice exhibiting choices balanced between the two extreme profiles, and therefore showing adaptive decision-making.
In conclusion, we show for the first time that mice subjected to the MGT cope differently with uncertainty and can exhibit extreme patterns of choice and strategy, either rigid or flexible, related to specific monoaminergic and behavioral markers. We expect this work to open the way to the identification of valuable individual markers of vulnerability to psychiatric disorders.