INTRODUCTION

Bipolar disorder is a chronic and recurrent illness, with a lifetime incidence of at least 1% (Kessler et al, 1994). Although the illness is defined by the occurrence of mania, the depressed phase predominates (Judd et al, 2002, 2003) and represents the greatest therapeutic challenge. Pronounced neurocognitive dysfunction is also frequently described in symptomatic bipolar patients and there is increasing evidence of specific impairments which may persist in euthymia and therefore represent a relatively enduring abnormality (Ferrier et al, 1999; Ferrier and Thompson, 2002; Thompson et al, 2000). It has been suggested that abnormalities in hypothalamic–pituitary–adrenal (HPA) axis function may cause or exacerbate both neurocognitive impairment and depressive symptoms (McQuade and Young, 2000; Sapolsky, 2000).

Indirect evidence for this link is found in conditions, such as Cushing's syndrome, which are characterized by a chronic elevation of endogenous cortisol levels and have consistently been shown to be associated with significant neurocognitive impairment (Forget et al, 2000; Mauri et al, 1993; Starkman et al, 2001; Whelan et al, 1980) and a high incidence of depression, which notably resolves with correction of the hypercortisolaemia (Dorn et al, 1997).

In healthy volunteers, both acute (Lupien and McEwen, 1997) and subchronic (Young et al, 1999) administration of the synthetic steroid, hydrocortisone, causes reversible impairments in neurocognitive function. Several studies have reported reduced verbal declarative memory function (Newcomer et al, 1999). This may be the result of a specific deficit in memory retrieval (de Quervain et al, 2000, 2003), although there is evidence to suggest that working memory function may be more sensitive than declarative memory to the effects of elevated corticosteroid levels (Lupien et al, 1999; Young et al, 1999).

In mood disorders, the greatest incidence of HPA axis abnormalities is found in bipolar and psychotic unipolar disorder (Rush et al, 1996) and reduction of cortisol levels in these conditions may therefore ameliorate depression and improve neurocognitive functioning (Reus and Wolkowitz, 2001). In keeping with this view, preliminary data suggest that cortisol synthesis inhibitors may be antidepressant (Brown et al, 2001). However, they are associated with a significant side effect burden and their efficacy may be compromised by the increased production of other neuroactive steroids.

At high doses, the progesterone receptor antagonist mifepristone (RU-486) is an antagonist of the glucocorticoid receptor (GR) subtype of corticosteroid receptor. Preliminary reports have found that mifepristone and the novel GR antagonist ORG-34517 have antidepressant effects in both psychotic and nonpsychotic unipolar depression, particularly in subjects with high rates of hypercortisolaemia (Belanoff et al, 2002; Høyberg et al, 2002).

We therefore sought to establish proof-of-concept for the use of GR antagonists in the treatment of bipolar disorder. We hypothesized that mifepristone administration would both enhance neurocognitive functioning—specifically in domains that are most sensitive to the effects of elevated corticosteroids—and improve depressive symptoms.

METHODS

Subjects

Patients aged 18–65 years with a diagnosis of bipolar disorder, confirmed using the Structured Clinical Interview for DSM-IV (SCID) (First et al, 1995), were recruited from services in the North East of England. A specific attempt was made to recruit those with residual depressive symptoms. Illness characteristics, clinical ratings, and medication history were determined by trained psychiatrists using full history, case-note, and medication review and standardized rating scales. Patients' medication had been unchanged for 6 weeks prior to participation and remained so throughout the study period. Seventeen were taking at least one mood stabilizer, with 13 taking at least one antidepressant and 11 taking an antipsychotic.

After a complete description of the study, written informed consent was obtained from all participants; the study received full approval from the local ethics committee.

Procedure

Following an initial baseline assessment of neurocognitive function and mood, and basal neuroendocrine profiling (day 0), patients were randomly allocated to receive either 600 mg mifepristone (taken orally at 0800 once a day) or placebo for 7 days. Administration of medication was in a double-blind design. Mood ratings were taken after the week's treatment (day +7) and then at weekly intervals (day +14 and day +21). At day +21, the groups crossed over and the alternative treatment (placebo or mifepristone) was administered for 7 days, again with ratings taken following the week's treatment (day +7) and at weekly intervals (day +14 and day +21). Neurocognitive function was assessed on three occasions over the study period: at baseline and at day +21 after each treatment. Neuroendocrine profiling was performed at baseline, after the week's treatment period (day +7) and then at day +21.

Neurocognitive Testing

Based on previous research on the effects of corticosteroids on neurocognitive function (de Quervain et al, 2000, 2003; Lupien et al, 1999; Newcomer et al, 1999; Young et al, 1999), it was predicted that the principal cognitive domains which would be most sensitive to changes in HPA axis function were working memory and verbal declarative memory. The primary neurocognitive battery therefore consisted of two tests:

  1. The Spatial Working Memory Task: This computerized test of working memory from the Cambridge Neuropsychological Test Automated Battery (CANTAB; CeNeS Pharmaceuticals, Cambridge, UK) requires subjects to search through an increasing number of (three, four, six, and eight) boxes to locate hidden tokens. As the token is never located in the same box more than once, ‘between search errors’ are committed when the subject returns to search a box in which a token has previously been located.

  2. The Rey-Auditory Verbal Learning Test (Rey-AVLT): This test of verbal learning includes indices of initial and delayed recall and recognition. A list of 15 words (List A) is read out to the subject five times, and they are required to recall it after each trial. A different list of 15 words (List B) is then read once, followed by recall of this list. Finally, subjects are required to recall words from List A without an additional presentation of that list. After a 30 min delay, recall of List A is again tested, followed by a recognition trial of words from List A. The number of words correctly recalled or recognized is recorded. Alternative forms of the test were used on each visit.

A secondary battery was also included which examined a broader range of neurocognitive domains, incorporating additional measures of learning and memory, attention, and executive function:

  3. Short-term memory span: This was tested across both phonological and spatial domains. The Wechsler forward digit span test requires subjects to repeat verbatim a string of digits which sequentially increases in length until the consecutive failure of two trials of the same digit span length. The CANTAB spatial span task was utilized to assess the subjects' ability to remember a serial sequence of squares as they change color.

  4. Visuo-spatial learning and memory: This was assessed using the CANTAB pattern and spatial recognition tests. The pattern recognition task requires the subject to learn a series of 12 abstract patterns before being presented with pairs of patterns. Subjects are required to identify the familiar one. The test consists of two sets of 12 stimuli. For the spatial recognition test, the subject must learn the on-screen spatial position of five serially presented squares, with a subsequent forced-choice recognition between two locations. A total of four trials of five stimuli are completed. Alternative forms of both tests were used on each visit.

  5. Executive function: This was tested using an established verbal fluency test (naming words beginning with one of three given letters; 60 s for each) with the overall total correct responses recorded. The Wechsler backward digit span, which requires the monitoring of information held in working memory, was also administered using the same method as the forward span test. Alternative forms of both tests were used on each visit.

  6. Attention: This was assessed using the digit symbol subtest from the Wechsler Adult Intelligence Scale, a test requiring rapid copying of symbols paired with numbers in 90 s. Alternative forms of the test were used on each visit. A computerized continuous performance task, Vigil (Cegalis and Bowlin, 1991), was also employed. In this random-interval ‘A–K’ form, subjects are required to respond to the target letter ‘K’ only when it is preceded by the letter ‘A’ from among a stream of random letters over an 8 min period.

All pen-and-paper tasks were administered according to standardized instructions (Lezak, 1995) and computerized tests from the CANTAB according to the manual protocols, on a personal computer fitted with a color touch-screen monitor. For all subjects, testing began at 1300 and took approximately 75 min to complete.
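The ‘between search error’ rule of the Spatial Working Memory Task described above can be illustrated with a short sketch. This is not the CANTAB scoring software, which is proprietary; the function name, the data layout, and the assumption that tokens are hidden one at a time are all illustrative choices made here.

```python
def count_between_search_errors(searches, token_boxes):
    """Count returns to boxes in which a token has already been found.

    searches    -- box indices in the order the subject opened them
    token_boxes -- token_boxes[i] is the box hiding the i-th token
                   (hypothetical layout: one token hidden at a time)
    """
    found_in = set()   # boxes that have already yielded a token
    errors = 0
    next_token = 0     # index of the token currently hidden
    for box in searches:
        if box in found_in:
            # Returning to a box that has already held a token.
            errors += 1
        elif next_token < len(token_boxes) and box == token_boxes[next_token]:
            # Token found: this box will never hold a token again.
            found_in.add(box)
            next_token += 1
    return errors


# Hypothetical three-box trial: the subject reopens box 2 after
# already finding a token there, which counts as one error.
example_errors = count_between_search_errors(
    searches=[0, 2, 2, 0, 1], token_boxes=[2, 0, 1]
)
```

Revisiting an empty box that has not yet held a token is not counted here, since the definition above concerns only boxes in which a token has previously been located.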

Symptoms

With respect to symptomatic improvement, the antidepressant effect of mifepristone was the principal focus; the outcome measures of interest were therefore the 17-item Hamilton Depression Rating Scale (HDRS17; Hamilton, 1960) and the Montgomery–Asberg Depression Rating Scale (MADRS; Montgomery and Asberg, 1979). Other secondary scales consisted of the Brief Psychiatric Rating Scale (BPRS; Overall and Gorham, 1962) and the Young Mania Rating Scale (YMRS; Young et al, 1978).

Neuroendocrine Assessment

To profile plasma cortisol secretion, subjects were cannulated in the antecubital fossa at 1230 and blood samples collected at 30 min intervals from 1300 to 1600. Subjects fasted throughout this period, remained semi-supine and did not sleep. Cortisol levels were determined by using Corti-cote radioimmunoassay kits (ICN Pharmaceuticals, Costa Mesa, California). The interassay coefficient of variation for cortisol was less than 8%, and the intra-assay variation was less than 9% across the assay range.

Statistical Analysis

Neurocognitive data were analyzed by repeated measures analysis of covariance (ANCOVA) with ‘treatment’ (mifepristone or placebo) and, where tests had more than one level, ‘level’, as the within-subject factors. As differential learning effects may occur depending upon the order of treatment administration, ‘order’ (mifepristone first or placebo first) was entered as a between-subjects factor and ‘baseline’ performance as a covariate. Main effects were further examined as the mean difference (and 95% confidence interval (CI) of the difference) between treatments (mifepristone or placebo), expressed as a change from baseline performance (Altman et al, 2000). Mood symptoms were also expressed as the mean change (95% CI) from baseline for each treatment and analyzed by paired t-test. All cited p-values were two-tailed, with a significance level set at 0.05. Analyses were performed using SPSS version 9 (SPSS, 1998).
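The follow-up comparison described above (mean difference between treatments in change from baseline, with its 95% CI and paired t statistic) can be sketched as follows. The analyses in this study were run in SPSS; the function names and the five-subject data set below are hypothetical and serve only to illustrate the arithmetic. The critical value of t is passed in from standard tables rather than computed, and the ANCOVA itself is not reproduced.

```python
import math
import statistics


def percent_improvement(baseline, post):
    """Percentage reduction in an error score relative to baseline."""
    return 100.0 * (baseline - post) / baseline


def paired_summary(diffs, t_crit):
    """Mean, paired t, df and 95% CI for within-subject differences.

    t_crit is the two-tailed 5% critical value of t for len(diffs) - 1
    degrees of freedom, taken from standard tables.
    """
    n = len(diffs)
    mean = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # standard error of the mean
    return {
        "mean": mean,
        "t": mean / se,
        "df": n - 1,
        "ci": (mean - t_crit * se, mean + t_crit * se),
    }


# Hypothetical error scores for five subjects (not study data).
baseline = [40, 50, 60, 30, 20]
after_mifepristone = [30, 40, 45, 27, 18]
after_placebo = [38, 45, 60, 30, 19]

# Per-subject advantage of active treatment over placebo, in percentage points.
diffs = [
    percent_improvement(b, m) - percent_improvement(b, p)
    for b, m, p in zip(baseline, after_mifepristone, after_placebo)
]
summary = paired_summary(diffs, t_crit=2.776)  # critical t for df=4
```

For the 19 patients analyzed here (df=18), the corresponding critical value would be approximately 2.101.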

RESULTS

One patient was excluded from the study because of self-discontinuation of lithium prophylaxis. Data from 19 patients were available for analysis.

Patients were aged between 26 and 63 years (mean=49 years, SD=11) and had no current or past diagnosis of substance abuse or dependence. At baseline, all patients had persistent depressive symptoms, with 16 fulfilling SCID criteria for current depressive episode (see ratings below). The median length of current depressive episode in the group was 7 months (mean=13.5, SD=15.7). Depressive symptoms had a mean score of 23 (SD=10) on the MADRS and of 18 (SD=10) on the HDRS17. The mean MADRS and HDRS17 scores of the three patients without a specific episode were 8 (SD=5) and 4 (SD=1), respectively. The average YMRS score in the whole group was 4 (SD=4).

Nine patients had previously attempted suicide. The median number of hospitalizations in the group was 3.

Neurocognitive Testing

Data are presented in Table 1.

Table 1 Neurocognitive Test Results

Primary Outcome Measures

A significant ANCOVA main effect of treatment was found in the between search error rate of the spatial working memory task. Subsequent analysis of this significant main effect revealed that, following mifepristone treatment, the error rate was significantly reduced from baseline (t=2.89, df=18, p=0.010). However, no significant change occurred following placebo (t=1.39, df=18, p=0.181). Direct comparison of the treatments revealed a significant advantage of mifepristone over placebo in the percentage improvement (calculated for each individual subject) in error rate from baseline (mean difference=19.8%, 95% CI=4.3–35.2; t=2.69, df=18, p=0.015) (see Figure 1). Order of treatment administration did not appear to be a confounding factor. The improvement following mifepristone was not significantly different in the group who received mifepristone first compared to the group who received it second. Similarly, there was no difference in the response to placebo between these groups (p>0.2 for all). There was also no ANCOVA main effect of order or treatment by order interaction (see Figure 2). There were no significant main effects of treatment on any outcome measure from the Rey-AVLT (total correct, long-term recall or recognition).
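As a consistency check on the direct comparison reported above, the 95% CI can be approximately recovered from the reported mean difference and t statistic alone. The sketch below assumes the interval is the usual mean ± t(critical) × SE, with the standard error back-calculated as mean/t; the critical value 2.101 for df=18 is taken from standard tables. Because the inputs are rounded, the result can only be expected to agree to about one decimal place.

```python
mean_diff = 19.8  # reported mean difference in percentage improvement
t_stat = 2.69     # reported paired t statistic (df=18)
t_crit = 2.101    # two-tailed 5% critical t for df=18, from standard tables

# Standard error implied by the reported mean and t (t = mean / SE).
se = mean_diff / t_stat

# Conventional 95% confidence limits for a paired mean difference.
ci_low = mean_diff - t_crit * se
ci_high = mean_diff + t_crit * se

print(f"implied SE = {se:.2f}; 95% CI = {ci_low:.1f} to {ci_high:.1f}")
```

The limits obtained this way fall within rounding error of the reported interval of 4.3–35.2.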

Figure 1

Mean (SEM) percentage improvement in Spatial Working Memory between search error rate from baseline. See main text for statistics.

Figure 2

Mean (SEM) percentage improvement in Spatial Working Memory between search error rate from baseline following mifepristone or placebo, separated by group (subjects receiving mifepristone first vs those receiving placebo first).

Secondary outcome measures

ANCOVA main effects of treatment were found in both verbal fluency and spatial recognition memory (see Table 1).

For verbal fluency, the number of words correctly produced was significantly greater than at baseline following mifepristone treatment (t=3.34, df=18, p=0.004) with no significant difference following placebo (t=1.57, df=18, p=0.133). Direct comparison of each treatment, expressed as a percentage improvement from baseline, did not significantly differ (mean difference=1.60%, 95% CI=−9.89 to 13.10; t=0.29, df=18, p=0.773). For the spatial recognition task, direct comparison of mifepristone vs placebo, expressed as a percentage change in error rate from baseline, revealed a trend towards a lower error rate following mifepristone (mean difference=27.2%, 95% CI=−1.81 to 56.17; t=1.97, df=18, p=0.064).

Symptoms

At +14 days, following treatment with mifepristone, depression rating scores from the HDRS17 and MADRS had significantly improved from baseline levels (see Table 2). No significant change was observed at any time point following placebo. Direct comparison of the advantage of mifepristone over placebo at this time point (+14 days), however, failed to reach statistical significance for either HDRS17 scores (mean difference=2.32, 95% CI=−2.08 to 6.71; t=1.107, df=18, p=0.283) or MADRS scores (mean difference=2.26, 95% CI=−3.36 to 7.89; t=0.845, df=18, p=0.409).

Table 2 Symptom Ratings at Baseline and Weekly Following Mifepristone or Placebo

An independent samples t-test was used to confirm that the order of treatment administration was not a confounding factor. There was no significant difference in response to the active treatment, between the group receiving mifepristone first or the group receiving it second in either HDRS17 scores (t=0.054, df=17, p=0.958) or MADRS scores (t=0.554, df=17, p=0.587).

Of the secondary scales, BPRS scores were also found to be significantly lower at +14 days following mifepristone treatment, with no change following placebo (see Table 2). Again, however, comparison of the advantage of mifepristone over placebo at this time point failed to reach statistical significance (mean difference=1.11, 95% CI=−3.00 to 5.22; t=0.564, df=18, p=0.579). YMRS scores did not significantly differ from baseline at any time point. A post hoc analysis was performed on all symptom effects, after the exclusion of the three patients who did not fulfill SCID criteria for a current depressive episode. The improvement from baseline at +14 days remained significant for all measures (p<0.05).

Neuroendocrine Measures

A highly significant ANOVA main effect of visit was observed (F=20.6, df=4,68, p<0.0001), with cortisol levels being significantly higher following mifepristone treatment (day +7) compared to all other visits (see Figure 3). A significant diurnal rhythm was evident in the effect of time (F=21.6, df=6,102, p<0.0001), although there was no interaction between visit and time (F=1.18, df=24,408, p=0.29). No other significant effects were observed.

Figure 3

Cortisol levels (nmol/l) at baseline, after 1 week treatment (day +7) and +21 days following mifepristone or placebo.

An exploratory post hoc analysis revealed that the area under the curve (AUC) cortisol output at baseline correlated positively with the percentage improvement in spatial working memory error rate following mifepristone administration (rs=0.460, N=19, p=0.048). No relationship was found between cortisol AUC and the error rate following placebo (rs=0.286, N=19, p=0.235).
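The method used to compute the cortisol AUC is not stated; a common choice for serial samples is the trapezoidal rule, and the sketch below assumes it. The function name and the cortisol values are hypothetical; only the sampling schedule (seven samples at 30 min intervals from 1300 to 1600) is taken from the neuroendocrine protocol.

```python
def trapezoid_auc(times_min, values):
    """Area under a sampled curve by the trapezoidal rule.

    times_min -- sampling times in minutes (ascending)
    values    -- measured concentrations at those times
    """
    if len(times_min) != len(values) or len(values) < 2:
        raise ValueError("need at least two matched samples")
    auc = 0.0
    for i in range(1, len(values)):
        dt = times_min[i] - times_min[i - 1]
        # Area of the trapezoid between consecutive samples.
        auc += dt * (values[i] + values[i - 1]) / 2.0
    return auc


# Minutes after 1300; seven samples at 30 min intervals up to 1600.
times = [0, 30, 60, 90, 120, 150, 180]
# Hypothetical afternoon cortisol profile in nmol/l (not study data).
cortisol = [300, 280, 260, 250, 240, 230, 220]

auc = trapezoid_auc(times, cortisol)  # units: nmol.min/l
```

A per-subject AUC computed in this way could then be rank-correlated with the percentage improvement in error rate, as in the Spearman correlation reported above.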

DISCUSSION

These data suggest that the GR antagonist mifepristone selectively improves neurocognitive function and may be antidepressant in bipolar disorder. Spatial working memory function was significantly improved from baseline compared to placebo (see Figure 1). Subtle improvements in secondary measures of verbal fluency and spatial recognition memory were observed. Ratings of depression (HDRS17 and MADRS) and total BPRS scores were also significantly reduced compared to baseline after treatment with mifepristone, but not after treatment with placebo. The pattern of symptomatic response was identical on all these objective rating scales. The superiority of mifepristone when directly compared to placebo, however, failed to reach significance, possibly due to a lack of statistical power. The symptomatic improvement was evident 2 weeks after the initiation of treatment, faster than would be expected from conventional therapeutic strategies in bipolar disorder. Future studies will need to ascertain how this improvement can be maintained.

GR dysfunction may be of etiological importance in bipolar disorder. This notion is supported by neuroendocrine studies which have shown that 43% of depressed bipolar patients are DST nonsuppressors (Rush et al, 1996) and that the dexamethasone/corticotropin releasing hormone (dex/CRH) test is abnormal during relapse, recovery (Rybakowski and Twardowska, 1999; Schmider et al, 1995; Watson et al, in press) and in apparently healthy subjects with genetic loading for mood disorders (Lauer et al, 1998). It is also supported by post-mortem studies which show evidence of reduced GR mRNA expression in post-mortem brain tissue samples from patients with bipolar disorder (Knable et al, 2001; Lopez et al, 2003; Webster et al, 2002). The efficacy of mifepristone may therefore be secondary to its action at the GR. This is further supported by the finding that many antidepressant drugs increase GR binding and/or number in brain tissue, suggesting that GR regulation may be one aspect of the therapeutic mechanism of action of antidepressants (and mood stabilizers) and that the ability of a drug to regulate GR number may be a good predictor of therapeutic efficacy in patients with hypercortisolaemia (McQuade and Young, 2000).

It may, however, seem paradoxical that a disorder associated with reduced function of the GR may be treated using a GR antagonist. A recent study has reported a persistent reduction in glucocorticoid bioactivity after a single dose of mifepristone (200 mg) which normalized 2 weeks after the treatment (Heikinheimo et al, 2003). This adds support to the notion that mifepristone potentially acts by ‘resetting’ the homeostatic set point of the HPA axis (Belanoff et al, 2002).

Interestingly, RU-486 was the only GR antagonist examined in a recent study to increase both mineralocorticoid receptor (MR) and GR binding in the frontal cortex (Bachmann et al, 2003). This may underpin the selective pattern of improvement in neurocognitive function seen in the present study, which was restricted to tests which have been shown to be sensitive to frontal lobe dysfunction (Owen et al, 1995). The improvement in neurocognitive function was demonstrated at a point at which mood symptoms did not differ either from baseline or when compared to placebo. This suggests that the cognitive enhancing effect is not simply related to improvement in depressive symptoms.

It is perhaps surprising that, given the well-documented effects of corticosteroids at the hippocampus, mifepristone had no effect on verbal declarative memory function. This may be due to a difference in the sensitivity of the tests to detect changes in the relatively small number of patients in the study. Alternatively, it may be due to the timing of the neurocognitive assessments. Due to the preliminary nature of the study and the limited number of times neurocognitive testing can reliably be carried out, the assessments were performed at baseline and then 14 days after cessation of each treatment. This time point was selected so as to avoid the acute effects of the drug (when brain GR would be occupied and peripheral cortisol levels are greatly elevated; see Figure 3) and examine the longer term antiglucocorticoid effects (Heikinheimo et al, 2003). Greater improvements might have been observed if neurocognitive function had been assessed at an earlier time point. Also, the possibility of order effects cannot entirely be ruled out (see Figure 2), but these are difficult to assess because the sample size is reduced when the groups are analyzed separately by order of treatment administration.

There were no drop-outs due to side effects in either phase of the study and no patients experienced a manic relapse. This is the largest double-blind, placebo-controlled study of mifepristone in mood disorders and the first in bipolar disorder. However, the number of patients studied was relatively small and this preliminary result requires confirmation in studies of larger numbers of patients with bipolar depression, preferably in a between-subjects design, thereby avoiding the problems inherent in a crossover design. In addition, although high rates of hypercortisolaemia and other HPA axis abnormalities occur in bipolar disorder (and may influence the response to mifepristone), we cannot be sure of the prevalence of GR dysfunction in the cohort of patients recruited for our study. Nevertheless, the correlation observed in the present study between baseline basal cortisol levels and the neurocognitive response to mifepristone (spatial working memory error rate) may suggest that patients with the greatest HPA axis abnormality respond better to GR antagonists. In future studies, HPA axis function should be fully profiled at baseline, both as a predictor of response to antiglucocorticoids and as a method of ascertaining the degree of ‘normalization’ of the axis following treatment.

The results of the present study add direct support to the notion that the GR is an important modulator of neurocognition and mood in bipolar disorder and that adjunctive administration of drugs that specifically target this receptor may be of therapeutic benefit. Our results require replication but provide preliminary evidence that GR antagonists selectively improve neurocognitive function and may have antidepressant properties in bipolar disorder. Such drugs hold promise for the treatment of bipolar disorder.