Introduction

Suicidality is a major public health concern. Approximately 800,000 people die by suicide annually worldwide [1]. Concerns have been raised that the suicide rate will increase further with the social isolation, barriers to care, and economic uncertainty secondary to coronavirus disease 2019 (COVID-19) [2]. It is estimated that 90% of individuals who die by suicide have a psychiatric illness [3], and a substantial proportion of them have major depressive disorder (MDD) [4]. Suicidality affects individuals with treatment-resistant depression (TRD) even more severely than those with MDD responsive to treatment: evidence suggests that 30% of those with TRD attempt suicide during their lives, a rate two- to fourfold higher than in individuals with MDD responsive to treatment [5]. Treatments targeting suicidality in patients with TRD are needed.

Guidelines for the treatment of suicidality are limited, with evidence mostly for reduction in suicidal ideation (SI) as opposed to suicide attempts and completions. Current options with reliable evidence of anti-suicidal effects include lithium [6], clozapine [7], ketamine [8, 9], electroconvulsive therapy (ECT) [10], and certain psychotherapies [11]. Emerging evidence also suggests a role for repetitive transcranial magnetic stimulation (rTMS) in the treatment of SI [12, 13, 14]. Evidence for the effects of common, mostly serotonergic, antidepressants on suicidality is mixed. Two major reviews on suicide prevention highlight that randomized controlled trials (RCTs) of antidepressants show no direct benefit and that antidepressants may cause or worsen suicidality in younger patients [4, 11, 15]. Epidemiological studies paint a different picture and suggest a beneficial effect of antidepressants on suicidality, although these studies cannot prove causation [4, 11]. Studies that bridge the knowledge gap between RCTs and epidemiological studies may shed light on the true effects of common antidepressants on suicidality and better guide clinical practice.

The Sequenced Treatment Alternatives to Relieve Depression (STAR*D) study remains the largest clinical trial in depression, with over 4000 patients consented at study outset [16]. This study was envisioned to blend rigorous clinical trial design with real-world generalizability through relaxed inclusion criteria and patient-guided treatment choices. The results from the STAR*D trial did not suggest that any particular antidepressant worked best for treating MDD, and it confirmed that with each additional failed medication trial the chance of treatment success diminishes [17, 18]. To date, there are no published reports on the effects of treatment on suicidality across all four levels of the STAR*D trial. Thus, we analyzed the STAR*D data to characterize predictors of change in suicidality in STAR*D participants. Given the relatively low levels of suicidality in this dataset, we chose to focus specifically on SI as opposed to suicidality in general. SI is a source of morbidity in itself, and also predicts suicide attempts; one cross-national estimate suggests that among those with SI there is a 29% conditional probability of making a suicide attempt [19]. SI is also thought to mediate the majority of risk for suicide attempts caused by psychiatric disorders [19, 20]. However, the prediction of who will attempt and complete suicide remains difficult, as these are rare events [20]. We also analyzed the effects of common antidepressants on SI in both TRD and nonresistant MDD. We hypothesized that younger age (≤25 years old) and TRD would correlate with higher SI. Based on previous results [21], we also hypothesized that lithium would reduce SI, and treatment with venlafaxine and mirtazapine would increase SI. Finally, we predicted that change in depressive symptoms would be correlated with changes in SI.

Materials and methods

Overall design

We performed a secondary analysis on the entire dataset of the STAR*D study (levels 1–4). Data were requested and granted with appropriate Data Use Certification from the NIMH Data Archive in 2018. All relevant demographic and clinical variables were extracted for analysis. The original study was approved by the institutional review boards of each participating study site, along with the Data Safety and Monitoring Board of the NIMH [16].

Participants

The STAR*D trial involved 14 regional centers and 41 treatment sites. Its eligibility criteria have been described in detail previously [17]. In brief, patients who provided informed consent were included if they were 18–75 years old and diagnosed with nonpsychotic MDD according to the DSM-IV with a baseline Hamilton Rating Scale for Depression (HRSD17) score ≥14 [16]. The study was designed to be inclusive and reflective of a real-world clinical population [16]. Importantly, patients with current suicidal ideation and ongoing risk for suicide were included if they were able to be treated as outpatients [22, 23]. Patients with past nonresponse or intolerance in their current depressive episode to treatments used in levels one or two of the study were excluded from the study.

Treatments by level

Patients were treated with an equipoise stratified randomized design to allow for treatment comparisons [17]. Table 1 summarizes the details of all available treatments organized by study level.

Table 1 Treatments by study level across the entire STAR*D study.

Outcome measures

In our analysis, SI was indexed by the suicide item (item three) of the HRSD17. Its scores range from 0 to 4: 0 (absent), 1 (feels life is not worth living), 2 (wishes he were dead or any thoughts of possible death to self), 3 (suicide ideas or gestures), 4 (attempts at suicide; any serious attempts). This item was measured at baseline of level one and at the endpoint of each level. The endpoint score of each level served as the baseline score of the subsequent level, in accordance with how the data were originally recorded and formatted. We chose the HRSD17 suicide item as the primary outcome because the HRSD17 was the primary measure in the original reports, with the Quick Inventory of Depressive Symptomatology Scales (QIDS) (self-rated [SR] and clinician-rated [C]) as the secondary outcome measures [16]. The HRSD17 suicide item has been used to measure SI in many other clinical trials [10, 24]. We also performed several follow-up exploratory analyses using the QIDS-SR and QIDS-C. In the primary analysis, we measured mean change in SI as indexed by the HRSD17 suicide item. Change in score was chosen over resolution of SI because the majority of patients in all levels, except level four, began the level with a baseline score of 0. Mean change in SI can capture both improvement and worsening of SI. When assessing the correlation between change in SI and change in depression severity, we used the total HRSD17 score minus the score of the suicide item, referred to as the HRSD16 score.
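The scoring described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the endpoint-minus-baseline sign convention, and the item ratings are all assumptions made for illustration, and the ratings are made up rather than taken from STAR*D.

```python
SUICIDE_ITEM = 3  # HRSD17 item number used to index SI (per the text)


def split_hrsd17(item_scores):
    """Return (si_score, hrsd16_score) from a list of the 17 HRSD17 item scores."""
    if len(item_scores) != 17:
        raise ValueError("expected exactly 17 HRSD17 item scores")
    si_score = item_scores[SUICIDE_ITEM - 1]  # Python lists are zero-based
    if not 0 <= si_score <= 4:
        raise ValueError("suicide item must be scored 0 to 4")
    # HRSD16 = total HRSD17 score minus the suicide item
    return si_score, sum(item_scores) - si_score


def change_scores(baseline_items, endpoint_items):
    """Endpoint minus baseline for SI and HRSD16.

    Assumed sign convention: negative values mean improvement.
    """
    si_0, dep_0 = split_hrsd17(baseline_items)
    si_1, dep_1 = split_hrsd17(endpoint_items)
    return si_1 - si_0, dep_1 - dep_0


# Made-up ratings for one hypothetical patient (not STAR*D data).
baseline = [2, 1, 2, 1, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 0, 1, 0]
endpoint = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0]

print(change_scores(baseline, endpoint))  # (-1, -9)
```

Removing the suicide item before correlating the two change scores avoids counting the same item on both sides of the correlation.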

Statistical analysis

Available raw data were organized and analyses carried out in SPSS (version 26) and R (version 3.6.0). We completed descriptive statistics for the baseline and endpoint of each study level. All statistical tests were two-tailed with α = 0.05. This threshold was then adjusted with a Bonferroni correction per analysis grouping (e.g., for t-test analyses of change in suicide item score, where four tests were performed: α = 0.0125). We calculated the Pearson correlation between change in SI (HRSD17 item 3) and change in overall depression (HRSD16) for each study level. We performed logistic regression analyses that included all baseline demographic factors and medication co-variates across all four study levels, with improvement or worsening of SI from baseline of the respective study level as the binarized outcome. To account for conditional nesting of patients across levels, we included the baseline and endpoint SI scores of each preceding timepoint in the logistic regression analysis. For medication comparisons on SI in levels 2–4, we grouped patients into those taking a specific medication versus all those not taking the medication in order to binarize outcomes by medication within that specific level. Last, we performed several additional exploratory analyses, including a replication of the primary logistic regression with patients separated into two groups: those with zero baseline SI scores and those with nonzero baseline SI scores.
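As a hedged illustration of the Bonferroni step, the sketch below divides the nominal α by the number of tests in a grouping and checks a few level-one p-values reported in the Results against the corrected threshold. The function names and the dictionary are hypothetical, and the strict "less than" comparison is an assumption; the original analyses were run in SPSS and R, not with this code.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def survives_correction(p_value, alpha=0.05, n_tests=4):
    """True if p_value is below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_alpha(alpha, n_tests)


# p-values for selected level-one predictors, as reported in the Results.
level_one_p = {
    "past suicide attempts": 0.007,
    "comorbid medical illness": 0.005,
    "family history of drug abuse": 0.008,
    "antisocial personality disorder": 0.042,
    "older age": 0.04,
}

for predictor, p in level_one_p.items():
    print(predictor, survives_correction(p))
```

With four tests per grouping the threshold is 0.05 / 4 = 0.0125, so only the first three predictors in this list remain significant, which matches the corrected results reported for level one.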

Results

Table 2 presents the baseline demographic and clinical variables available for the 3784 patients included in this analysis. Their mean (SD) age was 39.8 (±12.9) years; 1832 (72.6%) were female and the majority were white. The mean (SD) age of onset for the first major depressive episode (MDE) was 25.5 (±14.4) years. The mean (SD) number of past MDEs was 5.4 (±9.2), and 649 (16.4%) had a previous history of suicide attempt; 2074 (54.9%) had a family history of depression and 130 (3.4%) had a family history of suicide.

Table 2 Demographic variables of patients at baseline of the STAR*D study.

Overall, SI was relatively low. The only treatment level in which the majority of patients had SI (i.e., suicide item score >0) at baseline was level 4; in all other levels the majority of patients did not have any SI at baseline (i.e., suicide item score = 0). Table 3 presents the frequency of each HRSD17 suicide item score (i.e., 0, 1, 2, 3, 4) before and after each treatment level. Mean SI decreased across all four levels; however, the absolute amount of improvement was larger and more significant in the first two levels than in the last two levels, and the decrease in level 4 was not significant at α = 0.0125 (see Fig. 1).

Table 3 Count of suicide item scores (HRSD17 item 3: 0–4) at baseline and endpoint across all 4 study levels.
Fig. 1: Change in suicide item score for all depression scales across the STAR*D study.

Error bars reflect standard error. 1 denotes significance at p < 0.001. 2 denotes significance at p < 0.0125.

Please refer to Table 4 for results of the primary logistic regression analysis. These results showed that past suicide attempts (OR = 1.72, p = 0.007), comorbid medical illness (OR = 2.23, p = 0.005), unemployment, both while searching for employment (OR = 1.7, p = 0.04) and while not searching for employment (OR = 1.76, p = 0.02), Hawaiian or Pacific Islander ethnicity (OR = 5.48, p = 0.045), history of antisocial personality disorder (OR = 3.52, p = 0.042), and family history of drug abuse (OR = 1.69, p = 0.008) predicted worsening of SI across level one. Longer length of education (OR = 0.91, p = 0.036) and older age (OR = 0.99, p = 0.04) predicted lowering of SI in level one. In level two, treatment with bupropion (OR = 0.24, p < 0.001), buspirone (OR = 0.24, p = 0.001), sertraline (OR = 0.36, p = 0.02), and venlafaxine (OR = 0.34, p = 0.017), as well as unemployment while not searching for employment (OR = 0.46, p = 0.0499) and family history of bipolar disorder (OR = 0.33, p = 0.039), predicted lowering of SI. Cognitive treatment was not associated with either worsening or lowering of SI (OR = 0.91, p = 0.77) across level two. Diagnosis of an Axis I disorder (other) (OR = 2.33, p = 0.04), lack of an Axis II diagnosis (OR = 3.66, p = 0.045), male gender (OR = 1.92, p = 0.023), receiving Medicaid benefits (OR = 2.46, p = 0.04), and being widowed (OR = 4.05, p = 0.043) were associated with worsening of SI across level two. After correction for multiple comparisons (α = 0.0125), only past suicide attempts (OR = 1.72, p = 0.007), comorbid medical illness (OR = 2.23, p = 0.005), and family history of drug abuse (OR = 1.69, p = 0.008) in level one, and bupropion (OR = 0.24, p < 0.001) and buspirone (OR = 0.24, p = 0.001) treatment in level two, remained significant predictors. No treatment was associated with improvement in SI in levels three and four, even when the regression was modified to include only the medication co-variates.
When accounting for conditional nesting, higher baseline SI at the beginning of level 1 was associated with worsening SI across level 2 (OR = 1.87, p < 0.001), and higher SI at the end of level 1 was associated with lowering of SI across level 2 (OR = 0.56, p = 0.001). Higher baseline SI at level 1 was associated with worsening of SI across level 3 (OR = 1.91, p = 0.006), while higher SI at the end of level 2 was associated with lowering of SI across level 3 (OR = 0.37, p = 0.002). SI at the end of level 1 was not associated with changes in SI across level 3. Baseline SI at level 1, and SI at the end of levels 1, 2, and 3, were not associated with changes in SI across level 4. Please see Supplementary Table S1 for results of parallel logistic regression analyses with the QIDS-SR and QIDS-C scales.

Table 4 Significant predictors of suicidal ideation by treatment level.

Additional exploratory logistic regressions were completed. To assess outcomes of patients stratified by baseline level of SI, we repeated the above analysis across level 1 with patients separated into those with zero SI scores at baseline and those with nonzero SI scores at baseline. In comparison with the results from the unified sample above, additional significant co-variates were found as follows: those with baseline zero scores had an additional significant co-variate of Asian ethnicity (OR = 3.25, p = 0.032), and those with baseline nonzero scores had an additional significant co-variate of family history of suicide (OR = 6.18, p < 0.001), both predicting worsening of SI. Baseline HRSD16 score, at level one, did not predict presence of SI at level 3 (OR = 1.03, p = 0.198). Baseline SI (OR = 1.24, p < 0.001) and baseline depression severity, at level one, as indexed by the HRSD16 (OR = 1.06, p < 0.001), both independently predicted the odds of TRD (defined as receiving level three treatment and thereby having failed two full treatment trials). Every 1-point increase in baseline SI increased the odds of TRD by 24%. Every 1-point increase in baseline depression score increased the odds of TRD by 6%.
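The percentages above follow directly from the odds ratios: an OR of 1.24 per point corresponds to a (1.24 − 1) × 100 = 24% increase in odds. The short sketch below shows this arithmetic; the function name is hypothetical. The multi-point case assumes that the predictor entered the logistic model as a single linear term, so that a k-point increase multiplies the odds by OR^k; the source does not state this explicitly.

```python
def percent_change_in_odds(odds_ratio, points=1):
    """Percent change in odds for a `points`-unit increase in a predictor.

    Assumes the predictor is a linear term in a logistic regression,
    so that the odds are multiplied by odds_ratio ** points.
    """
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    return (odds_ratio ** points - 1.0) * 100.0


# Odds ratios for TRD reported in the text.
OR_BASELINE_SI = 1.24      # per 1-point increase in HRSD17 suicide item
OR_BASELINE_HRSD16 = 1.06  # per 1-point increase in HRSD16 score

print(round(percent_change_in_odds(OR_BASELINE_SI), 2))      # about 24
print(round(percent_change_in_odds(OR_BASELINE_HRSD16), 2))  # about 6
# Under the linearity assumption, a 2-point increase in baseline SI:
print(round(percent_change_in_odds(OR_BASELINE_SI, points=2), 2))
```

Note that these are changes in odds, not in probability; the two diverge as the outcome becomes more common.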

The correlation between change in SI and overall depression (HRSD16) was weak to moderate and significant throughout all levels of the study (level one: [N = 3758], r = 0.48, p < 0.001; level two: [N = 1029], r = 0.38, p < 0.001; level three: [N = 251], r = 0.31, p < 0.001; level four: [N = 77], r = 0.42, p < 0.001) (α = 0.0125).
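One way to read "weak to moderate" is through the squared correlation, which gives the proportion of variance that two change scores share in a linear sense. The sketch below is illustrative only: it includes a plain-Python Pearson coefficient for readers who want to see the computation, and it squares the r values reported above. The function and variable names are hypothetical, and no patient-level STAR*D data are used.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Correlations between change in SI and change in HRSD16, as reported.
reported_r = {
    "level one": 0.48,
    "level two": 0.38,
    "level three": 0.31,
    "level four": 0.42,
}

# Proportion of shared variance (r squared) at each level.
shared_variance = {level: r ** 2 for level, r in reported_r.items()}

for level, r2 in shared_variance.items():
    print(level, round(r2, 2))
```

Even at level one, where the correlation was highest, r² is roughly 0.23, so most of the variance in SI change is not accounted for by change in the remaining depressive symptoms.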

Discussion

We report outcomes on SI as indexed by the HRSD17 suicide item in the STAR*D trial, which is the largest sample of patients with MDD treated in a prospective clinical trial. SI decreased across all four treatment levels. However, this decrease became smaller with each successive level, suggesting that SI is associated with depression treatment resistance. Also, baseline SI predicted the presence of both SI and TRD later in the study. Baseline depression severity did not predict the presence of SI later in the study. In this adult sample, younger age was not correlated with worsening SI. Beyond level 2, no treatment was associated with improvement in SI, including lithium, an agent with known anti-suicidal properties. Change in SI and depression were correlated at all levels of the study, but the correlations were weak-to-moderate.

This report is the first published analysis of SI outcomes across the entire STAR*D trial. Previous studies have reported suicide-related outcomes in level one of STAR*D. One analysis identified baseline factors associated with previous suicide attempts: being older, female, a higher baseline HRSD17 score, and current suicidality [22]. Zisook et al. performed an analysis similar to ours using the suicide item of the QIDS-SR across level 1 of STAR*D [23]. They found that male gender, being currently treated in a psychiatric facility, and melancholic features were associated with worsening of suicidality. Also, those whose depression did not respond to citalopram in level 1 were more likely to have SI at baseline, had less improvement of their SI, and were more likely to have emergent suicidality [23]. These results, limited to level 1, are consistent with our overall findings of an association between SI and the presence of TRD. In our study, we found that level 4 patients, i.e., those with the highest degree of depression treatment resistance, also had the highest proportion of patients with SI at baseline. Similarly, those with SI at baseline were more likely to develop TRD than those without SI at baseline; this was independent of, and surpassed, the predictive power of depression severity at baseline. Overall, these findings confirm and extend existing evidence suggesting higher SI, and potentially suicide risk, in patients with TRD than in those with nonresistant depression [5].

Numerous studies over the past few decades have raised concerns over the potential for antidepressants to worsen suicidality [4, 11]. This clinical issue is complex and difficult to resolve, as different patient populations with varying illness severity take a wide variety of antidepressants. Both randomized clinical trials and epidemiological studies fail to capture the entirety of the clinical story in this situation. To address these issues, and to build on our results in the future, the effects of antidepressants on suicidality should be assessed in large, cohort-type studies in patients with MDD receiving standardized treatment within the context of integrated care pathways; the outcomes could be analyzed with novel techniques, such as machine learning [25, 26]. Such studies should incorporate measures of suicidality that are accessible, reliable, and easily integrated.

This report has multiple limitations. First, while STAR*D was designed to include patients with suicidality, the baseline level of SI was low and the primary outcome was remission from MDD, not SI. Also, the STAR*D study had a large sample and real-world generalizability, but it was an outpatient study. Thus, it did not reflect the often intense suicidality of inpatients with MDD. We used the suicide item of the HRSD17 to assess SI, and this may have limited our ability to detect subtle changes in suicidality. STAR*D was a randomized study but it was not double-blind, which may have influenced depression scale ratings. Finally, our analyses in levels 3 and 4 were limited by relatively small sample sizes, which may explain why we did not detect a significant suicide-protective effect for lithium.

The findings from this report confirm that common antidepressants have a beneficial effect on SI in patients with MDD. However, they appear to have less of an anti-suicidal effect in patients with TRD. SI itself can predict poor depression treatment response. Therefore, in patients with TRD and comorbid SI, alternative treatments with anti-suicidal properties should be considered early in the sequence of treatments. Structured psychotherapies, such as cognitive behavioral therapy (CBT) and dialectical behavior therapy (DBT), are recommended to prevent suicide in general [11, 27], with modest effect sizes for reduction of SI with CBT [28]. There is also evidence for DBT as a treatment for TRD itself [29], but not as a treatment specifically for SI in TRD. Evidence for these psychotherapy modalities mostly does not extend to the treatment of SI in the TRD patient population at this point. There is in fact very limited evidence to guide the choice of treatment in these difficult clinical situations [5]. Treatment of TRD and comorbid SI is often guided by extenuating clinical circumstances and by patient preference to avoid certain side effects. Evidence suggests that ECT remains the mainstay treatment [10, 11, 30], with lithium [6] and experimental ketamine [9, 31] as alternatives, and with other emerging, experimental treatments such as rTMS [14], and potentially psychedelic medicines [32], as focuses of future investigation.

Funding and disclosures

ZJD has received research and equipment in-kind support for an investigator-initiated study through Brainsway Inc and Magventure Inc. His work is supported by the Canadian Institutes of Health Research (CIHR), the National Institute of Mental Health (NIMH), Brain Canada and the Temerty Family and Grant Family and through the Centre for Addiction and Mental Health (CAMH) Foundation and the Campbell Institute. BHM holds and receives support from the Labatt Family Chair in Biology of Depression in Late-Life Adults at the University of Toronto. He currently receives research support from Brain Canada, the Canadian Institutes of Health Research, the CAMH Foundation, the Patient-Centered Outcomes Research Institute (PCORI), the US National Institutes of Health (NIH), Capital Solution Design LLC (software used in a study funded by CAMH Foundation), and HAPPYneuron (software used in a study funded by Brain Canada). He directly owns stocks of General Electric (less than $5000). Within the past 3 years, he has also received research support from Eli Lilly (medications for an NIH-funded clinical trial) and Pfizer (medications for an NIH-funded clinical trial). DMB receives research support from CIHR, NIH, Brain Canada and the Temerty Family through the CAMH Foundation and the Campbell Family Research Institute. He received research support and in-kind equipment support for an investigator-initiated study from Brainsway Ltd. He is the site principal investigator for one sponsor-initiated study for Brainsway Ltd. He also receives in-kind equipment support from Magventure for investigator-initiated studies. He received medication supplies for an investigator-initiated trial from Indivior. He has participated in one advisory board meeting for Janssen.
DK has received research support from an NSERC Discovery Grant and Discovery Accelerator Supplement, a McLaughlin Centre Accelerator Grant, the University of Toronto Connaught award, and the University of Toronto Mississauga Research and Scholarly Activity Fund. DY has received research support from a CANSSI Postdoctoral Fellowship. CRW, IH, and BJ report no financial relationships with commercial interests. There was no funding specific to the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.