After discussing important issues concerning the re-analyses of the FDA data on suicidal behavior in antidepressant (AD) trials by Kaminski and Bschor (2020) (henceforth KB) and Hengartner and Plöderl (2019) (henceforth HP), we decided to publish a collaborative response. We want to address several limitations of our publications and add information necessary for clarifying the controversial question of whether treatment with ADs is associated with increased suicide risk.
We agree that meta-analytic methods produce more accurate estimates than the crude estimates HP calculated based on the contingency table in the initial publication. Crude estimates could be exaggerated.
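To make concrete what a crude estimate is, the following minimal Python sketch computes an odds ratio from a single pooled 2x2 contingency table, ignoring which trial each event came from. The function name, the event counts in the usage example, the Wald-type confidence interval, and the 0.5 continuity correction are illustrative assumptions; they are not the data or the exact interval method used by HP.

```python
import math


def crude_odds_ratio(events_ad, n_ad, events_pbo, n_pbo, z=1.96):
    """Crude OR with a Wald-type CI from one pooled 2x2 table.

    All trials are collapsed into a single table, so differences
    between trials (size, duration, randomization ratio) are ignored.
    """
    a = events_ad            # events under AD
    b = n_ad - events_ad     # non-events under AD
    c = events_pbo           # events under placebo
    d = n_pbo - events_pbo   # non-events under placebo

    # Assumption: add 0.5 to every cell if any cell is empty,
    # a common continuity correction for rare events.
    if min(a, b, c, d) == 0:
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))

    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the call.
    or_, lo, hi = crude_odds_ratio(events_ad=30, n_ad=20000,
                                   events_pbo=2, n_pbo=10000)
    print(f"crude OR = {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

Because the pooled table treats all participants as if they came from one trial, unequal randomization ratios or trial durations can distort such an estimate, which is one reason meta-analytic methods that respect trial structure are preferable.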
We agree that several real-world studies based on large health data registries indeed found that suicide risk is highest in the first weeks after initiating treatment with ADs (e.g. Coupland et al., 2015; see online supplement for more references https://osf.io/qzjva/). Some recent studies found significantly increased suicide risk with ADs (Björkenstam et al., 2013; Coupland et al., 2015), but the findings of observational and ecological studies are inconsistent. Ecological studies in particular provide low quality evidence. They are prone to many biases, inadequately control for important confounders (most do not even attempt to control for potential confounders) and thus cannot demonstrate cause-effect relationships.
We agree that two placebo-suicides in the paroxetine trials need to be removed for a correct analysis. These two suicides occurred during the lead-in (washout) phase, that is, before randomization, as confirmed by official documents from GlaxoSmithKline (Davies, 2002). Crucially, this correction led to statistically significant results in most meta-analyses, especially when using the full data set including the fluoxetine and bupropion trials (Hengartner and Plöderl, 2019). However, it is also important to note that according to the same document (Davies, 2002), all paroxetine-suicides occurred in trials without placebo-control.
Another issue, addressed in vivid Twitter-discussions (e.g., https://tinyurl.com/yx267swz), is that the FDA-data we used comprised all phase II and III trials, including placebo-controlled trials, open-label (safety-extension) trials, and head-to-head trials (i.e. active-controlled trials without placebo arm). Critics claimed that nothing can be inferred from these data, because head-to-head trials would include more severely depressed patients and, consequently, participants with a higher baseline suicide risk. Although this is an important issue warranting further research, we would argue that suicidal patients are commonly excluded not only from placebo-controlled trials, but also from head-to-head trials. Moreover, baseline depression severity and dropout rates due to adverse events do not differ between placebo-controlled and head-to-head trials (Salanti et al., 2018). Finally, an early FDA-analysis by Laughren (2001) based exclusively on short-term placebo-controlled trials found suicide rates for AD- and placebo-arms that are very similar to our crude estimates, specifically, 0.10% vs. 0.02% compared to 0.12% vs. 0.02% in our corrected data table (Hengartner and Plöderl, 2019).
Given that the data were recorded as suicide attempts and suicides, KB emphasize the necessity of analyzing the two outcomes separately. HP and KB agree that the two adverse events are related but clinically distinct and could represent different sources of variance. On the other hand, HP and KB agree that, because these events are rare, pooling suicide attempts and suicides helps to detect a possible signal for a risk of suicidal behavior in AD trials.
We agree that the Bayesian analysis of HP produced skewed results because the sampling procedure included some extreme OR-values. KB used a Bayesian method that in part overcame the inherent limitations in HP’s approach. KB also used weakly informative priors, which are especially suitable for rare/missing events (Günhan et al., 2020; Kuss, 2015). We have now applied KB’s Bayesian approach to the corrected data (Table 1). For the suicide data, there were again occasional large OR-values in the posterior distributions, making the results somewhat unstable (see online supplement for details). Nonetheless, the resulting median showed increased risk for ADs in all analyses, with the 95% credible intervals always excluding the null-effect (OR = 1).
Table 1. Bayesian Analysis of Corrected Data (Median of ORs, 95% Credible Interval).
|Corrected data||Corrected data with fluoxetine and bupropion|
|Suicide attempts||Suicides||Suicides and suicide attempts||Suicide attempts||Suicides||Suicides and suicide attempts|
|Noninformative prior||1.7 (1.1 – 3.0)||3.7 (1.2 – 18)||2.0 (1.3 – 3.4)||1.9 (1.2 – 3.4)||3.9 (1.3 – 19)||2.1 (1.4 – 3.9)|
|Weakly informative prior||1.7 (1.1 – 3.0)||3.5 (1.2 – 15)||1.9 (1.2 – 3.4)||1.9 (1.2 – 3.4)||3.7 (1.2 – 17)||2.1 (1.3 – 3.9)|
|Very informative prior||1.7 (1.1 – 2.9)||2.9 (1.1 – 10)||1.9 (1.2 – 3.3)||1.8 (1.2 – 3.2)||3.1 (1.2 – 11)||2.1 (1.3 – 3.7)|
|Bayesian analysis HP||5.7 (1.4 – 427)||6.3 (1.6 – 366)|
|Crude analysis||2.4 (1.6 – 3.6)||5.5 (1.7 – 36.2)||2.7 (1.8 – 4.0)||2.5 (1.7 – 3.8)||5.9 (1.8 – 38.8)||2.5 (1.8 – 3.7)|
ORs > 1 mean that the suicide risk is higher with ADs compared to placebo.
Results are rounded to the first decimal or to the whole number for upper limits of the credible interval, because, for Bayesian analyses, the results slightly varied from simulation to simulation, especially for the suicide data (see online supplement for further information).
Noninformative prior: delta = 50000; weakly informative prior: delta = 250; very informative prior: delta = 15.
The crude analyses were based on the aggregated data tables, similar to Hengartner & Plöderl’s original letter.
The corrected data included the following ADs: citalopram, desvenlafaxine, duloxetine, escitalopram, levomilnacipran, mirtazapine, nefazodone, paroxetine, sertraline, trazodone ER, venlafaxine, venlafaxine ER, vilazodone, and vortioxetine.
We further agree that Bayesian analyses are sensitive to the choice of the prior distributions when data are scarce. When we used more or less informative priors, the credible intervals again always excluded the null-effect (Table 1). We provide an interactive web-based application that allows the interested reader to adjust the standard deviation of the prior and evaluate the results including diagnostic plots (https://jkaminski.shinyapps.io/antidepressants_suicide_minder/). Overall, we would consider an increased rate of suicide attempts, and possibly also suicides, among those treated with ADs a reliable finding in the Bayesian analysis.
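The sensitivity to the prior can be illustrated with a deliberately simplified sketch. It uses a conjugate normal approximation on the log-OR scale, which is not the model used by KB or HP, and the observed OR, its standard error, and the prior standard deviations below are hypothetical values chosen for illustration; they do not correspond to the delta values in Table 1.

```python
import math


def posterior_log_or(est_log_or, se, prior_sd, prior_mean=0.0):
    """Normal-normal update on the log-OR scale.

    The posterior mean is a precision-weighted average of the
    data estimate and the prior mean (0 = no effect, OR = 1).
    """
    w_data = 1.0 / se ** 2
    w_prior = 1.0 / prior_sd ** 2
    post_var = 1.0 / (w_data + w_prior)
    post_mean = post_var * (w_data * est_log_or + w_prior * prior_mean)
    return post_mean, math.sqrt(post_var)


def summarize(est_or, se, prior_sd):
    """Posterior median OR and approximate 95% credible interval."""
    mean, sd = posterior_log_or(math.log(est_or), se, prior_sd)
    return (math.exp(mean),
            math.exp(mean - 1.96 * sd),
            math.exp(mean + 1.96 * sd))


if __name__ == "__main__":
    # Hypothetical inputs: an observed OR of 4.0 with a large
    # standard error, as is typical when events are rare.
    for prior_sd in (10.0, 2.0, 0.5):
        median, lower, upper = summarize(est_or=4.0, se=0.7,
                                         prior_sd=prior_sd)
        print(f"prior SD {prior_sd:>4}: "
              f"OR {median:.2f} ({lower:.2f} to {upper:.2f})")
```

With scarce data the standard error is large, so a tighter prior centered on no effect pulls the posterior median toward OR = 1 and narrows the interval. Whether the interval still excludes 1 under a range of plausible priors is the kind of check that the prior comparison in Table 1 and the web application are meant to support.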
We also discussed that methodological biases should be considered for the interpretation of the results. One potential bias is related to the fact that many patients were on an AD before entering the trial, and randomizing them to the placebo group can induce withdrawal symptoms leading to an inflated suicide (attempt) risk in the placebo-arm. Naturalistic real-world studies have consistently shown that the first few weeks after stopping ADs are a period of increased suicide risk (Coupland et al., 2015). Another serious issue is the misclassification and misreporting of suicidal events in favor of ADs, as revealed by inspection of original industry documents (e.g., Sharma et al., 2016). On the other hand, the FDA data analyzed here consist not only of placebo-controlled trials but also include head-to-head and open-label trials. Attempts to separate the events from the different trials are complicated because the FDA reviews commonly do not report suicides and suicide attempts for each trial separately. We thus hope that the FDA makes a comprehensive dataset publicly available so that researchers can examine these issues.
We agree that we should be skeptical of relying on statistical significance or point-estimates in our analysis. We also need to be skeptical, given the rare occurrence of suicides in the trials, the sensitivity to the different meta-analytic procedures, the method biases, and the pooling of different trial designs. Nonetheless, the analyses consistently hint at an elevated risk for suicide attempts and, less reliably, also for suicides in cohorts of adults. This is remarkable for drugs that are used to treat depressive symptoms.
Martin Plöderl: Formal analysis, Writing – original draft, Writing – review & editing. Michael P. Hengartner: Writing – original draft, Writing – review & editing. Tom Bschor: Writing – review & editing. Jakob André Kaminski: Formal analysis, Writing – review & editing.
JK is supported by the Charité Clinician-Scientist Program of the Berlin Institute of Health.