
Clinical and Practical Psychopharmacology

A Primer on How to Critically Read an Observational Study on Adverse Medical Outcomes Associated With Long-Term Antidepressant Drug Use

Chittaranjan Andrade, MD

Published: December 7, 2022


Whether long-term antidepressant use predisposes to or protects against adverse medical outcomes is unclear. In this context, a recent retrospective cohort study found that, for example, at a 10-year follow-up, selective serotonin reuptake inhibitors lowered the risk of diabetes mellitus and hypertension but raised the risk of cerebrovascular disease, cardiovascular mortality, and all-cause mortality. The findings of this study were widely and uncritically covered in the lay and medical media with potential to adversely impact opinions about antidepressant treatment among patients, caregivers, and health care professionals. This article critically evaluates the study with a view to discussing its limitations and, more importantly, to arm the reader with skills to critically appraise other, similar studies. Concepts explained include confounding by indication, regression, and approaches to deal with confounding. Problems with the study identified and explained are incomplete adjustment for confounding, failure to correct for multiple hypothesis testing, the use of backward stepwise regression as a method of analysis, failure to consider reverse causation, and failure to remove death by suicide from analyses of all-cause mortality. Other limitations of the study are also discussed. A take-home message is that it is well established that depression is associated with substantial disability and risk of suicide and that antidepressant drugs treat depression and prevent relapse and recurrence; in contrast, no causal role for antidepressants in long-term adverse medical outcomes is established. Therefore, known long-term benefits with antidepressants must be weighed against unproven predispositions to long-term medical adverse effects in shared decision-making processes.

J Clin Psychiatry 2022;83(6):22f14733

To cite: Andrade C. A primer on how to critically read an observational study on adverse medical outcomes associated with long-term antidepressant drug use. J Clin Psychiatry. 2022;83(6):22f14733.

© 2022 Physicians Postgraduate Press, Inc.



The adverse effects of antidepressant drugs that are well described in terms of nature and frequency of occurrence are those that are common and those that appear early; such adverse effects are easily identified in randomized controlled trials (RCTs). As well-known examples, tricyclic antidepressants are associated with dry mouth and constipation, selective serotonin reuptake inhibitors (SSRIs) with nausea and anorgasmia, serotonin-norepinephrine reuptake inhibitors (SNRIs) with insomnia and increased blood pressure, and mirtazapine with increased sleep and appetite.1

Adverse effects of antidepressants that are less well described are those that are uncommon and those that appear late. These are seldom identified in RCTs because most RCTs are short in duration, do not systematically assess for uncommon events, and seldom have sufficiently large samples in which uncommon events may be detected. Examples of antidepressant adverse effects that are poorly described are emotional numbing, “brain zaps,” nightmares, and myoclonic jerks.

Finally, adverse effects of antidepressants that are poorly understood are possible predispositions to adverse medical outcomes of multifactorial origin, such as diabetes mellitus (DM), hypertension (HT), coronary heart disease (CHD), cerebrovascular disease (CVD), and mortality. Antidepressant RCTs do not have sufficiently large samples nor are they sufficiently long in duration for such predispositions to be identified. Whereas such predispositions can and have been examined in observational studies, it is almost impossible to control for confounding when assignment to antidepressant treated and untreated groups is not random; this is explained in a later section.

Scope of This Article

This article discusses a recent observational study2 on long-term adverse medical outcomes associated with antidepressant drug treatment. The authors of the study concluded in the abstract of their paper and elsewhere in their text that their findings indicated an association between long-term antidepressant use and elevated risks of CHD, cardiovascular (CVS) mortality, and all-cause mortality. These conclusions will trouble patients and medical professionals alike because long-term antidepressant use is frequently necessary in patients with conditions such as anxiety, depression, and obsessive-compulsive disorder. The findings of the study and the conclusions of the authors were widely disseminated through both lay and medical mass media with little to no attention drawn to the limitations of the study.

The present article provides a critical appraisal of the study with a view to help readers gain skills in the identification of common limitations of observational research as well as limitations specific to the study in point. It is hoped that these skills will be used to address the reader’s own doubts in addition to doubts and concerns expressed by patients and by other health care professionals who read this study and other, similar, observational studies.

Before proceeding further, readers are encouraged to form their own impressions after going through at least the abstract if not the full text of the study.2 Readers are also encouraged to examine the supplementary materials to the study; there is an unexpected twist to the story that will be found there. The study was published as an open access article and the full text is available free to all.

Brief Background

Depression may predispose to adverse medical outcomes through disturbances in sleep and appetite that result in changes in physical activity levels and body weight; through smoking, drinking, and use of illicit substances; through other unhealthy ways of coping with stress; through poor adherence to medical advice and prescriptions for medical conditions; through autonomic, immune, and inflammatory dysfunction; and through other mechanisms.3 Logically, therefore, antidepressant drugs should reduce depression-related risk of adverse medical outcomes by resolving depression.

Unfortunately, antidepressant drugs do not always resolve depression. Furthermore, habits or physical changes or medical conditions that develop during depression do not necessarily reverse when depression remits. Consequently, even remitted depression may be associated with risk factors that predispose to adverse medical outcomes. So, because antidepressant drugs are continued even in patients who show partial response, and beyond remission in patients with remitted depression, and in the long-term in patients with recurrent depression, antidepressant drugs are associated by proxy with these depression-related risk factors and hence with adverse medical outcomes. This is known as confounding by indication: the association between antidepressant use and adverse medical outcomes is confounded by the indication for which antidepressants are prescribed. Simply stated, depression is the confounding variable that may explain why antidepressant exposure may be linked to adverse medical outcomes.
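Confounding by indication can be made concrete with a small simulation. The sketch below is illustrative only: the cohort, the prevalence of depression, the prescribing probabilities, and the outcome risks are all made-up assumptions, and the drug is deliberately given no effect on the outcome. Even so, the crude comparison of treated vs untreated persons suggests that the drug is harmful, whereas the comparison within strata of depression does not.

```python
import random

def simulate_cohort(n=100_000, seed=0):
    """Simulate a hypothetical cohort in which the drug has NO effect on the outcome."""
    rng = random.Random(seed)
    counts = {}  # (depressed, treated) -> [number of people, number of events]
    for _ in range(n):
        depressed = rng.random() < 0.20
        # Assumption: depressed persons are far more likely to receive the drug.
        treated = rng.random() < (0.50 if depressed else 0.02)
        # Assumption: outcome risk depends on depression only, never on treatment.
        event = rng.random() < (0.10 if depressed else 0.05)
        cell = counts.setdefault((depressed, treated), [0, 0])
        cell[0] += 1
        cell[1] += event
    return counts

def risk(counts, treated, depressed=None):
    """Risk of the outcome in a treatment group, optionally within one depression stratum."""
    people = events = 0
    for (d, t), (n, e) in counts.items():
        if t == treated and (depressed is None or d == depressed):
            people += n
            events += e
    return events / people

counts = simulate_cohort()

# Crude comparison: depression is ignored, so the drug "inherits" the risk of depression.
crude_rr = risk(counts, True) / risk(counts, False)

# Stratified comparison: within each stratum the drug shows roughly no association.
rr_depressed = risk(counts, True, True) / risk(counts, False, True)
rr_not_depressed = risk(counts, True, False) / risk(counts, False, False)

print(f"Crude risk ratio:            {crude_rr:.2f}")
print(f"Risk ratio, depressed:       {rr_depressed:.2f}")
print(f"Risk ratio, not depressed:   {rr_not_depressed:.2f}")
```

In this toy example depression is measured perfectly, so stratification removes the spurious association; in real data, depression and its severity are seldom measured this well.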

But antidepressants cannot be so easily exonerated. Some antidepressants, such as mirtazapine, may increase sleep and appetite and predispose to the cardiometabolic syndrome through decreased physical activity and weight gain. Some antidepressants, such as the SNRIs, may directly increase heart rate and blood pressure and predispose to cardiovascular disease. Some antidepressants, such as the tricyclics, may block ion channels and predispose to cardiac arrhythmias. Some antidepressants may display adverse drug interactions, diminishing the effects of drugs used to treat CVS disease. Other mechanisms are also described.3

How can we separate the risks associated with depression from those associated with antidepressant use? The best way is to perform an RCT in which depressed patients are randomized to receive antidepressant drug treatment vs, say, cognitive-behavior therapy with follow-up for, say, 5–10 years during which time a sufficient number of medical events would have occurred for meaningful statistical analysis. Unfortunately, such a study would be logistically challenging to conduct. Researchers therefore employ observational designs. For example, they may conduct case-control studies of patients with and without an adverse outcome to determine who had vs had not been exposed to antidepressant drugs. Or, they may conduct cohort studies in which patients exposed vs unexposed to antidepressant drugs are followed for many years to determine in whom adverse outcomes do vs do not occur. Data for such observational studies are most commonly extracted from insurance, health care, national register, or other electronic databases.4,5

Addressing Confounding and Other Unbalanced Risks in Observational Studies

In RCTs, at baseline, the process of randomization tends to balance between antidepressant-treated and untreated groups not only risk factors for the adverse outcome that are related to the diagnosis (confounding variables) but also all other risk factors, some of which may have been measured and recorded in the study, some of which may not have been measured, and the rest of which are unknown.6 So, at the RCT endpoint, it is reasonably valid to directly compare the incidence of the adverse outcome between antidepressant-treated and untreated groups.

In observational studies, participants are not randomized to the groups of interest. So, at baseline, risk factors are not balanced between the groups of interest as they are in RCTs. Researchers use one or more ways to deal with this problem. A method invariably employed is to statistically adjust for measured risk factors using regression analysis (discussed later). For example, the effect of antidepressant vs no antidepressant exposure on an adverse medical outcome can be examined after adjusting for age, sex, socioeconomic status, diet, level of physical activity, cardiometabolic characteristics at baseline, and so on. Note that how good the statistical adjustment is will depend on how well these risk factors are measured. So, if baseline smoking is operationalized as smoker vs nonsmoker (rather than measured in terms of the number of cigarettes smoked per day and years of smoking), the statistical adjustment for baseline smoking will be poor. Note also that risk factors that are not measured and risk factors that are not known cannot be adjusted for, making the adjustment for extraneous risk factors even poorer in observational studies.

A better way of addressing risk factors that are unbalanced between antidepressant-treated vs untreated groups is to perform propensity score matching wherein, regardless of actual grouping, regression analysis is used to identify characteristics of study participants who did vs did not receive antidepressant drugs. The results of the regression can then be used to produce a number (the propensity score) that tells us how likely each study participant is to belong to the treated vs untreated group regardless of the actual treatment status. In the next step, each participant who received antidepressant treatment is matched with a participant who did not receive the antidepressant but who had the same or closely similar propensity score as the participant who did receive the antidepressant. The sample size drops considerably because many treated vs untreated participants may have propensity scores that cannot be matched. In the final step, the matched participants are compared for risk of the adverse outcome. Unfortunately, although propensity score matching helps balance (between treatment groups) measured risk factors for the adverse outcome, it will not properly balance for inadequately measured risk factors, and it cannot balance for unmeasured and unknown risk factors. A further problem is that those who are matched may not be representative of their respective groups.7
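The steps of propensity score matching described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any particular study: the cohort is simulated, a single hypothetical covariate named severity drives both treatment and outcome, the drug has no true effect, and the propensity score is estimated as the observed treated fraction in each severity stratum (with many covariates, a regression model would normally be fitted instead). Matching is greedy, 1:1, without replacement, within a caliper.

```python
import random
from collections import defaultdict

rng = random.Random(1)

# Hypothetical cohort: 'severity' (0-4) raises both the chance of treatment
# and the risk of the outcome; treatment itself has no effect.
cohort = []
for _ in range(40_000):
    severity = rng.randrange(5)
    treated = rng.random() < 0.05 + 0.10 * severity
    event = rng.random() < 0.02 + 0.03 * severity
    cohort.append({"severity": severity, "treated": treated, "event": event})

# Step 1: estimate the propensity score, P(treated | covariates).
tally = defaultdict(lambda: [0, 0])  # severity -> [people, treated]
for person in cohort:
    tally[person["severity"]][0] += 1
    tally[person["severity"]][1] += person["treated"]
for person in cohort:
    people, n_treated = tally[person["severity"]]
    person["ps"] = n_treated / people

# Step 2: match each treated person to an unused control with a similar score.
treated_group = sorted((p for p in cohort if p["treated"]), key=lambda p: p["ps"])
control_group = sorted((p for p in cohort if not p["treated"]), key=lambda p: p["ps"])
CALIPER = 0.01  # maximum allowed difference in propensity score (an assumption)
pairs = []
j = 0
for t in treated_group:
    # Skip controls whose score is too low to match this or any later treated person.
    while j < len(control_group) and control_group[j]["ps"] < t["ps"] - CALIPER:
        j += 1
    if j < len(control_group) and control_group[j]["ps"] <= t["ps"] + CALIPER:
        pairs.append((t, control_group[j]))
        j += 1  # each control is used at most once

# Step 3: compare the risk of the outcome before and after matching.
def risk(group):
    return sum(p["event"] for p in group) / len(group)

crude_rr = risk(treated_group) / risk(control_group)
matched_rr = risk([t for t, _ in pairs]) / risk([c for _, c in pairs])

print(f"Cohort size: {len(cohort)}, matched sample size: {2 * len(pairs)}")
print(f"Crude risk ratio:   {crude_rr:.2f}")
print(f"Matched risk ratio: {matched_rr:.2f}")
```

The matched sample is much smaller than the cohort, and the matching balances only the covariate that went into the score; a risk factor that was not measured would remain unbalanced.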

As a specific way of addressing confounding by indication, adverse medical outcomes can be examined only in depressed patients, comparing those who did vs did not receive antidepressant treatment; unfortunately, the problem now shifts from confounding by indication to confounding by indication severity because more severely depressed patients are more likely to have received antidepressant drugs and because more severe depression may predispose more greatly to worse medical outcomes.

Another way of specifically addressing confounding by indication is to examine whether, among depressed patients treated with antidepressant drugs, higher doses are associated with greater predisposition to adverse outcomes; unfortunately, because higher doses are more likely to be prescribed to patients with more severe depression, and to those who respond poorly to treatment, the problem of confounding by indication severity remains.

These approaches to deal with unbalanced risk factors are often combined. There are other ways, also, to address confounding by indication. Nevertheless, nothing can adjust for risk factors that are inadequately measured, unmeasured, and unknown. To this extent, the results of observational studies must be viewed with caution; significant findings should be regarded as associations between risk factors and outcomes, and not as cause-effect relationships.

Risk Factors, Covariates, and Confounders

A risk factor is any variable that increases the risk of an outcome. So, depression is a risk factor for suicide. A covariate is any variable that is adjusted for in an analysis, as explained in the previous section. A confounding variable is any variable that influences both grouping variable and outcome, resulting in a false association between the grouping variable and outcome; this was also explained in an earlier section.

In real life, things are messy. Depression may be a risk factor for suicide, but the risk may really be driven by mediating variables (such as anxiety, hopelessness, and external stressors that cause or result from depression) not all of which may be measured even if the severity of depression is measured. Poor dietary habits and alcohol intake may be driven by depression but may also be socioeconomically or culturally driven; they are therefore both confounding variables and independent risk factors for the relationship between antidepressant exposure and adverse medical outcomes. Because precise relationships can be difficult to disentangle, all variables being adjusted for are sometimes lumped into a single category that is referred to as “covariates” by some authors or “confounders” by others.

Previous Research

Coming to the study by Bansal et al,2 this study was not conducted in a vacuum; many previous studies have examined adverse medical outcomes associated with antidepressant drug use. For example, in a systematic review and meta-analysis, Wang et al8 found that, among antidepressants, only the tricyclics were associated with an increased risk of new-onset DM. In another systematic review and meta-analysis, SSRI exposure was found to be associated with an increased risk of new-onset ischemic as well as hemorrhagic stroke.9 A meta-analysis by Maslej et al10 found that antidepressants were associated with an increased risk of adverse CVS events as well as all-cause mortality, but, surprisingly, not in patients with preexisting CVS disease; a possible interpretation is that confounding may drive the association in persons without preexisting CVS disease.

These and other studies suggest that there are indeed statistically significant associations between antidepressant exposure and adverse medical outcomes. What is unclear is whether the associations are cause and effect or driven by other risk factors. This sets the stage for an examination of the study by Bansal et al.2

Bansal et al (2022): Methods and Results

Bansal et al2 identified 222,121 subjects in the UK Biobank, aged 40–69 (median, 58) years, none of whom were using psychotropic or cardiometabolic drugs and none of whom had a cardiometabolic diagnosis at baseline. They2 recorded details about antidepressant initiation, dosing, and total exposure. They followed patients and assessed adverse medical outcomes after 5 and 10 years of follow-up. At each of these 2 time points, they assessed whether all antidepressants, (only) SSRIs (most commonly citalopram), and (only) other antidepressants (most commonly mirtazapine) were associated with new-onset DM, HT, CHD, and CVD or with CVS mortality and all-cause mortality. They additionally examined the association of average daily dose with adverse medical outcomes. In this context, average daily dose was operationalized in terms of defined daily dose units and was classified as low, intermediate, and high.

In various analyses, based on the outcome examined and the follow-up time point, antidepressants were observed to have been prescribed for 6%–8% of the sample. There were about 120,000–150,000 subjects in each analysis. The data were analyzed using backward stepwise Cox proportional hazards regression. The analyses were adjusted for confounding variables that included age, sex, sociodemographic and socioeconomic variables, smoking status, alcohol intake status, physical activity, body mass index, waist-hip ratio, parental history of the medical outcomes of interest, cardiometabolic laboratory parameters, and concurrent illness or disability. Ethnicity, an important risk factor, was not included for adjustment presumably because 96% of the sample was White.

Important findings from the study are summarized in Table 1, Table 2, and Table 3. The study methods and findings are critically examined in the sections that follow.

Contradictory Findings as a Red Flag

The study2 threw up many confusing and even contradictory findings. For example, at 10 years (Table 2), SSRIs were associated with a significantly reduced risk of DM and HT and with a significantly increased risk of CVD, CVS mortality, and all-cause mortality. These results are perplexing. DM and HT are major risk factors for CVD, CVS mortality, and all-cause mortality, and so reduction in the risk of the former should have resulted in reduction, not increase, in the risk of the latter. Superficially, it may seem that these contradictory findings may have occurred in different sets of patients; however, if so, there should have been a decrease in the risk of CVD and CVS/all-cause mortality in patients in whom the risk of DM and HT decreased, cancelling the increase in risk in the other patients, leading to no increase in the net risk.

There are 5 serious statistical and methodological concerns in this study, any or all of which may explain the contradictory findings. These concerns are incomplete adjustment for confounding, failure to correct for multiple hypothesis testing, the use of backward stepwise regression as a method of analysis, failure to consider reverse causation due to the study of multiple related outcomes, and failure to remove death by suicide from analyses of all-cause mortality. Each of these concerns, in other avatars, may also be relevant to other observational studies. Each of these is considered in turn.

Incomplete Adjustment for Confounding

In the study,2 the groups compared were formed from participants who had vs had not received antidepressant drugs; participants had not been randomized to their respective groups, and the groups were not shown to be comparable for psychiatric diagnoses. Thus, the study findings are suspect because of possible confounding by indication. The analyses of dose-dependent effects (Table 3) did not help because they may have been confounded by severity of depression, and also because, in these analyses, other perplexing results were discovered, such as that SSRIs significantly reduced the risk of DM and increased the risk of CVS mortality, but only at subtherapeutic doses. Through what mechanism could these findings have possibly arisen?

Some key risk factors were perhaps inadequately measured. As an example, smoking was operationalized as “ever smoked,” yes or no; such operationalization does not capture the magnitude and duration of the risk factor. Other risk factors were not measured and adjusted for; these included variables such as baseline dietary characteristics and LDL-cholesterol level. So, the findings of the study could have been rendered spurious because of inadequately measured, unmeasured, and unknown confounds.

Multiple Hypothesis Testing

When studies have a large number of objectives, it is usual to set one as the primary objective and the rest as secondary objectives; this reduces the risk of a Type 1 (false positive) statistical error associated with multiple hypothesis testing.11 If no primary objective is stated and a large number of exploratory statistical analyses are run, it is desirable to correct for multiple hypothesis testing.12 In the study by Bansal et al,2 even if the reader examines only the fully adjusted models, there are a staggering 90 regressions for which results are presented. No correction for multiple hypothesis testing was applied. It is possible that many of the significant results were false positive results, and that the contradictory significant findings were merely extreme results, in both directions, arising in a random distribution of P values.
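The arithmetic behind this concern can be sketched in a few lines. The calculation below assumes, purely for illustration, that all 90 null hypotheses are true and that the 90 tests are independent; in the study the outcomes and drug groups overlap, so the tests are correlated and the numbers are only a rough guide. The Bonferroni threshold is shown as one common correction, not as the correction the study ought to have used.

```python
ALPHA = 0.05   # conventional per-test significance threshold
N_TESTS = 90   # number of fully adjusted regressions cited in the text

# Probability of at least one false positive across all tests
# (assumes every null hypothesis is true and the tests are independent).
familywise_error = 1 - (1 - ALPHA) ** N_TESTS

# Number of "significant" results expected by chance alone.
expected_false_positives = ALPHA * N_TESTS

# Bonferroni-corrected per-test threshold that keeps the family-wise error near ALPHA.
bonferroni_threshold = ALPHA / N_TESTS

print(f"Family-wise error rate:      {familywise_error:.3f}")
print(f"Expected false positives:    {expected_false_positives:.1f}")
print(f"Bonferroni per-test P value: {bonferroni_threshold:.5f}")
```

Under these assumptions, at least one false positive is almost certain (probability about 0.99), and 4 or 5 nominally significant results would be expected even if antidepressants had no association with any outcome.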

Regression: A Brief Background

A brief note on regression is presented as a prelude to the discussion on the backward stepwise procedure applied in the study.2 Regression is a statistical analysis that quantifies the relationship between variables. For example, using available data, linear regression can be used to derive a simple equation that predicts what the height of a child may be given the value of the child’s age.
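As an illustration of the age and height example, the sketch below fits a straight line by ordinary least squares. The ages and heights are made-up numbers chosen only to show the mechanics; they are not growth reference data.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: age in years and height in centimeters for 5 children.
ages = [2, 4, 6, 8, 10]
heights = [86, 102, 115, 128, 139]

slope, intercept = fit_line(ages, heights)

# Use the fitted equation to predict the height of a 7-year-old child.
predicted_height_at_7 = intercept + slope * 7

print(f"height = {intercept:.1f} + {slope:.1f} * age")
print(f"Predicted height at age 7: {predicted_height_at_7:.1f} cm")
```

With these made-up values, the fitted equation is height = 74.4 + 6.6 × age, which predicts a height of 120.6 cm at 7 years of age.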

Multivariable linear regression is a little more complicated. It uses many independent variables to predict the value of a single dependent variable. For example, using available data, an equation can be developed that uses age, sex, ethnicity, and socioeconomic status to predict what the height of a child may be. If we are only interested in the effect of age on height, then age is the variable of interest and the rest of the variables are the independent variables that are “adjusted for.”

Linear regression is used when the outcome variable is a continuous variable. Height, systolic blood pressure, and Hamilton Depression Rating Scale scores are examples of variables that are treated as continuous; they are measured along a continuum rather than in categories. Logistic regression is used when the outcome variable is dichotomous, as in adverse medical outcome “happened” vs “did not happen.” Cox proportional hazards regression is used to factor in the time of occurrence of the dichotomous outcome, as in how early during the course of the study the outcome occurred, if it did occur.

Multivariable regression can be conducted in many ways. For example, all the independent variables can be used to predict the dependent variable regardless of the statistical strength of association between each of the different independent variables and the single dependent variable. This is the ideal method provided that there is a priori reason to believe that the independent variables entered into the regression have the potential to influence the dependent variable. Forward and backward stepwise regression are other ways in which multivariable regression can be conducted. There are other methods, too; these are not discussed here.

In forward stepwise regression, the independent variable with the strongest statistical association with the dependent variable is entered first into the equation. After reexamining relationships between the remaining independent variables and the remaining variance in the dependent variable, the independent variable that now has the strongest statistical association with the dependent variable is entered next into the equation. The procedure is repeated until (whichever happens first) either all the independent variables are in the equation or no more independent variables are significantly associated with the remaining variance in the dependent variable.

In backward stepwise regression, first all the independent variables are entered into the equation, and then variables are removed, one at a time, starting with the variable with the weakest association with the dependent variable. Along the way, variables that have been dropped are reexamined to see if they should be put back into the equation. This continues until (whichever is first) all the variables are removed or no more variables can be removed because all the remaining variables significantly explain the dependent variable and removal of any one of them will decrease the explanatory power of the regression equation.

Note that regression can never establish cause and effect. For example, children who have more teeth have a more extensive vocabulary, but that does not mean that a child’s vocabulary depends on teeth. So, all that regression may establish is an association; for example, between number of teeth and vocabulary, or between long-term antidepressant drug exposure and certain medical outcomes.

The Problem With Stepwise Procedures

Bansal et al2 used backward stepwise regressions to adjust for risk factors in their study of the relationship between antidepressant exposure and adverse medical outcomes. The use of stepwise regression is poor science12,13 because, for reasons that are beyond the scope of this article to explain, stepwise regression involves a large number of iterative processes (running in the background, unknown to users of statistical software) that attempt to progressively improve the fit, leading to a vastly inflated Type 1 (false positive) error risk. Readers may consider the analogy below.

Imagine that we are looking at a cloud. We remove a little fluff from here and a little fluff from there, and, soon, hey, presto, we see a dragon in that cloud. If we find that a piece that we took out earlier should be put back to make the cloud more dragon-like, we do so and the dragon looks even more real. Likewise, in backward stepwise regression, when we, little by little, remove the inconvenient variables that are not statistically significant, what is left behind is something that looks a lot more like a dragon than it originally did.

So, why is this a problem? Perhaps there truly is a dragon hidden in that cloud. The answer is that the cloud is the sample that represents the population and not the population itself. When removing fluff from the cloud, we emphasize the peculiarities of the cloud, creating a dragon that we would find only in that cloud and not necessarily in other clouds that represent the population, nor in the population itself. In technical words, the variables that were entered in the regressions by Bansal et al2 were not fluff. They had been entered because there was a priori reason to expect that they were meaningful confounders. If they were removed as fluff merely because in the analyzed datasets they looked like fluff, then that was an unwitting distortion of the reality that was peculiar to their datasets, resulting in the creation of models, peculiar to their datasets, in which variables showed stronger fit than was true in the population.
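The way in which data-driven selection of variables inflates the false positive rate can be shown with a deliberately simplified simulation. The sketch below is not the backward stepwise Cox regression used in the study; it illustrates only one ingredient of stepwise procedures, namely, choosing the predictor that looks strongest in the sample and then testing it as if it had been chosen in advance. All data are random noise, so every "significant" result is a false positive. The sample size, the number of candidate predictors, and the number of simulations are arbitrary assumptions.

```python
import math
import random

N_SUBJECTS = 50
T_CRIT = 2.011  # approximate two-sided 5% critical value of t for 48 degrees of freedom
# Equivalent critical value for the absolute Pearson correlation.
R_CRIT = T_CRIT / math.sqrt(T_CRIT ** 2 + (N_SUBJECTS - 2))

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def false_positive_rate(n_candidates, n_sims=1000, seed=0):
    """Fraction of simulated samples in which the strongest of n_candidates
    pure-noise predictors is 'significantly' related to a pure-noise outcome."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        outcome = [rng.gauss(0, 1) for _ in range(N_SUBJECTS)]
        strongest = 0.0
        for _ in range(n_candidates):
            predictor = [rng.gauss(0, 1) for _ in range(N_SUBJECTS)]
            strongest = max(strongest, abs(pearson_r(predictor, outcome)))
        hits += strongest > R_CRIT
    return hits / n_sims

rate_single = false_positive_rate(1)     # one prespecified predictor
rate_selected = false_positive_rate(10)  # best-looking of 10 candidates

print(f"False positive rate, prespecified predictor: {rate_single:.2f}")
print(f"False positive rate, strongest of 10:        {rate_selected:.2f}")
```

With a single prespecified predictor, the false positive rate stays near the nominal 5%; when the strongest of 10 noise predictors is picked from the sample, the rate is several times higher. This is the dragon that exists only in the cloud.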

Reverse Causation Due to the Study of Multiple Related Outcomes

The 6 adverse medical outcomes studied by Bansal et al2 are all related. For example, a patient with DM is at higher risk of CHD, and a patient with CHD is at higher risk of CVS mortality and all-cause mortality. Patients with DM and depression may be prescribed an SSRI because SSRIs are not expected to adversely impact glycemic control. Patients with CHD and depression are commonly prescribed SSRIs because SSRIs may improve CVS outcomes.3 So, if SSRIs were associated with an increased risk of CVD, CVS mortality, and all-cause mortality (Table 2), it may be because these drugs were preferred for patients who were already at increased risk of these adverse medical outcomes because they had developed DM or CHD. Thus, SSRIs would have been associated with the adverse medical outcomes by reverse causation; that is, SSRIs did not “cause” the adverse outcome, but awareness of the risk of the adverse outcome “caused” the preference for SSRIs over other antidepressants.

This problem could easily have been avoided had the authors excluded, from the analysis of each outcome, all study participants who had already experienced one of the other outcomes.

Suicide as Part of All-Cause Mortality

As explained earlier in this article in the discussion on confounding by indication, depression may be associated with behaviors that predispose to adverse outcomes and antidepressants appear associated with these adverse outcomes only because antidepressants are used to treat depression. In this context, suicidal ideation and behavior resulting from depression may have spuriously inflated the risk of all-cause mortality associated with antidepressant treatment. Bansal et al2 did not exclude suicide from all-cause mortality in their analyses.

Was This Study Really About Long-Term Antidepressant Treatment?

As a final twist in the story, the authors of the study2 repeatedly emphasized in various places in their abstract and text that their study was about long-term antidepressant use, but nowhere in their abstract or text did they state the mean or median duration of antidepressant exposure. The reader, by default, will therefore assume that the 5- and 10-year data refer to durations of antidepressant exposure. However, this cannot possibly be so because, in their supplementary materials, the authors presented sensitivity analyses that excluded an unstated number of subjects with short-term (< 90 days) antidepressant use. If the main analyses included subjects with < 90 days of antidepressant exposure, the main analyses cannot possibly have examined adverse outcomes associated with long-term antidepressant treatment.

Take-Home Message

Depression is associated with risk of suicide and with impairment in subjective well-being, quality of life, and work performance. Depression is also associated with changes in lifestyle behaviors that are adverse to physical and mental health. The place of antidepressants in the treatment of depression and in the prevention of relapse and recurrence is well-established. A causal role for antidepressants in long-term adverse medical outcomes is not established. Mental health professionals need to keep all of this in mind in discussions with patients and in shared decision-making processes.

Acknowledgments: This manuscript benefited from helpful suggestions provided by David L. Streiner, PhD, FCAHS, CPsych (Ret.), Emeritus Professor, McMaster University, Department of Psychiatry & Behavioural Neurosciences; Professor, University of Toronto, Department of Psychiatry; Fellow, Canadian Academy of Health Sciences, Canadian Psychological Association, American Psychological Association, and Society for Personality Assessment.

Each month in his online column, Dr Andrade considers theoretical and practical ideas in clinical psychopharmacology with a view to update the knowledge and skills of medical practitioners who treat patients with psychiatric conditions.

Department of Clinical Psychopharmacology and Neurotoxicology, National Institute of Mental Health and Neurosciences, Bangalore, India.
