Direct Comparison of the Psychometric Properties of Multiple Interview and Patient-Rated Assessments of Suicidal Ideation and Behavior in an Adult Psychiatric Inpatient Sample

J Clin Psychiatry 2015;76(12):1676–1682

Objective: Compare the accuracy, agreement, internal consistency, and interrater reliability of 3 interviews to assess suicidal ideation and behavior in accordance with US Food and Drug Administration guidance about reporting categories.

Method: Adults admitted to a psychiatric inpatient unit (N = 199) completed 3 assessments of past-month and lifetime suicidal ideation and behavior in randomized, counterbalanced order: the Columbia Suicide Severity Rating Scale (C-SSRS), the Suicide Tracking Scale (STS), and the Sheehan Suicidality Tracking Scale (S-STS). “Missing gold standard” latent class analyses defined categories for ideation and behavior. Analyses also evaluated how the S-STS maps to C-SSRS categories. Three trained judges re-rated 89 randomly selected interview videotapes. Cohen κ, the primary outcome measure, quantified agreement above chance. Data were collected between November 2011 and June 2013.
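
For readers unfamiliar with the primary outcome measure, the following is a minimal sketch of how Cohen κ corrects observed agreement for the agreement expected by chance. It is an illustration only, not the authors' analysis code; the function name `cohen_kappa` and the example ratings are hypothetical and do not come from the study data.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters who classified the same cases.

    kappa = (p_observed - p_expected) / (1 - p_expected)
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("Both raters must rate the same, non-empty set of cases.")

    n = len(ratings_a)

    # Proportion of cases on which the two raters agree.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Agreement expected by chance, from each rater's marginal category rates.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if p_expected == 1.0:
        # Both raters used a single identical category; kappa is undefined.
        return float("nan")

    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical ratings (1 = ideation present, 0 = absent) for 10 cases.
rater_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
rater_2 = [1, 1, 0, 0, 0, 0, 1, 0, 1, 0]
kappa = cohen_kappa(rater_1, rater_2)
```

In this made-up example the raters agree on 8 of 10 cases (0.80), chance agreement is 0.52, and κ is therefore about 0.58, which shows why κ can be noticeably lower than raw percent agreement.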

Results: All 3 assessments showed excellent accuracy for suicidal ideation (κ = 0.72 to 1.00) and attempts (κ = 0.82 to 0.95) when calibrated against the latent classes. Interrater agreement ranged from κ = 0.52 to 1.00. Interrater agreement on the more granular C-SSRS categories varied more widely (κ = 0.48 to 1.00), and the C-SSRS and S-STS assigned significantly different numbers of cases to many categories. Cronbach α was < 0.55 for the C-SSRS ideation scale and between 0.78 and 0.92 for the other scales.
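
As a companion to the internal consistency figures above, here is a minimal sketch of the standard Cronbach α formula, α = k/(k − 1) × (1 − Σ item variances / variance of total score). It is an illustration under stated assumptions, not the authors' code; the function name `cronbach_alpha` and the item scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a scale.

    item_scores: list of rows, one per respondent, each row holding
    that respondent's scores on the k items of the scale.
    """
    if len(item_scores) < 2:
        raise ValueError("Need at least 2 respondents.")
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("Need at least 2 items.")

    # Sample variance of each item (column) across respondents.
    columns = list(zip(*item_scores))
    item_variance_sum = sum(variance(col) for col in columns)

    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in item_scores])
    if total_variance == 0:
        raise ValueError("Total scores have zero variance; alpha is undefined.")

    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical 2-item scale answered by 3 respondents.
scores = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(scores)
```

For these made-up scores the item variances sum to 2 and the total-score variance is 3, giving α of about 0.67; items that moved together perfectly would give α = 1.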

Conclusions: All 3 assessments showed good accuracy for broad categories of suicidal ideation and behavior. More granular, specific categories usually were rated reliably, but the C-SSRS and S-STS differed significantly in which patients they assigned to these subcategories. Using any of these interviews would improve reliability over unstructured assessment in evaluating suicidal ideation and behavior. The clinical predictive validity of these interviews, and particularly of the more granular categories, remains to be shown.