Clinical Summary: Detecting Tardive Dyskinesia Using Video-Based Artificial Intelligence

Patients taking antipsychotics need regular monitoring for tardive dyskinesia (TD), but in-person Abnormal Involuntary Movement Scale (AIMS) assessments are time-intensive and often inconsistently implemented, especially in telepsychiatry. This study tests whether smartphone video plus artificial intelligence can reliably flag suspected TD for clinician follow-up before symptoms become more disabling.

Design: We report the results of 3 studies that sought to stratify the risk of suspected TD in patients exposed to antipsychotic medications.
N: The final dataset included 351 participants and 3,979 video responses, each made up of clips of roughly 1–2 seconds showing normal or abnormal movements.
Population: Participants were recruited, using institutional review board–approved materials, from clinic populations of county social services organizations in Northeast Ohio and from behavioral health community clubhouse settings in New York City; all provided informed consent.
Setting: In the first 2 studies, participants’ video data were collected in a clinical setting, and the assessment was done remotely online. In the third study, the AIMS was conducted in a clinical setting, and the video data were collected in participants’ homes or other convenient locations.

Key Findings

  • When the model was trained on all available data, the AUC ranged from 0.85 to 0.98 across the available test sets; performance improved from an initial AUC of 0.77 (95% CI, 0.679–0.859) in Study 1 to 0.98 on the Study 1 test data using the full model trained on all 3 training sets.
  • At the reported operating point (threshold set at 5.1), the model’s sensitivity was 0.820 and its specificity was 0.821.
  • In Study 2, reviewer agreement on binary TD status was limited before adjudication, with an average Cohen κ of 0.37 ± 0.05 and a Fleiss κ of 0.35; after iterative review, the average Cohen κ was 0.57 ± 0.03 and the Fleiss κ was 0.58.
  • Using the same data as the reviewers, the machine learning model achieved a Cohen κ of 0.51, and when utilizing the full dataset, the model’s Cohen κ increased to 0.61.
  • Because their video quality fell below the threshold required for analysis, 72 participants (17%) were excluded from the evaluation dataset.
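
The sensitivity, specificity, and Cohen κ figures above follow from standard formulas, which the minimal Python sketch below illustrates. It is not the study's code. The function names, the toy scores and labels, and the convention that a score at or above the threshold counts as a positive screen are all assumptions made for illustration; only the threshold value of 5.1 comes from the summary.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = TD, 0 = no TD).

    Assumption: a score >= threshold is treated as a positive screen.
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    tn = sum(1 for s, y in pairs if s < threshold and y == 0)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    return tp / (tp + fn), tn / (tn + fp)


def cohen_kappa(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Cohen's kappa for two raters giving binary labels to the same cases."""
    n = len(rater_a)
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    p_a = sum(rater_a) / n  # share of cases rater A called positive
    p_b = sum(rater_b) / n  # share of cases rater B called positive
    # Agreement expected by chance if the two raters were independent.
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (observed - expected) / (1 - expected)


# Made-up toy data, not taken from the study.
toy_scores = [6.2, 5.5, 4.0, 7.1, 3.2, 5.0, 2.8, 6.0]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_calls = [1 if s >= 5.1 else 0 for s in toy_scores]

sens, spec = sensitivity_specificity(toy_scores, toy_labels, threshold=5.1)
kappa = cohen_kappa(toy_calls, toy_labels)
```

On this toy data, 3 of 4 TD cases and 3 of 4 non-TD cases are classified correctly, so sensitivity and specificity are both 0.75 and κ is 0.5. Because κ subtracts the agreement expected by chance, it reads lower than raw percent agreement.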

Clinical Bottom Line

Smartphone-recorded video analyzed with AI identified suspected tardive dyskinesia with high discrimination and with agreement that matched or exceeded that of trained human raters. In practice, this supports remote TD screening to trigger timely clinician AIMS evaluation, not to replace diagnostic assessment.

Practice Implications

  • Use this type of tool as a remote triage layer for patients taking antipsychotic medication, especially when routine in-person TD monitoring is hard to deliver at the recommended frequency.
  • Treat a positive AI screen as a prompt for clinician assessment: the article states that when the algorithm identifies suspected TD, a psychiatrist is justified in evaluating and determining a diagnosis and actions to follow.
  • Build video-quality safeguards into the workflow, because 17% of participants were excluded for poor-quality video; patients with inadequate capture should be referred for an in-person or telehealth AIMS with the provider.
  • Maintain clinical suspicion when lower-extremity movements are a concern, because the protocol does not assess legs and feet directly and isolated toe or foot movements could be missed.
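
The triage logic implied by these points can be sketched in Python as follows. This is an illustrative sketch only, not the article's software. The `ScreenResult` fields, the `Action` names, the `triage` function, and the rule that a score at or above the threshold counts as a positive screen are hypothetical. The default threshold of 5.1 is the operating point reported above, and routing a lower-extremity concern to clinician assessment is one possible reading of "maintain clinical suspicion".

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    REFER_FOR_AIMS = "refer for in-person or telehealth AIMS"
    CLINICIAN_ASSESSMENT = "clinician evaluates suspected TD"
    ROUTINE_MONITORING = "continue routine monitoring"


@dataclass
class ScreenResult:
    video_quality_ok: bool                 # False if capture was inadequate
    score: Optional[float]                 # model score; None if not analyzable
    lower_extremity_concern: bool = False  # legs and feet are not assessed by video


def triage(result: ScreenResult, threshold: float = 5.1) -> Action:
    """Map one remote screen to a follow-up action (hypothetical rules)."""
    # The video protocol cannot rule out isolated toe or foot movements.
    if result.lower_extremity_concern:
        return Action.CLINICIAN_ASSESSMENT
    # Inadequate capture: fall back to a provider-administered AIMS.
    if not result.video_quality_ok or result.score is None:
        return Action.REFER_FOR_AIMS
    # A positive screen prompts clinician assessment; it is not a diagnosis.
    if result.score >= threshold:
        return Action.CLINICIAN_ASSESSMENT
    return Action.ROUTINE_MONITORING


# Example with made-up values.
next_step = triage(ScreenResult(video_quality_ok=True, score=6.3))
```

In this sketch a positive screen never yields a diagnosis; every branch ends in either clinician follow-up or continued monitoring, in keeping with the screening role described above.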
Physicians Postgraduate Press, Inc. (PPP) makes no warranties about the accuracy or completeness of any information published in The Journal of Clinical Psychiatry or other PPP materials, and disclaims liability for any use or non-use of that information. Clinicians should not rely solely on these materials and should exercise their own professional judgment when making patient care decisions on an individualized basis.