Clinical relevance: A new study finds that AI ambient scribes create richer mental health notes but are linked to fewer diagnoses, prescriptions, and referrals.

  • AI ambient scribes are spreading fast in U.S. primary care, promising lighter documentation loads and more face time with patients.
  • AI-documented visits were less likely to include a depression diagnosis, medication, or behavioral health referral.
  • The findings complicate the hype surrounding AI, suggesting better documentation alone doesn’t guarantee better mental health care.

(Editor’s note: Nearly every day the news headlines proclaim how AI will change our lives – for better or worse. So consider this your latest update.)

AI-powered “ambient scribes” across the United States are now listening in on doctor-patient conversations. Providers are using these virtual assistants to automatically generate clinical notes in a bid to streamline documentation while also freeing up more face time with patients.

Now new research reveals that when it comes to mental health care, these high-tech note-takers might be revolutionizing more than just clinical workflow. In a large retrospective analysis appearing in JAMA Psychiatry, researchers report that primary care visits documented with AI ambient scribes contain more detailed descriptions of psychiatric symptoms. However, those same visits are also less likely to result in concrete action.

Methodology

The study, led by investigators at Massachusetts General Hospital and Harvard Medical School, analyzed more than 20,000 routine annual primary care visits from two major New England academic health systems. Roughly a quarter of those visits included the use of an AI ambient scribe. The rest were documented with a human scribe, with no scribe at all, or in the year before AI scribes were introduced.

The researchers carefully matched visits across demographic groups and depression diagnoses. Patients looked a lot alike on paper: average age was 48, about 60% were women, and roughly 5% met the criteria for moderate or worse depression.

But the startling differences showed up in the notes.

More Than Dictation

Using a large language model to analyze the reams of narrative documentation, the research team assessed symptom severity across six domains – as defined by the National Institute of Mental Health’s Research Domain Criteria framework. Across every domain, AI-scribed notes scored dramatically higher than notes taken by human scribes or written by clinicians working on their own.

Simply put, AI-generated notes painted a fleshed-out portrait of patients’ psychological states. Oddly enough, though, that added insight didn’t exactly translate into more (or even better) treatment.

Visits that the AI “secretaries” transcribed appeared to be less likely to include a depression diagnosis code, antidepressant prescription, or referral for behavioral health care. Fewer than 15% of AI-scribed visits included at least one of those interventions, compared with roughly 17% of visits documented by human (or no) scribes.

After adjusting for sociodemographic and clinical factors, AI-scribed visits still showed a much lower likelihood of psychiatric intervention. Human scribes, by contrast, weren’t associated with a similar drop.

Making Sense of What They Found

The finding raises an interesting (if unsettling) question: If we can document mental health symptoms more thoroughly now, why isn’t it moving the needle on care?

The researchers don’t exactly answer that question, at least not directly. But they do offer up a few theories.

One possibility is that automation influences clinician engagement, however subconsciously. The authors note that this pattern has shown up in other fields that rely on automation. According to this school of thought, when the system’s doing more of the work, humans tend to do less.

Another explanation isn’t quite as damning. AI scribes, they suggest, might simply be documenting symptoms that clinicians already recognize as subthreshold or situational. Or they might surface emotional content that feels clinically familiar but doesn’t rise to the level of actionable information in a time-limited annual visit.

Still, the pattern is striking, especially given long-standing efforts to boost depression recognition and treatment in primary care settings.

What Else?

What the researchers uncovered also throws into question a common narrative surrounding AI scribes. Much of the existing research has centered on clinician burnout, time spent on electronic health records, and job satisfaction, and those studies have produced generally positive results. Far fewer studies have asked whether AI-mediated documentation influences patient care.

Even so, as ambient AI scribes gain widespread acceptance – while still avoiding the attention of regulators – the findings offer a timely reminder. Better documentation doesn’t necessarily mean better care. Sure, AI might be getting better at documenting what patients say. But whether it helps clinicians do their jobs remains up in the air, at least for now.
