Clinical relevance: A new study warns that AI “therapy chatbots” often violate core ethical standards, raising questions about their safety and accountability.

  • The bots consistently showed poor contextual understanding, weak collaboration skills, deceptive empathy, bias, and unsafe crisis handling.
  • In multiple test sessions, the bots often gave generic advice, reinforced delusions, and misread cultural or family dynamics, exposing users to potential psychological harm.
  • The team urged regulators to impose stricter safety standards, boost cultural competence, and mandate real human supervision.

As the AI bubble continues to stretch – Meta laid off 600 employees from its nascent AI division earlier this week – university researchers have issued another warning about the evolving technology.

A Brown University-led team of researchers presented a paper at the AAAI/ACM Conference on AI, Ethics, and Society (AIES 2025) in Madrid that raises further questions about AI's use as a mental health tool.

As AI-powered “therapy chatbots” grow in popularity, the researchers warn that the bots might be crossing ethical lines that lie at the heart of mental health care.

Methodology

The Brown researchers – with help from their colleagues at Louisiana State – spent 18 months evaluating how large language model (LLM) “counselors” behave when asked to act like therapists and uncovered sweeping violations of established professional standards.

Across 137 sessions – which included 110 “self-counseling” tests by trained peer counselors and 27 simulated clinical interactions reviewed by licensed psychologists – the models demonstrated 15 recurring ethical failures that fell into one of five themes:

  • Lack of contextual understanding
  • Poor therapeutic collaboration
  • Deceptive empathy
  • Unfair discrimination
  • Lack of safety and crisis management

A One-Size-Fits-All Approach

The team found that the LLMs had a hard time tailoring advice to individual needs, cultures, and identities. Instead, they offered generic “one-therapy-fits-all” interventions that ignored context, while often misreading cultural or family dynamics.

Figure: A text conversation in which a user describes depressive symptoms and the chatbot’s response suggests seeking professional help, with a caution noted about the risk of relying on a chatbot. Credit: Brown University.

One model, for example, reassured a distressed user from the Global South that “self-worth is not defined by others,” while ignoring her guilt about disobeying cultural rules, an omission psychologists deemed ethically negligent. Licensed reviewers said such rigidity betrayed the nuanced, adaptive reasoning required of real therapists.

Zainab Iftikhar, a Brown computer science Ph.D. candidate who led the work, acknowledges that while human therapists are also susceptible to these ethical risks, the key difference is accountability.

“For human therapists, there are governing boards and mechanisms for providers to be held professionally liable for mistreatment and malpractice,” Iftikhar pointed out. “But when LLM counselors make these violations, there are no established regulatory frameworks.”

No Collaboration. Or Accountability.

The researchers also found that AI counselors frequently dominated conversations, issuing long, pedantic responses that left little room for patient reflection. Some validated unhealthy beliefs or even “gaslit” users by implying they had caused their own distress.

In one session, a chatbot reinforced a user’s delusion that her father wished she hadn’t been born.

The researchers also flagged the models’ “deceptive empathy” – evident in phrases like “I hear you” or “Oh dear friend, I see you” – as a major ethical concern. Such language mimicked emotional understanding but lacked any real human self-awareness, leading users to form false bonds with non-sentient systems.

“It’s humanizing an experience that is not human,” one clinician noted.

AI Bias, Blind Spots, and Dangerous Silences

Beyond an embarrassingly bad bedside manner, the models displayed biases and blind spots with real-world consequences. They prioritized Western values – encouraging independence over family harmony, for instance – and inconsistently moderated gendered content, flagging female perpetrators more harshly than male ones. Some even labeled discussions of minority religious rituals as “extremist content.”

But the study’s authors suggest that the most unsettling revelation was the technology’s inability to handle crisis situations appropriately. When users mentioned suicidal thoughts, several models disengaged or ended conversations abruptly, failing to offer resources such as hotline numbers. Clinicians refer to this as “abandonment,” a direct breach of the profession’s duty to prevent harm.

Stronger Oversight and Tougher Standards

The researchers conclude by urging policymakers to establish legal and ethical frameworks for AI mental health tools. They recommend certification processes akin to FDA medical-device reviews, mandatory oversight by licensed professionals, and strict limits on chatbots’ direct engagement with vulnerable users.

Without these guardrails, they warn, LLM counselors might expose people “to unmitigated psychological risk while operating entirely outside the bounds of professional accountability.”

The bottom line, they argue, is that empathy can’t be faked. And therapy can’t be automated without consequences.

Further Reading

Illinois Outlaws AI in Therapy Sessions

AI Might Actually Change Minds About Conspiracy Theories—Here’s How

AI Pinpoints Schizophrenia, Bipolar Brain Signatures