Should a large language model (LLM) be used as a therapist? In this paper, we
investigate the use of LLMs to replace mental health providers, a use case
promoted in the tech startup and research space. We conduct a mapping review of
therapy guides used by major medical institutions to identify crucial aspects
of therapeutic relationships, such as the importance of a therapeutic alliance
between therapist and client. We then assess the ability of current LLMs, such as gpt-4o, to reproduce and adhere to these aspects of therapeutic relationships through several experiments probing their responses.
Contrary to best practices in the medical community, LLMs 1) express stigma
toward those with mental health conditions and 2) respond inappropriately to
certain common (and critical) conditions in naturalistic therapy settings –
e.g., LLMs encourage clients’ delusional thinking, likely due to their
sycophancy. This occurs even with larger and newer LLMs, indicating that
current safety practices may not address these gaps. Furthermore, we note
foundational and practical barriers to the adoption of LLMs as therapists, such as the fact that a therapeutic alliance requires human characteristics (e.g., identity
and stakes). For these reasons, we conclude that LLMs should not replace
therapists, and we discuss alternative roles for LLMs in clinical therapy.