LLMs are unsuitable replacements for mental health professionals
A recent study from the Stanford University
The Stanford University study examined the suitability of large language models (LLMs) as replacements for mental health professionals and concluded that, due to serious shortcomings, this technology should not be used for therapeutic purposes.
Methodological approach
The researchers systematically reviewed therapy guidelines from major medical institutions to identify key aspects of therapeutic relationships. They then conducted experiments to test whether LLMs can reproduce these core therapeutic competencies.
Key findings
1. Stigmatization of mental illness
The study shows that LLMs express stigma toward people with mental illness. This was particularly pronounced in the case of:
Alcohol dependence
Schizophrenia
This is particularly problematic, as destigmatization is a central component of professional psychological care.
2. Inappropriate responses in critical situations
The experiments revealed dangerous and therapeutically contraindicated responses from the LLMs:
In the case of suicidal thoughts: The AI gave clients who expressed suicidal thoughts specific examples of high bridges – a response that is considered highly dangerous in real therapy and would never be used.
Delusions: LLMs failed to reassure clients with delusions that they were “probably alive,” which is a fundamental therapeutic intervention in such conditions.
Promotion of delusional thinking: The study documents cases in which LLMs actively supported delusional thinking instead of responding in a therapeutically appropriate manner.
3. Lack of essential human characteristics
The researchers identified fundamental barriers that prevent LLMs from establishing therapeutic relationships:
Lack of genuine understanding and empathy
Absence of human characteristics necessary for a therapeutic alliance
Inability to navigate the complex interpersonal dynamics of a therapy session
Conclusion
The study clearly concludes that LLMs should not be used as a substitute for therapists due to practical and fundamental obstacles. The documented risks—from stigmatization to potentially life-threatening advice—show that the technology is unsuitable for this sensitive area.
The authors emphasize that artificial intelligence cannot replicate the human characteristics necessary for a successful therapeutic relationship, which fundamentally questions the use of LLMs in psychological care.
Potential areas of application
However, the researchers identify the following functional areas of application:
Standardized patients: LLMs can serve as training partners for prospective therapists to simulate conversation situations.
Conducting initial assessments: Automated collection of basic patient information before contacting the therapist.
Medical history assessment: Systematic collection of medical history.
Classification of therapeutic interactions: Analysis and categorization of conversation elements to support documentation
The study emphasizes that humans must always remain involved in the process (“maintaining a human in the loop”) for all of these applications. This underscores that LLMs are intended solely as tools for professionals and not as replacements for them.
The distinction between support functions and primary therapeutic responsibility shows that the technology has potential, but only in clearly defined, complementary roles under professional supervision.


