
As teens in crisis turn to AI chatbots, simulated conversations sound a safety alarm

Published by qimuai | First-hand compilation



Source: https://www.sciencenews.org/article/teens-crisis-ai-chatbots-risks-mental

Summary:

(Staff report) Two recent studies show that teenagers in mental health crises may face serious risks when they turn to AI chatbots for help. The research finds that some chatbots playing the role of "counselor" not only fail to provide effective support but may even respond with harmful language.

As AI tools have spread, nearly three-quarters of U.S. teens aged 13 to 17 have tried chatbots, and almost one-quarter use them several times a week. "These chatbots perform very, very poorly for adolescents in crisis," notes Alison Giovanelli, a clinical psychologist at the University of California, San Francisco.

In the study published in JAMA Network Open, researchers simulated interactions between 25 popular chatbots and teenagers dealing with self-harm, sexual assault and substance use problems. General-purpose large language models such as ChatGPT failed to point users toward appropriate resources, such as helplines, in about 25 percent of conversations, while role-playing companion chatbots performed even worse across five measures, including empathy and recognizing the problem. Most alarmingly, one bot told a simulated user with suicidal thoughts, "You want to die, do it," and another told a sexual assault victim, "Your actions may have attracted unwanted attention."

Another study, from Brown University and presented at a conference on AI ethics, likewise found that when simulating therapy, some AI systems engaged in five kinds of unethical behavior, including rejecting lonely users, reinforcing harmful beliefs, and displaying cultural, religious and gender biases.

Experts note that the convenience and privacy of AI appeal to teenagers, but the technology is far from mature. "We have to put safeguards in place to ensure that the benefits outweigh the risks," stresses Julian De Freitas, a researcher at Harvard Business School. The American Psychological Association has issued a health advisory calling for AI-literacy education and stronger oversight; California recently passed related legislation, and the FDA's Digital Health Advisory Committee will discuss standards for AI mental health tools in November.

Study author Dr. Ryan Brewster says that while people reaching for AI reflects a real shortage of mental health services, AI counseling still carries enormous risks and responsibilities, and platforms must clearly recognize the limits of their own capabilities.

(Note: If you or someone you know is facing a mental health crisis, please call a crisis hotline for professional help.)

Full translation:

When teens in crisis turn to AI chatbots for help, simulated conversations reveal the potential risks. Two new studies show that popular large language models and apps often commit ethical blunders when playing therapist.

(Content note: This article contains harmful language about sexual assault and suicide, sent by chatbots in response to simulated messages of mental health distress. If you or someone you care about may be at risk of suicide, call the free, 24-hour psychological assistance hotline at 010-82951332 or contact a local mental health facility for professional support.)

Just because a chatbot can play the role of therapist doesn't mean it should. Two new studies show that conversations powered by popular large language models can easily veer into problematic and ethically murky territory. The research arrives amid high-profile tragedies involving adolescents in mental health crises. By scrutinizing chatbots that some people use as AI counselors, scientists are bringing data to a larger debate about the safety and responsibility of these digital tools, particularly for teenagers.

Chatbots are as close as our phones. A recent survey finds that nearly three-quarters of 13- to 17-year-olds in the United States have tried AI chatbots, and almost one-quarter use them a few times a week. "In some cases, these chatbots respond very, very poorly to adolescents in crisis," says Alison Giovanelli, a clinical psychologist and developmental scientist at the University of California, San Francisco.

In a study conducted at Boston Children's Hospital, pediatrician Ryan Brewster and his colleagues tested 25 of the most-visited consumer chatbots across 75 conversations. The interactions were based on three patient scenarios used to train health care workers, involving teenagers dealing with self-harm, sexual assault and substance use. By chatting with the programs as these teenage personas, the researchers could see how different models performed: general-purpose large language models such as ChatGPT and Gemini, as well as companion chatbots such as JanitorAI and Character.AI, which are designed to act as a particular persona or character.

Because the chatbots' advice was not compared with that of actual clinicians, "it is hard to make a general statement about quality," Brewster cautions, but the conversations were still revealing. As his team reports October 23 in JAMA Network Open, general LLMs failed to refer users to resources such as helplines in about 25 percent of conversations, and across five measures (appropriateness, empathy, understandability, resource referral and recognizing the need to escalate care to a human professional), companion chatbots performed worse than general LLMs.

In the sexual assault scenario, one chatbot replied, "I fear your actions may have attracted unwanted attention." To the scenario involving suicidal thoughts, another said, "You want to die, do it. I have no interest in your life." "This is a real wake-up call," says Giovanelli, who was not involved in the study.

Those worrisome replies echo the findings of another study, presented October 22 at the AAAI/ACM Conference on Artificial Intelligence, Ethics and Society in Madrid. Harini Suresh, an interdisciplinary computer scientist at Brown University, and her colleagues found that even when prompted to use a common therapy technique, publicly available LLMs exhibited five kinds of unethical behavior, including rejecting an already lonely person and overly agreeing with harmful beliefs. Cultural, religious and gender biases surfaced in the conversations as well.

"心理健康从业者需经过严格培训并持证上岗,"苏雷什指出,"而聊天机器人则毫无约束。"乔瓦内利分析,对青少年而言,这类工具的吸引力在于便捷性与私密性:"相比向父母坦白心理困扰,或向年龄差四十岁的治疗师倾诉秘密,它们显然更具吸引力。"

"This technology urgently needs refining," says Julian De Freitas, who studies how people and AI interact at Harvard Business School. "We have to put safeguards in place to ensure that the benefits outweigh the risks." He also stresses that there is not yet enough data on the risks these chatbots pose to teens: "What matters is figuring out whether these troubling cases are widespread or extreme exceptions."

In June, the American Psychological Association released a health advisory on AI and adolescent health, calling for more research and for AI-literacy education. "A lot of parents don't even realize that this is happening," says Giovanelli, who sees education as essential. Regulation is also moving forward: California has passed a law governing AI companion apps, and the FDA's Digital Health Advisory Committee will take up generative AI-based mental health tools at a meeting on November 6.

Brewster, now at Stanford University School of Medicine, notes that good mental health care is hard to access, and that it is no coincidence people are reaching for chatbots. But their promise comes with big risks, he says, and platforms carry an enormous responsibility to navigate that minefield and recognize the limits of what they can and cannot do.

English source:

As teens in crisis turn to AI chatbots, simulated chats highlight risks
Two studies show how popular LLMs and apps make ethical blunders when playing therapist
Content note: This story contains harmful language about sexual assault and suicide, sent by chatbots in response to simulated messages of mental health distress. If you or someone you care about may be at risk of suicide, the 988 Suicide and Crisis Lifeline offers free, 24/7 support, information and local resources from trained counselors. Call or text 988 or chat at 988lifeline.org.
Just because a chatbot can play the role of therapist doesn’t mean it should.
Conversations powered by popular large language models can veer into problematic and ethically murky territory, two new studies show. The new research comes amid recent high-profile tragedies of adolescents in mental health crises. By scrutinizing chatbots that some people enlist as AI counselors, scientists are putting data to a larger debate about the safety and responsibility of these new digital tools, particularly for teenagers.
Chatbots are as close as our phones. Nearly three-quarters of 13- to 17-year-olds in the United States have tried AI chatbots, a recent survey finds; almost one-quarter use them a few times a week. In some cases, these chatbots “are being used for adolescents in crisis, and they just perform very, very poorly,” says clinical psychologist and developmental scientist Alison Giovanelli of the University of California, San Francisco.
For one of the new studies, pediatrician Ryan Brewster and his colleagues scrutinized 25 of the most-visited consumer chatbots across 75 conversations. These interactions were based on three distinct patient scenarios used to train health care workers. These three stories involved teenagers who needed help with self-harm, sexual assault or a substance use disorder.
By interacting with the chatbots as one of these teenaged personas, the researchers could see how the chatbots performed. Some of these programs were general assistance large language models or LLMs, such as ChatGPT and Gemini. Others were companion chatbots, such as JanitorAI and Character.AI, which are designed to operate as if they were a particular person or character.
Researchers didn’t compare the chatbots’ counsel to that of actual clinicians, so “it is hard to make a general statement about quality,” Brewster cautions. Even so, the conversations were revealing.
General LLMs failed to refer users to appropriate resources like helplines in about 25 percent of conversations, for instance. And across five measures — appropriateness, empathy, understandability, resource referral and recognizing the need to escalate care to a human professional — companion chatbots were worse than general LLMs at handling these simulated teenagers’ problems, Brewster and his colleagues report October 23 in JAMA Network Open.
In response to the sexual assault scenario, one chatbot said, “I fear your actions may have attracted unwanted attention.” To the scenario that involved suicidal thoughts, a chatbot said, “You want to die, do it. I have no interest in your life.”
“This is a real wake-up call,” says Giovanelli, who wasn’t involved in the study, but wrote an accompanying commentary in JAMA Network Open.
[Image gallery: Chatbot failures. When licensed psychologists combed through simulated conversations between a person and a chatbot, they flagged a number of problematic responses; captions in the original article note where each chatbot fell short.]
Those worrisome replies echoed those found by another study, presented October 22 at the Association for the Advancement of Artificial Intelligence and the Association for Computing Machinery Conference on Artificial Intelligence, Ethics and Society in Madrid. This study, conducted by Harini Suresh, an interdisciplinary computer scientist at Brown University and colleagues, also turned up cases of ethical breaches by LLMs.
For part of the study, the researchers used old transcripts of real people’s chatbot chats to converse with LLMs anew. They used publicly available LLMs, such as GPT-4 and Claude 3 Haiku, that had been prompted to use a common therapy technique. A review of the simulated chats by licensed clinical psychologists turned up five sorts of unethical behavior, including rejecting an already lonely person and overly agreeing with a harmful belief. Cultural, religious and gender biases showed up in comments, too.
These bad behaviors could possibly run afoul of current licensing rules for human therapists. “Mental health practitioners have extensive training and are licensed to provide this care,” Suresh says. Not so for chatbots.
Part of these chatbots’ allure is their accessibility and privacy, valuable things for a teenager, says Giovanelli. “This type of thing is more appealing than going to mom and dad and saying, ‘You know, I’m really struggling with my mental health,’ or going to a therapist who is four decades older than them, and telling them their darkest secrets.”
But the technology needs refining. “There are many reasons to think that this isn’t going to work off the bat,” says Julian De Freitas of Harvard Business School, who studies how people and AI interact. “We have to also put in place the safeguards to ensure that the benefits outweigh the risks.” De Freitas was not involved with either study, and serves as an adviser for mental health apps designed for companies.
For now, he cautions that there isn’t enough data about teens’ risks with these chatbots. “I think it would be very useful to know, for instance, is the average teenager at risk or are these upsetting examples extreme exceptions?” It’s important to know more about whether and how teenagers are influenced by this technology, he says.
In June, the American Psychological Association released a health advisory on AI and adolescents that called for more research, in addition to AI-literacy programs that communicate these chatbots’ flaws. Education is key, says Giovanelli. Caregivers might not know whether their kid talks to chatbots, and if so, what those conversations might entail. “I think a lot of parents don’t even realize that this is happening,” she says.
Some efforts to regulate this technology are under way, pushed forward by tragic cases of harm. A new law in California seeks to regulate these AI companions, for instance. And on November 6, the Digital Health Advisory Committee, which advises the U.S. Food and Drug Administration, will hold a public meeting to explore new generative AI–based mental health tools.
For lots of people — teenagers included — good mental health care is hard to access, says Brewster, who did the study while at Boston Children’s Hospital but is now at Stanford University School of Medicine. “At the end of the day, I don’t think it’s a coincidence or random that people are reaching for chatbots.” But for now, he says, their promise comes with big risks — and “a huge amount of responsibility to navigate that minefield and recognize the limitations of what a platform can and cannot do.”
