
OpenAI and Anthropic will start predicting whether users are underage

Published by qimuai · First-hand compilation



Source: https://www.theverge.com/news/847780/openai-anthropic-teen-safety-chatgpt-claude

Summary:

OpenAI and Anthropic, two of the leading companies in the AI industry, have each announced new safety measures aimed at underage users. As global scrutiny of AI ethics and teen online safety intensifies, both companies are tightening protections for minors through a mix of technical upgrades and more detailed rules.

OpenAI announced an update to ChatGPT's Model Spec that adds four new principles for users aged 13 to 17. The updated spec requires ChatGPT to put teen safety first, even when that may conflict with other design goals. Concretely, that means steering teens toward safer options when safety conflicts with other user interests such as "maximum intellectual freedom"; promoting real-world support, including encouraging offline relationships and recommending trusted offline help when conversations move into higher-risk territory; and treating teens like teens, responding with "warmth and respect" rather than condescension or treating them like adults.

The change comes as lawmakers in several countries step up pressure on AI companies. OpenAI is currently facing a lawsuit alleging that ChatGPT provided self-harm and suicide instructions to a teen who took his own life. The company has since rolled out parental controls and said ChatGPT will no longer discuss suicide with teens. The spec update strengthens these safeguards further: when signs of "imminent risk" appear, ChatGPT will urge teens to contact emergency services or crisis resources.

Meanwhile, OpenAI is in the early stages of building an age prediction model that tries to estimate a user's age from their interactions. If the system judges that a user may be under 18, teen safeguards are applied automatically; adults who are falsely flagged as minors can verify their age to remove them. The gating logic this describes is simple, as sketched below.
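Neither the model's inputs nor its architecture have been disclosed, so the following is only a minimal sketch of the described flow, assuming a binary "suspected minor" signal. Every name in it (Session, predict_is_minor, route_session) is hypothetical, not OpenAI's actual API.

```python
from dataclasses import dataclass

@dataclass
class Session:
    user_id: str
    age_verified_adult: bool = False  # set only after explicit age verification
    teen_safeguards_on: bool = False

def predict_is_minor(messages: list[str]) -> bool:
    """Hypothetical stand-in for OpenAI's undisclosed age prediction model.

    A real system would use a learned classifier over many signals; this
    stub only marks where that signal enters the pipeline.
    """
    return any("i'm in high school" in m.lower() for m in messages)

def route_session(session: Session, messages: list[str]) -> Session:
    """Apply the gating flow the article describes."""
    if session.age_verified_adult:
        # A verified adult is never moved back into teen mode.
        return session
    if predict_is_minor(messages):
        # Suspected minors get teen safeguards automatically; a falsely
        # flagged adult can later verify their age to lift them.
        session.teen_safeguards_on = True
    return session
```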

Anthropic, for its part, is focused on identifying and blocking underage users. The company does not allow anyone under 18 to use its chatbot Claude, and it is developing a system to detect "subtle conversational signs that a user might be underage." It already flags accounts whose users self-identify as minors during chats, with detection of less explicit signals still to come; a sketch of the already-shipped check follows.
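Anthropic has not said how its flagging works, but the piece it describes as already live — catching users who state an under-18 age in chat — can be approximated with plain pattern matching, with the planned "subtle signs" detector layered on top as a learned classifier. A minimal, hypothetical sketch:

```python
import re

# Hypothetical pattern for explicit self-disclosure, e.g. "I'm 15" or
# "I am 16 years old". A production system would be far more robust.
SELF_DISCLOSURE = re.compile(
    r"\bI(?:'m| am)\s+(\d{1,2})(?:\s+years?\s+old)?\b", re.IGNORECASE
)

def self_disclosed_age(message: str) -> int | None:
    """Return the age a message states, if any."""
    match = SELF_DISCLOSURE.search(message)
    return int(match.group(1)) if match else None

def flag_if_minor(messages: list[str]) -> bool:
    """Flag a conversation in which the user self-identifies as under 18."""
    return any(
        (age := self_disclosed_age(m)) is not None and age < 18
        for m in messages
    )
```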

On safety training, Anthropic also detailed how it trains Claude to respond to prompts about suicide and self-harm, and reported progress on reducing sycophancy, the tendency to reaffirm a user's harmful thinking. The company says its latest models are the least sycophantic of any to date, with Haiku 4.5 performing best, correcting its sycophantic behavior 37 percent of the time. Anthropic concedes that all of its models still show "significant room for improvement" here, which it attributes to a trade-off between model warmth and sycophancy.
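The 37 percent figure implies an evaluation that scores, case by case, whether the model walks back an initially sycophantic response, then reports the fraction corrected. Anthropic has not published its harness, so the sketch below is only one plausible shape for such a metric; the EvalCase fields and correction_rate function are assumptions.

```python
from dataclasses import dataclass

@dataclass
class EvalCase:
    prompt: str            # designed to elicit sycophantic agreement
    was_sycophantic: bool  # did the model initially validate a harmful claim?
    corrected: bool        # did it then push back or correct itself?

def correction_rate(cases: list[EvalCase]) -> float:
    """Fraction of sycophancy-eliciting cases the model corrected.

    On this (assumed) reading of the metric, Haiku 4.5 scoring 0.37
    means it corrected its sycophantic behavior 37% of the time.
    """
    scored = [c for c in cases if c.was_sycophantic]
    if not scored:
        return 0.0
    return sum(c.corrected for c in scored) / len(scored)
```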

Taken together, these moves show the AI industry proactively building stricter protections for minors, answering public concern over teen mental health and online safety with upgrades on both the technical and the policy front.


English source:

OpenAI and Anthropic are rolling out new ways to detect underage users. As OpenAI has updated its guidelines on how ChatGPT should interact with users between the ages of 13 and 17, Anthropic is working on a new way to identify and boot users who are under 18.
OpenAI and Anthropic will start predicting when users are underage
OpenAI is also updating ChatGPT’s guidelines on how to talk to teens.
On Thursday, OpenAI announced that ChatGPT’s Model Spec — the guidelines for how its chatbot should behave — will include four new principles for users under 18. Now, it aims to have ChatGPT “put teen safety first, even when it may conflict with other goals.” That means guiding teens toward safer options when other user interests, like “maximum intellectual freedom,” conflict with safety concerns.
It also says ChatGPT should “promote real-world support,” including by encouraging offline relationships, while laying out how ChatGPT should set clear expectations when interacting with younger users. The Model Spec says ChatGPT should “treat teens like teens” by offering “warmth and respect” instead of providing condescending answers or treating teens like adults.
The change comes as lawmakers turn up the pressure on AI companies and their chatbots over their potential impact on mental health. OpenAI is currently facing a lawsuit alleging that ChatGPT provided instructions for self-harm and suicide to a teen who took his own life. OpenAI later rolled out parental controls and said ChatGPT will no longer talk about suicide with teens. It’s part of a larger push for online regulation that also includes mandatory age verification for a number of services.
OpenAI says the update to ChatGPT’s Model Spec should result in “stronger guardrails, safer alternatives, and encouragement to seek trusted offline support when conversations move into higher-risk territory.” The company adds that ChatGPT will urge teens to contact emergency services or crisis resources if there are signs of “imminent risk.”
Along with this change, OpenAI says it’s in the “early stages” of launching an age prediction model that will attempt to estimate someone’s age. If it detects that someone may be under 18, OpenAI will automatically apply teen safeguards. It will also give adults the chance to verify their age if they were falsely flagged by the system.
Anthropic, which doesn’t allow users under 18 to chat with Claude, is rolling out measures that it will use to detect and disable the accounts of underage users. It’s developing a new system capable of detecting “subtle conversational signs that a user might be underage,” and says it already flags users who self-identify as a minor during chats.
Anthropic also outlines how it trains Claude to respond to prompts about suicide and self-harm, as well as its progress at reducing sycophancy, which can reaffirm harmful thinking. The company says its latest models “are the least sycophantic of any to date,” with Haiku 4.5 performing the best, as it corrected its sycophantic behavior 37 percent of the time.
“On face value, this evaluation shows there is significant room for improvement for all of our models,” Anthropic says. “We think the results reflect a trade-off between model warmth or friendliness on the one hand, and sycophancy on the other.”
Update, December 18th: Clarified that Anthropic doesn’t allow users under 18 to use Claude.

