OpenAI Launches GPT-5.2 Amid "Code Red" Crisis

Posted by qimuai on 2025-12-12 08:01 | First-hand compilation

Source: https://www.wired.com/story/openai-gpt-launch-gemini-code-red/

Summary:

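US artificial intelligence company OpenAI has officially launched the GPT-5.2 series, which it describes as its smartest AI model to date. The model posts gains across writing, coding, and reasoning benchmarks, and the company positions it as its best model yet for everyday professional use.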

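The launch comes amid intense competition. Just days earlier, CEO Sam Altman internally declared a "code red," a company-wide push to concentrate resources on improving ChatGPT. Fidji Simo, OpenAI's CEO of applications, told reporters in a briefing that the company has increased the resources focused on ChatGPT, but she denied that the code red moved up GPT-5.2's launch, saying the release had been in the works for months.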

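OpenAI faces strong challenges from Google and other tech giants. Google's recently launched Gemini 3 model was well received by the industry, and its Gemini app now has more than 650 million monthly active users, compared with ChatGPT's 800 million weekly active users. That pressure has forced OpenAI to rein in some of its most ambitious projects, including its work on bringing ads to ChatGPT, and to refocus on its core technology and products.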

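The GPT-5.2 series comes in three versions: Instant, which responds faster and is better for finding information; Thinking, which excels at coding, math, and planning; and Pro, the most powerful tier. According to the company, GPT-5.2 Thinking set the highest score to date on GDPval, an OpenAI benchmark that spans 44 real-world occupations, beating human professionals on more than 70 percent of tasks and completing them 11 times faster. On benchmarks of factual question answering, GPT-5.2 Thinking also hallucinated 38 percent less than GPT-5.1.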

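Alongside the performance gains, OpenAI is also addressing safety concerns. The company says it has strengthened the model's responses to sensitive prompts involving self-harm, mental health distress, and emotional reliance, and that it has begun an early rollout of its age-prediction system in certain countries, which automatically applies content protections for users it estimates are under 18. Simo said the company plans to launch an "adult mode" in the first quarter of 2026, allowing adult users to have less restricted conversations.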

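Strong benchmark scores aside, user experience remains the key test. When GPT-5 launched earlier this year, users objected to its colder responses, and the company released an update days later to make the model "warmer." Making ChatGPT enjoyable to talk to without making it overly sycophantic remains a central challenge for OpenAI.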

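According to The New York Times, an October internal memo from Nick Turley, OpenAI's head of ChatGPT, said the company was facing "the greatest competitive pressure we've ever seen" and set a goal of increasing daily active users by 5 percent before 2026. Caught between the push for technical progress and the demands of user experience, OpenAI is competing in an AI race that has reached fever pitch.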

English source:

OpenAI has introduced GPT-5.2, its smartest artificial intelligence model yet, with performance gains across writing, coding, and reasoning benchmarks. The launch comes just days after CEO Sam Altman internally declared a “code red,” a company-wide push to improve ChatGPT amid intense competition from rivals.
“We announced this code red to really signal to the company that we want to marshal resources in one particular area, and that's a way to really define priorities,” said OpenAI’s CEO of applications, Fidji Simo, in a briefing with reporters on Thursday. “We have had an increase in resources focused on ChatGPT in general.”
Simo denied that OpenAI had moved up GPT-5.2’s launch in light of its code red, claiming the company has been working on this model’s release for months. However, she said the additional resources around ChatGPT have been “helpful.”
While OpenAI’s models and products were considered best-in-class when ChatGPT launched in 2022, that’s no longer a settled matter. The startup now faces an array of worthy challengers, perhaps none more threatening than Google, whose recently launched Gemini 3 model was received well by the tech industry. Google’s Gemini app has grown at an impressive rate over the last year, now with more than 650 million monthly active users, compared to OpenAI’s 800 million weekly active users. That pressure has forced OpenAI to rein in some of its most ambitious projects, including its work on introducing ads to ChatGPT, and to refocus on improving its core technology and products.
Much like the company’s recent model launches, GPT-5.2 is shipping as a series of models: Instant, which responds faster and is better for information-finding; Thinking, which excels at coding, math, and planning; and Pro, the most powerful tier of OpenAI’s models that delivers higher accuracy on difficult questions.
OpenAI calls GPT-5.2 its best model yet for everyday professional use. GPT-5.2 Thinking notched the highest scores to date on GDPval, an OpenAI benchmark that compares performance between AI models and human professionals across 44 real-world occupations. The company says the model beat human professionals in over 70 percent of tasks, and completed them 11 times faster.
OpenAI’s post-training lead Max Schwarzer says the new release should also offer a substantial reduction in hallucinations. The company says GPT-5.2 Thinking hallucinated 38 percent less than GPT-5.1 on benchmarks measuring answers to factual questions.
The company is bringing GPT-5.2 to both ChatGPT users and developers on OpenAI’s API product. OpenAI says the new series of models “brings clear gains across everyday and advanced use cases.”
While GPT-5.2’s performance looks impressive on paper, benchmark scores only tell part of the story for any model launch. When OpenAI released GPT-5 earlier this year, users revolted over the model’s colder responses, a trait that’s difficult to measure through benchmarks alone. The company ended up releasing an update to GPT-5 days after the launch to make the model “warmer.”
A key tension around OpenAI’s model launches is making ChatGPT more enjoyable to chat with in order to drive up usage, without making the model overly sycophantic—the tendency for an AI model to be excessively agreeable. Over the last year, OpenAI has navigated a wide array of mental health challenges associated with ChatGPT usage. In October, the company released a report that found more than a million people talk to ChatGPT about suicide every week. That same month, a research leader behind the company’s mental health work internally announced her plans to leave OpenAI.
But in light of competitive pressures from Google and Meta, OpenAI has significant incentives to grow ChatGPT’s user base. In October, OpenAI’s head of ChatGPT, Nick Turley, sent a memo to the company declaring it was facing “the greatest competitive pressure we’ve ever seen,” according to The New York Times. To combat those pressures, Turley reportedly set a goal to increase daily active users by 5 percent before 2026.
With GPT-5.2, OpenAI says it has continued to strengthen ChatGPT’s responses to sensitive prompts indicating signs of self harm, mental health distress, or emotional reliance on a model. The company also says it’s in the early stages of rolling out its previously announced age-prediction model in certain countries. This system will allow the company to automatically apply content protections for users whom it estimates are under 18.
Simo says the company now plans to roll out its “adult mode” in the first quarter of 2026, which Altman previously indicated would allow users over 18 to have “erotic” conversations with ChatGPT.
