
OpenAI Is Asking Contractors to Upload Past Work to Evaluate the Performance of AI Agents

Published by qimuai · Reads: 6 · First-hand compilation



Source: https://www.wired.com/story/openai-contractor-upload-real-work-documents-ai-agents/

Summary:

According to internal documents obtained by WIRED, OpenAI is recruiting third-party contractors through a partner firm and asking them to upload examples of real work tasks from their current or previous jobs, to be used to evaluate the performance of its next-generation AI models. The project aims to establish a baseline of human work performance across industries against which the models' abilities on professional tasks can be measured.

OpenAI says in a confidential document that it has hired people across occupations to help collect real-world tasks modeled on full-time work, so it can measure how well AI models perform in practice. Contractors are asked to submit long-running or complex tasks they have handled, together with the original work files (Word documents, PowerPoint decks, Excel spreadsheets, and the like); they may also submit fabricated examples created for simulated scenarios.

The company stresses repeatedly in its instructions that submissions must be real, on-the-job work the contractor actually completed, consisting of two parts: the task request and the final deliverable. One example in the materials shows an executive at a luxury concierge company drafting a Bahamas yacht itinerary for a client. OpenAI asks contractors to strip out corporate intellectual property, personal information, and trade secrets, and mentions a ChatGPT tool called “Superstar Scrubbing” that helps clean sensitive content.

Evan Brown, an intellectual property lawyer, notes that an AI company receiving workplace documents from contractors at this scale could face trade-secret misappropriation claims. Even after scrubbing, contractors may still be violating nondisclosure agreements with former employers, and because the AI company relies on contractors to judge what is confidential, the arrangement carries legal risk.

The effort reveals a new strategy AI labs are using to sharpen their models' real-world abilities. As demand for high-quality training data surges, companies such as OpenAI and Anthropic are sourcing enterprise-grade task data through networks of professional contractors, spawning a multibillion-dollar sub-industry in AI training. OpenAI reportedly also explored acquiring scrubbed internal data from bankrupt companies, but the idea was dropped for lack of confidence that personal information could be completely removed. Neither OpenAI nor its training-data partner Handshake AI has commented.


Original English source:

OpenAI is asking third-party contractors to upload real assignments and tasks from their current or previous workplaces so that it can use the data to evaluate the performance of its next-generation AI models, according to records from OpenAI and the training data company Handshake AI obtained by WIRED.

The project appears to be part of OpenAI’s efforts to establish a human baseline for different tasks that can then be compared with AI models. In September, the company launched a new evaluation process to measure the performance of its AI models against human professionals across a variety of industries. OpenAI says this is a key indicator of its progress towards achieving AGI, or an AI system that outperforms humans at most economically valuable tasks.

“We’ve hired folks across occupations to help collect real-world tasks modeled off those you’ve done in your full-time jobs, so we can measure how well AI models perform on those tasks,” reads one confidential document from OpenAI. “Take existing pieces of long-term or complex work (hours or days+) that you’ve done in your occupation and turn each into a task.”

OpenAI is asking contractors to describe tasks they’ve done in their current job or in the past and to upload real examples of work they did, according to an OpenAI presentation about the project viewed by WIRED. Each of the examples should be “a concrete output (not a summary of the file, but the actual file), e.g., Word doc, PDF, Powerpoint, Excel, image, repo,” the presentation notes. OpenAI says people can also share fabricated work examples created to demonstrate how they would realistically respond in specific scenarios.

OpenAI and Handshake AI declined to comment.

Real-world tasks have two components, according to the OpenAI presentation. There’s the task request (what a person’s manager or colleague told them to do) and the task deliverable (the actual work they produced in response to that request). The company emphasizes multiple times in instructions that the examples contractors share should reflect “real, on-the-job work” that the person has “actually done.”
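
The two-part structure described in the presentation can be sketched as a minimal record type. This is a hypothetical illustration only; the class and field names are assumptions, not OpenAI's actual schema.

```python
from dataclasses import dataclass

@dataclass
class RealWorldTask:
    """One example in the shape the presentation describes:
    a request plus the deliverable produced in response to it."""
    task_request: str      # what a manager or colleague asked for
    deliverable_path: str  # path to the actual output file (Word doc, PDF, deck, ...)

example = RealWorldTask(
    task_request=("Prepare a short, 2-page PDF draft of a 7-day yacht trip "
                  "overview to the Bahamas for a family traveling there for the first time."),
    deliverable_path="bahamas_itinerary.pdf",
)
```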
One example in the OpenAI presentation outlines a task from a “Senior Lifestyle Manager at a luxury concierge company for ultra-high-net-worth individuals.” The goal is to “prepare a short, 2-page PDF draft of a 7-day yacht trip overview to the Bahamas for a family who will be traveling there for the first time.” It includes additional details regarding the family’s interests and what the itinerary should look like. The “experienced human deliverable” then shows what the contractor in this case would upload: a real Bahamas itinerary created for a client.

OpenAI instructs the contractors to delete corporate intellectual property and personally identifiable information from the work files they upload. Under a section labeled “Important reminders,” OpenAI tells the workers to “remove or anonymize any: personal information, proprietary or confidential data, material nonpublic information (e.g., internal strategy, unreleased product details).”
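
In its simplest form, the scrubbing step described above might look like pattern-based redaction. The sketch below is a generic illustration under that assumption; it is not the “Superstar Scrubbing” tool, and its two patterns catch only the most structured identifiers.

```python
import re

# Minimal, illustrative redaction patterns. Real scrubbing needs far more
# than regexes: names, client details, and internal strategy cannot be
# pattern-matched and still require human review.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def scrub(text: str) -> str:
    """Replace each pattern match with a [REDACTED:<label>] marker."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text

print(scrub("Contact jane.doe@example.com or +1 (555) 123-4567."))
```

False negatives are the dangerous failure mode here, which is why the judgment ultimately falls back on the contractor doing the upload.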
One of the files viewed by WIRED mentions a ChatGPT tool called “Superstar Scrubbing” that provides advice on how to delete confidential information.

Evan Brown, an intellectual property lawyer with Neal & McDevitt, tells WIRED that AI labs that receive confidential information from contractors at this scale could be subject to trade secret misappropriation claims. Contractors who offer documents from their previous workplaces to an AI company, even scrubbed, could be at risk of violating their previous employers’ nondisclosure agreements or exposing trade secrets.

“The AI lab is putting a lot of trust in its contractors to decide what is and isn’t confidential,” says Brown. “If they do let something slip through, are the AI labs really taking the time to determine what is and isn’t a trade secret? It seems to me that the AI lab is putting itself at great risk.”

The documents reveal one strategy AI labs are using to prepare their models to excel at real-world tasks. Firms like OpenAI, Anthropic, and Google are hiring armies of contractors who can generate high-quality training data in order to develop AI agents capable of automating enterprise work.

AI labs have long relied on third-party contracting firms such as Surge, Mercor, and Scale AI to hire and manage networks of data contractors. In recent years, however, AI labs have required higher-quality data in order to improve their models, forcing them to pay more for skilled talent capable of producing it. That has created a lucrative sub-industry within the AI training world. Handshake said it was valued at $3.5 billion in 2022, while Surge reportedly valued itself at $25 billion in fundraising talks last summer.

OpenAI appears to have explored other ways of sourcing real company data. An individual who helps companies sell assets after they go out of business told WIRED that a representative of OpenAI inquired about obtaining data from these firms, provided that personally identifiable information could be removed. The source, who spoke to WIRED on condition of anonymity because they did not want to sour any business relationships, said the data would have included documents, emails, and other internal communications. The source said they chose not to pursue the idea because they were not confident that personal information could be completely scrubbed.
