The Pentagon is planning to let AI companies train models on classified data, a US defense official says.

Summary:
The Pentagon plans to let AI companies train military models on classified data; experts warn of security risks
MIT Technology Review has learned that the US Department of Defense is planning to set up secure environments in which generative AI companies can train military-specific AI models on classified US government data. The move marks a key step in the military's push to build an "AI-first" warfighting force, but it also raises new security concerns.
AI models such as Anthropic's Claude are already used in classified settings to answer questions, including for tasks like analyzing targets in Iran. Allowing models to train directly on classified data, however, would be an entirely new step. A US defense official said it is expected to make AI models more accurate and effective at specific tasks.
Under the plan, training would take place in data centers accredited to host classified government projects. Although the Department of Defense would retain ownership of the data at all times, in rare cases AI company personnel with appropriate security clearances might access it. The official stressed that before approving training on classified data, the Pentagon will first evaluate how well models perform when trained on non-classified data, such as commercial satellite imagery.
Analysts note that embedding sensitive intelligence such as surveillance reports or battlefield assessments directly into a model would pose unique risks. Aalok Mehta, director of the Wadhwani AI Center at the Center for Strategic and International Studies, warned that the biggest risk is that a model could leak classified information learned in training, such as the name of an operative, to users elsewhere in the military who are not authorized to see it, a hazard that is difficult to fully mitigate.
Mehta also believes, however, that with the right setup the risk of data leaking to the public internet or flowing back to AI companies is low. Palantir, the security giant, has already won contracts to build isolated environments in which officials can safely ask AI models about classified topics, though using them for training remains a new challenge.
The US military is now racing to fold AI into tasks ranging from ranking targets and recommending strikes to drafting contracts. Mehta noted that many tasks currently handled by human analysts, such as spotting subtle clues in images or connecting new information with historical context, may one day be handed to AI, which would require models to learn from vast, multilingual troves of classified text, audio, images, and video. Which specific military tasks would require such training, however, remains highly classified.
As tensions with Iran escalate, the Pentagon's demand for powerful AI models is surging; it has already reached agreements with companies including OpenAI and Elon Musk's xAI to run their models in classified settings. The disclosure of this training plan signals that the military's collaboration with AI companies is entering deeper, more sensitive territory.
English source:
The Pentagon is planning for AI companies to train on classified data, defense official says
The generative AI models used in classified environments can answer questions, but don't currently learn from the data they see. That could soon change.
The Pentagon is discussing plans to set up secure environments for generative AI companies to train military-specific versions of their models on classified data, MIT Technology Review has learned.
AI models like Anthropic's Claude are already used to answer questions in classified settings, including for analyzing targets in Iran. But allowing models to train on and learn from classified data would be a new development that presents unique security risks. It would mean that sensitive intelligence like surveillance reports or battlefield assessments becomes embedded in the models themselves, and it would bring AI firms into closer contact with classified data than before.
Training versions of AI models on classified data is expected to make them more accurate and effective in certain tasks, according to a US defense official who spoke on background with MIT Technology Review. The news comes as demand for more powerful models is high: the Pentagon has reached agreements with OpenAI and Elon Musk's xAI to operate their models in classified settings, and is implementing a new agenda to become an "AI-first" warfighting force as the conflict with Iran escalates. (The Pentagon did not comment on its AI training plans as of publication time.)
Training would be done in a secure data center that's accredited to host classified government projects, and where a copy of an AI model is paired with classified data, according to two people familiar with how such operations work. Though the Department of Defense would remain the owner of the data, personnel from AI companies with appropriate security clearances might in rare cases access the data, the official said.
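The article doesn't name any tooling, but the arrangement it describes, a copy of the model fine-tuned next to the data inside an accredited facility, is essentially offline fine-tuning. Below is a minimal sketch of what that could look like, assuming a Hugging Face-style stack; the paths, hyperparameters, and offline flags are illustrative assumptions, not details from the source.

```python
# Hypothetical sketch of fine-tuning inside an air-gapped, accredited data center.
# Paths and hyperparameters are invented; the article does not specify tooling.
import os

from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments)

os.environ["HF_HUB_OFFLINE"] = "1"        # forbid calls to the public model hub
os.environ["TRANSFORMERS_OFFLINE"] = "1"  # weights and data must already be on-site

# A copy of the vendor's model, delivered to the secure facility ahead of time.
tokenizer = AutoTokenizer.from_pretrained("/secure/models/base", local_files_only=True)
model = AutoModelForCausalLM.from_pretrained("/secure/models/base", local_files_only=True)

# Classified records stay on government-owned storage inside the enclave.
dataset = load_dataset("json", data_files="/secure/data/train.jsonl", split="train")

def tokenize(batch):
    out = tokenizer(batch["text"], truncation=True, max_length=1024)
    out["labels"] = out["input_ids"].copy()  # standard causal-LM objective
    return out

dataset = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="/secure/checkpoints",  # checkpoints never leave the enclave
        per_device_train_batch_size=1,
        num_train_epochs=1,
        report_to="none",                  # no external experiment tracking
    ),
    train_dataset=dataset,
)
trainer.train()
```

The offline flags only keep the toolchain from phoning home; in a real accredited facility, the containment boundary would be physical and network isolation, not application code.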
Before allowing this new training, though, the official said the Pentagon intends to first evaluate how accurate and effective models are when trained on non-classified data, like commercially available satellite imagery.
The military has long used computer vision models, an older form of AI, to identify objects in images and footage it collects from drones and airplanes, and federal agencies have awarded contracts to companies to train AI models on such content. And AI companies building large language models (LLMs) and chatbots have created versions of their models fine-tuned for government work, like Anthropic's Claude Gov, which are designed to operate across more languages and in secure environments. But the official's comments are the first indication that AI companies building LLMs, like OpenAI and xAI, could train government-specific versions of their models directly on classified data.
Aalok Mehta, who directs the Wadhwani AI Center at the Center for Strategic and International Studies and previously led AI policy efforts at Google and OpenAI, says training on classified data, as opposed to just answering questions about it, would present new risks.
The biggest of these, he says, is the fact that classified information these models train on could be resurfaced to anyone using the model. That would be a problem if lots of different military departments, all with different classification levels and needs for information, were to share the same AI.
"You can imagine, for example, a model that has access to some sort of sensitive human intelligence—like the name of an operative—leaking that information to a part of the Defense Department that isn't supposed to have access to that information," Mehta says. That could create a security risk for the operative, one that's difficult to perfectly mitigate if a particular model is used by more than one group within the military.
However, Mehta says, it’s not as hard to keep information contained from the broader world: "If you set this up right, you will have very little risk of that data being surfaced on the general internet or back to OpenAI." The government has some of the infrastructure for this already; the security giant Palantir has won sizable contracts for building a secure environment through which officials can ask AI models about classified topics without sending the information back to AI companies. But using these systems for training is still a new challenge.
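Mehta's distinction maps onto a familiar engineering trade-off: a retrieval setup can check a user's clearance before the model ever sees a document, whereas knowledge baked into model weights has no per-user filter. Here is a toy sketch of the retrieval-side pattern; the clearance scheme, corpus, and model stub are all invented for illustration.

```python
# Toy illustration of why inference-time retrieval is easier to contain than
# training: the access check runs per user, before the model sees anything.
from dataclasses import dataclass

@dataclass
class Document:
    text: str
    classification: int  # e.g., 1 = confidential, 2 = secret, 3 = top secret

CORPUS = [
    Document("Routine logistics summary.", classification=1),
    Document("Report naming a human-intelligence operative.", classification=3),
]

def query_model(prompt: str) -> str:
    # Stand-in for an enclave-hosted LLM; echoes its input for demonstration.
    return f"[model saw only]: {prompt}"

def answer(question: str, user_clearance: int) -> str:
    # Filter documents BEFORE building the prompt: this is the containment boundary.
    context = [d.text for d in CORPUS if d.classification <= user_clearance]
    return query_model("\n".join(context) + f"\n\nQuestion: {question}")

# A low-clearance user never gets the operative's report in context.
print(answer("Any updates?", user_clearance=1))

# Once the same report is trained into the weights, no equivalent per-user
# filter exists: any user of the shared model might resurface it.
```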
The Pentagon, spurred by a memo from Defense Secretary Pete Hegseth in January, has been racing to incorporate more AI. This has included combat uses, like generative AI ranking lists of targets and recommending which to strike first, and more administrative roles, like drafting contracts and reports.
There are lots of tasks currently handled by human analysts that the military might want to train leading AI models to perform, and doing so would require access to classified data, Mehta says. That could include learning to identify subtle clues in an image the way an analyst does, or connecting new information with historical context. The classified data could be pulled from the unfathomable amounts of text, audio, images, and video in many languages collected by intelligence services.
It's really hard to say which specific military tasks would require AI models to train on such data, Mehta cautions, "because obviously the Defense Department has lots of incentives to keep that information confidential, and they don't want other countries to know what kind of capabilities we have exactly in that space."
Article title: The Pentagon is planning for AI companies to train on classified data, defense official says
Article link: https://qimuai.cn/?post=3598
All articles on this site are original; unauthorized commercial use is prohibited.