
A conference just tested AI agents' ability to do science

Posted by qimuai · compiled and translated firsthand


Source: https://www.sciencenews.org/article/science-conference-test-ai-agents

Summary:

Agents4Science 2025, the first virtual academic conference to put "AI agents" at the center as researchers, has drawn attention across the scientific community. Held on October 22, the event required that every submitted paper be primarily the work of artificial intelligence, with humans participating only as collaborators throughout the research process.

The conference received 314 submissions, of which 48 passed review. Each accepted paper had to disclose in detail how humans and AI divided the work at every stage of the research: from formulating hypotheses and analyzing data to providing the first round of peer review, AI agents (systems that pair large language models with specialized tools) took the lead, and human experts then gave the top papers a final review.

James Zou, a computer scientist at Stanford University and co-organizer of the conference, said the aim was to explore the limits of AI's research capabilities. Most scientific journals and conferences currently ban AI from participating in research, but that conservative stance, he noted, makes it hard for the research community to objectively assess what AI can actually do in science.

The AI-assisted research presented at the conference spanned fields including economics and biology. Min Min Fong, an economist at the University of California, Berkeley, and her team collaborated with AI to analyze car-towing data from San Francisco; they found that AI could substantially speed up computation but also misstated key facts. "You have to be really careful," Fong stressed. "The core scientific work still remains human-driven."

Risa Wechsler, a computational astrophysicist at Stanford who served as a reviewer, said the AI-produced papers she saw were technically correct but generally lacked scientific value. She cautioned that AI's technical skill can "mask poor scientific judgment."

Still, the conference offered encouraging signs. Silvia Terragni, a machine learning engineer at Upwork in San Francisco, said a paper she conceived together with ChatGPT was selected as one of the conference's three top papers; the study examined applying AI reasoning in a job marketplace. "AI can actually come up with novel ideas," she said.

All materials from this experimental conference are publicly available, providing a valuable reference for assessing AI's potential in scientific research.


Original English article:

A conference just tested AI agents’ ability to do science
“You have to be really careful when working with AI,” says one participant
In a first, a scientific conference welcomed paper submissions from any area of science, but with one catch: AI had to do most of the work. Called Agents4Science 2025, the Oct. 22 virtual event focused on the work of artificial intelligence agents — systems that pair large language models with other tools or databases to perform multistep tasks.
From formulating hypotheses to analyzing data and providing the first round of peer reviews, AI agents took the lead. Human reviewers then stepped in to assess the top submissions. In all, 48 papers out of 314 made the cut. Each had to detail how people and AI collaborated on every stage of the research and writing process.
“We’re seeing this interesting paradigm shift,” said James Zou, a computer scientist at Stanford University who co-organized the conference. “People are starting to explore using AI as a co-scientist.”
Most scientific journals and meetings currently ban AI coauthors and prohibit peer reviewers from relying on AI. These policies aim to avoid hallucinations and other issues related to AI use. However, this approach makes it tough to learn how good AI is at science. That’s what Agents4Science aimed to explore, Zou said, calling the conference an experiment, with all the materials publicly available for anyone to study.
At the virtual meeting, humans presented AI-assisted work spanning fields such as economics, biology and engineering. Min Min Fong, an economist at the University of California, Berkeley, and her team collaborated with AI to study car-towing data from San Francisco. Their study found that waiving high towing fees helped low-income people keep their vehicles.
“AI was really great at helping us with computational acceleration,” Fong said. But, she found, “you have to be really careful when working with AI.”
As an example, the AI kept citing the wrong date for when San Francisco’s rule waiving towing fees went into effect. Fong had to check this in the original source to discover the error. “The core scientific work still remains human-driven,” she said.
For Risa Wechsler, a computational astrophysicist at Stanford who helped review submissions, the results were mixed. The papers she saw were technically correct, she said, “but they were neither interesting nor important.” She was excited about the potential of AI for research but remained unconvinced that today’s agents can “design robust scientific questions.” And, she added, the technical skill of AI can “mask poor scientific judgment.”
Still, the event included some glimmers of hope for the future of AI in science. Silvia Terragni, a machine learning engineer at the company Upwork in San Francisco, said that she gave ChatGPT some context about the kinds of problems her company deals with and asked the bot to propose paper ideas. “One of these was the winner,” she said, selected as one of the three top papers in the conference. It was a study about using AI reasoning in a job marketplace. “I think [AI] can actually come up with novel ideas,” she said.
