
DeepMind Unveils Gemini Robotics 1.5, Giving Robots Reasoning Abilities

Posted by qimuai · First-hand compilation



Source: https://aibusiness.com/robotics/deepmind-gemini-robotics-reasoning-powers

Summary:

Google releases Gemini Robotics 1.5, a new-generation robotics model with markedly better autonomous task planning

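DeepMind, Google's artificial intelligence research arm, recently announced Gemini Robotics 1.5, its latest vision-language-action model. The upgrade is meant to strengthen robots' perception of their environment and their "thinking" ability, so that they can plan and carry out complex tasks with greater autonomy.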

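The release comprises two complementary systems: Gemini Robotics 1.5, which converts visual input and instructions into motor commands, and Gemini Robotics-ER 1.5, which has "embodied reasoning" capabilities. The latter can use digital tools such as web search to plan a task in advance and then hand it to the former for execution. Working together, the two let a robot "think before it acts": it can explain the logic behind its decisions and adapt to context-dependent jobs, such as sorting laundry by color or packing a suitcase according to the weather.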
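The plan-then-act handoff described above can be illustrated with a small Python sketch. It is not the Gemini API: the class names `ReasoningPlanner` and `ActionExecutor`, the `run_task` helper, and the stubbed weather tool are all hypothetical stand-ins, and the packing logic is invented purely to show one model planning with a tool while another executes each step.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    """One natural-language instruction handed from planner to executor."""
    instruction: str


class ReasoningPlanner:
    """Hypothetical stand-in for the embodied-reasoning role (not a real API)."""

    def __init__(self, tools: Dict[str, Callable[[str], str]]):
        self.tools = tools  # named digital tools the planner may consult

    def plan(self, goal: str) -> List[Step]:
        # Consult a tool first, then turn its answer into ordered steps.
        forecast = self.tools["weather"](goal)
        if "rain" in forecast:
            items = ["umbrella", "raincoat"]
        else:
            items = ["sunglasses", "hat"]
        return [Step(f"put the {item} in the suitcase") for item in items]


class ActionExecutor:
    """Hypothetical stand-in for the vision-language-action role."""

    def execute(self, step: Step, observation: str) -> str:
        # A real model would emit motor commands; this stub returns a log line.
        return f"[{observation}] executing: {step.instruction}"


def run_task(goal: str, planner: ReasoningPlanner, executor: ActionExecutor,
             observe: Callable[[], str]) -> List[str]:
    """Plan once, then execute every step against a fresh observation."""
    return [executor.execute(step, observe()) for step in planner.plan(goal)]


def fake_weather_lookup(query: str) -> str:
    # Stubbed tool: always reports rain so the example is deterministic.
    return "rain expected"


if __name__ == "__main__":
    planner = ReasoningPlanner(tools={"weather": fake_weather_lookup})
    for line in run_task("pack a suitcase for London", planner,
                         ActionExecutor(), lambda: "camera frame"):
        print(line)
```

Keeping the planner and executor behind separate interfaces mirrors the division of labor the article describes: the planner alone touches tools, and the executor sees only one instruction and one observation at a time.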

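In its official blog post of Sept. 25, DeepMind said the release marks a "foundational step" toward artificial general intelligence (AGI). The new system goes beyond models that only respond to commands, creating agents that can truly reason, plan, actively use tools and generalize.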

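Another core advance is skill transfer across robot platforms. In tests, a task learned by the dual-arm ALOHA2 robot could be transferred directly to the Franka bi-arm robot and even to Apptronik's humanoid robot Apollo, without retraining. According to DeepMind, this speeds up the learning of new behaviors and helps robots become smarter and more useful.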

Gemini Robotics-ER 1.5 will be made available to developers through the Gemini API in Google AI Studio, while Gemini Robotics 1.5 is limited to select partners.

Translation of the source article:

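Google's AI research unit said the new models mark another step toward artificial general intelligence.
Google DeepMind has released Gemini Robotics 1.5, the latest version in its family of vision-language-action models.
Google says the upgrade is designed to improve robots' perception and "thinking" abilities, enabling them to plan and carry out complex tasks with greater autonomy.
The release contains two complementary systems: Gemini Robotics 1.5, which turns visual input and instructions into motor commands, and Gemini Robotics-ER 1.5, an "embodied reasoning" model that uses digital tools such as web search to plan tasks. Together they work in a "think first, then act" mode.
The system lets robots explain their decisions before acting and adjust how they work to the context, for example sorting laundry by color or packing a suitcase according to the weather.
In its launch blog post of Sept. 25, DeepMind called Gemini Robotics 1.5 a "foundational step" toward artificial general intelligence.
"Gemini Robotics 1.5 marks an important milestone toward solving AGI in the physical world," the company said. "By introducing agentic capabilities, we're moving beyond models that react to commands and creating systems that can truly reason, plan, actively use tools and generalize."
Another key feature of the upgrade is that robots can share skills across different systems and form factors.
In tests, a task learned by the dual-arm ALOHA2 robot could be transferred directly to the Franka bi-arm robot, and even to Apptronik's humanoid robot Apollo, without retraining.
ALOHA2 is an open-source hardware and software project developed jointly by DeepMind and Stanford University researchers. The Franka robot is a project of Germany's Agile Robots AG. Austin-based Apptronik competes with Elon Musk's Tesla Optimus humanoid robot project.
"This breakthrough accelerates learning new behaviors, helping robots become smarter and more useful," DeepMind said.
Google will make Gemini Robotics-ER 1.5 available to developers through the Gemini API in Google AI Studio, while Gemini Robotics 1.5 is limited to select partners.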

English source:

Google's AI research unit said the new models mark a step closer to artificial general intelligence.
Google DeepMind unveiled Gemini Robotics 1.5, the latest iteration in its line of vision-language-action models.
According to Google, the upgrade is designed to give robots greater perception and "thinking" capabilities, enabling them to plan and conduct complex tasks with greater autonomy than before.
The release includes two complementary systems: Gemini Robotics 1.5, which translates visual input and instructions into motor commands, and Gemini Robotics-ER 1.5, an "embodied reasoning" model that uses digital tools such as web search to plan tasks before handing execution over to its counterpart.
Together, the models allow robots to "think" before they act, explaining their decision-making and adapting to context-dependent jobs -- such as separating laundry by color or packing a suitcase depending on the weather.
In a blog post about the launch on Sept. 25, DeepMind said Gemini Robotics 1.5 marks a "foundational step" toward artificial general intelligence (AGI).
"Gemini Robotics 1.5 marks an important milestone toward solving AGI in the physical world," the company said. "By introducing agentic capabilities, we’re moving beyond models that react to commands and creating systems that can truly reason, plan, actively use tools and generalize."
Another key feature of the upgrade is the ability for robots to share skills across different systems and forms.
In tests, Google DeepMind found that a task learned by the dual-arm ALOHA2 robot could be directly transferred to the Franka bi-arm robot and even to Apptronik's humanoid Apollo robot, without retraining.
ALOHA2 is an open source hardware and software project developed collaboratively by DeepMind and Stanford University researchers. The Franka robot is a project of Germany-based Agile Robots AG. Austin-based Apptronik is challenging Elon Musk's Tesla Optimus humanoid robot project.
"This breakthrough accelerates learning new behaviors, helping robots become smarter and more useful," DeepMind said.
Google will roll out Gemini Robotics-ER 1.5 to developers through the Gemini API in Google AI Studio, though only select partners will have access to Gemini Robotics 1.5.