Google Launches a Conversational Photo Editor: A Rare Practical AI Feature
Source: https://www.wired.com/story/google-photos-conversational-photo-editor/
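Summary:
[News summary] Smartphones are becoming a major platform for putting generative AI into practice. Apple, Google, and other giants have shipped a range of AI features, but most have limited practical value and have not become part of daily life. Google's new “conversational photo editing” feature, which debuted on the Pixel 10 and is now available on other Android devices that support it, could be the breakthrough that changes this.
The feature is built into Google Photos. Users describe the edit they want by voice or text (for example, “fix the lighting” or “remove the plastic bag in the background”), and the system carries out the edit automatically, with no manual parameter adjustment. Chris Harrison, director of the Future Interfaces Group at Carnegie Mellon University, says the design lowers the barrier to entry: ChatGPT remains a novelty for most people, he argues, but Google's editing tool should see far wider use, at least among anyone who can apply an Instagram filter.
Compared with traditional professional software such as Photoshop, which requires a paid subscription and is complex to operate, Google's tool edits photos from conversational instructions and can even handle creative requests such as “Add King Kong climbing the Empire State Building.” In the reviewer's testing, restoring an old photo took seconds and looked good, while expanding the frame worked with varying degrees of success. Edits apply to the whole image, so fine-grained local adjustments remain limited.
To address the risk of AI-manipulated images, Google attaches C2PA content credentials, IPTC metadata, and a SynthID watermark to edited photos so that their origin can be traced. Harrison argues that image retouching is not a new concept: “If anyone thinks Instagram is real life, they're in for a rude awakening. This is just a new tool; it's not a new concept, it's just a more powerful version of what has existed.”
The article frames the feature as a sign that human-computer interaction is shifting from computers as tools to computers as partners: users can direct AI in natural language without specialist skills. This low-effort design may be a key step toward making AI genuinely mainstream.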
Translation:
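Smartphones have become the proving ground for new AI and generative AI features. Apple made a big push last year with its Apple Intelligence suite, which includes Image Playground for generating images from scratch and Writing Tools for rewriting and summarizing text. On the latest iPhone 17 running iOS 26, machine intelligence powers live translation in calls and messages. Google's Android has many similar features, and the latest Pixel 10 phones can even generate a copy of the user's voice for real-time translation on calls.
As WIRED's resident smartphone reviewer, I have tested all of these phones and the features they advertise. Very few of those features are practical enough to simplify everyday life, or simple enough that I could see my parents using them. Isn't that what AI is supposed to deliver?
That changed when I tried the new conversational editing feature in Google Photos. It debuted on the Pixel 10 and is now available on some Android devices. You type or say the edit you want instead of hunting through menus and sliders. Most people do not realize how powerful the software built into their phones is. This feature gives you nearly frictionless access to all of the editing tools, and it also helps you rediscover what your smartphone can do.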
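The Voice Control Vision
The idea of commanding a computer by voice to complete tasks has been around for decades. Hollywood has its own vision (HAL 9000 in 2001: A Space Odyssey is perhaps the most iconic, and darkest, depiction), while researchers have a different one. Pixeltone, a prototype app developed by Adobe Research and the University of Michigan, showed how voice could be combined with touch for photo editing. Twelve years ago, a viewer commented under the demo video: "Why so much hate? It isn't for the 'real' photographer, but for my dad, that sometimes uses Photoshop; this is great."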
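The Double-Edged Sword of Democratized Technology
The spread of powerful photo editing tools carries obvious risks: bad actors, for example, can easily use them to spread disinformation. But most of today's editing tools have to be sought out and take skill to use. Google's conversational editor is different: it is powerful, simple to use, controlled by plain English, and one tap away in Google Photos.
"For many people, ChatGPT is a fun novelty," says Chris Harrison, director of the Future Interfaces Group at Carnegie Mellon University. He expects Google's new editing tool to be used far more widely, at least by anyone able to use an Instagram filter. "AI should be making things easier to use, and this is a great example consumers will have a genuine interest in."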
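User-Friendly Design Wins
Clear signposting greatly lowers the barrier to use. Many AI chat interfaces greet you with a blank text box. Here, the conversational window appears as soon as you tap Edit in Google Photos, which is a clever touch, because the feature is within reach once you are already in the context of editing a photo. "Human laziness always wins," Harrison says.
Previously, removing a street lamp from a photo meant paying for a pricey Photoshop subscription and knowing the basics of photo editing. "People probably wanted this feature beforehand, but didn't want to have the cost of going into Photoshop and blowing half an hour to modify one photo," Harrison says. Google's conversational editor can fix the lighting, clear clutter from the background, and crop the frame, and it can even carry out creative instructions such as "Add King Kong climbing the Empire State Building." That, of course, raises the manipulation risks of generative AI. Harrison acknowledges the controversy but believes it will eventually blow over.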
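Safeguards for Authenticity
In response to these concerns, images edited with Google's new tool carry C2PA content credentials, IPTC metadata, and a SynthID watermark, which record the use of AI and allow the file's origin to be traced. These measures signal to other image software that a photo has been edited.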
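Conversational Editing in Practice
Editing photos on a phone has never been a great experience: you swipe through multiple tabs, and sliders are hard to adjust precisely with a finger. Google has offered one-tap AI enhancement before, but the results were hit or miss. Conversational editing puts the user back in control: you describe the result you want by voice or text. Even if all you can say is "make it look better," the system will adjust the crop and lighting and may even add a portrait blur. Instructions such as "fix the lighting" and "remove the reflections" also work well.
The tool still has limits: it cannot move elements around the frame, and edits are applied uniformly to the whole image. For example, when editing a portrait of my wife, I wanted to keep the stark shadows on her body and reduce only the highlights on her face. Instead, highlights were reduced across the entire image, which ruined the shadows in the lower half of the frame (although the lighting on her face did improve). Unlike Lightroom or Photoshop, which allow local adjustments, you are still limited by the editing capabilities of Google Photos.
Want to remove an unsightly plastic bag from a photo? Just ask. Is the frame too tight? Ask to "expand it a bit" and the AI will fill in the new edges, with varying success. The generative AI features are, of course, entirely optional.
What impressed me most was "restoring" a photo from my early childhood: the system cleaned up the noise, retouched the image, and boosted the contrast. I could have done it by hand, but the AI finished in seconds what would have taken me several minutes.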
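A Revolution in Human-Computer Interaction
All of these capabilities point to a leap in how people interact with computers. "Photoshop is a tool," Harrison says. "I'm using it as a very powerful tool with maybe a sprinkling of AI features. But computer scientists have been really thinking about this for the past half-century: When is this change going to happen from computers as tools to computers as partners, and it's a really seminal shift in how we think about computing."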
English source:
All products featured on WIRED are independently selected by our editors. However, we may receive compensation from retailers and/or from purchases of products through these links.
The smartphone has become the playground for new AI and generative AI features.
Apple made a significant push last year with Apple Intelligence, featuring tools like Image Playground, which allows you to create images from scratch, and Writing Tools that can rewrite and summarize text. On the latest iPhone 17 running iOS 26, machine intelligence powers the new live translation features in calls and messages. Google has many of the same features on Android; the latest Pixel 10 phones can generate a version of your voice for use in real-time language translations on calls.
As WIRED's resident smartphone reviewer, I've tested all of these phones and their hyped-up features. Very few of these capabilities have really felt like a practical, useful feature designed to make everyday life easier—something I could even see my parents using. That's what AI is supposed to do, right?
That's until I tried Google's new Ask Photos conversational editing feature in Google Photos, which debuted on the Pixel 10 phones and is now available on Android devices that can support it. The feature lets you type or speak out the visual edits you want to see in your photos without fumbling with menus and sliders. Most people have no idea how powerful the software on their phones already is, and so by being able to access all the editing tools that are available and use them to execute your desired task, this feature not only gives you the results you want in a nearly frictionless way, but it also helps you better understand what your smartphone is capable of.
Speak Your Mind
The idea of talking to a computer and having it complete tasks for you has been around for decades. Hollywood has its own idea of what this looks like (HAL 9000 in 2001: A Space Odyssey is perhaps the most iconic—and dark—depiction), but researchers have another.
A prototype app called Pixeltone, developed by Adobe Research and the University of Michigan, showed the possibility of using voice control and touch for photo editing. The top comment on the YouTube video demonstrating the capability is this one, left by a viewer 12 years ago: “Why so much hate? It isn't for the ‘real’ photographer, but for my dad, that sometimes uses Photoshop; this is great.”
The democratization of powerful photo editing tools has clear dangers, like the ease with which bad actors can use them to propagate disinformation and manipulate the truth. But most of today's editing tools require users to actively seek them out and require skill to use effectively. Google's conversational editor is different. It's powerful, simple, and controlled by plain English. And it's one tap away in your Google Photos library.
“For many people, ChatGPT is a fun novelty,” says Chris Harrison, director of the Future Interfaces Group at Carnegie Mellon University. “Some people have adopted it into their workflows, but for the vast majority of people, it's a novelty.” Harrison believes Google's new editing tool will be used far more widely, at least by anyone savvy enough to use an Instagram filter. “AI should be making things easier to use, and this is a great example consumers will have a genuine interest in.”
Clear signposting makes Google's photo editor more accessible. Many AI chatbot interfaces start with a blank textbox that offers little insight into their capabilities, and that's no help to people who are unsure where to start. But having the conversational editor pop up as soon as you tap Edit on Google Photos makes it immensely easier to use, because it's right there after you've already established context that you're editing a photo. “Human laziness always wins,” Harrison says.
You've always been able to go into Adobe Photoshop and paint out a street lamp from a photo, but Photoshop subscriptions are pricey, and the tools require a base-level understanding of photo editing, not to mention familiarity with Photoshop's capabilities. “People probably wanted this feature beforehand, but didn't want to have the cost of going into Photoshop and blowing half an hour to modify one photo,” Harrison says.
Google's conversational editor goes past the usual edits like fixing the lighting, erasing plastic trash bags from the background, and cropping. You can ask it to “Add King Kong climbing the Empire State Building,” and voilà. It can also erase people from photos.
That brings us back to the threats of manipulation that these generative AI features present. Harrison acknowledges the pushback but believes it will largely blow over.
“That's what people have been doing with their smartphone-captured photographs since the beginning of time,” he says. “If anyone thinks Instagram is real life, they're in for a rude awakening. This is just a new tool; it's not a new concept, it's just a more powerful version of what has existed.”
To address these concerns, images edited with Google's new tool have C2PA content credentials, IPTC metadata, and SynthID to watermark and log the use of AI in media and trace the file's origin. These steps make it clear to other image editing software and diagnostic tools that the photos have been edited.
Conversational Editing
Editing pictures on a smartphone isn't very fun. There are multiple tabs you have to swipe through, and sliders can be hard to precisely move with your finger. Google has experimented with AI-powered edits before—a single tap to have the algorithm edit the photo to what it thinks you want—but the results can be hit or miss.
With conversational editing, you're in control. Just tell the textbox, either via voice or typing, what you want to see in the image. And if you don't know the words to use, I've been experimenting with “make it look better” and gotten pretty good results. I've seen the tool adjust crops, improve lighting, and even add a portrait blur effect. “Fix the lighting” or “remove the reflections” also work really well.
The tool isn't perfect. It can't perform some actions, like moving subjects around the frame, and edits are applied uniformly to the whole image. For example, when editing a portrait of my wife, I wanted to retain the stark shadows on her body but bring down the highlights on her face. Google Photos just reduced the highlights across the board, ruining the shadows in the bottom half of the frame (though it did improve the lighting on her face). Unlike Lightroom or Photoshop, where you can control exactly where you want to adjust these parameters, you're limited by the editing capabilities of Google Photos.
Got an unsightly plastic bag in the photo? Ask to remove it. Is the photo too cropped in? You can ask to expand it a bit more, and Google will use generative AI to fill the new extra space with what it thinks should be there (with varying degrees of success). If you don't want to use these generative AI editing features, you don't have to.
Perhaps most impressive was when I asked it to “restore” a photo from when I was a baby. It cleaned up the image, improved colors, and boosted contrast. Could I have done it myself? Sure, but it would have taken me several minutes, and this was done in seconds.
All of these capabilities point to the next leap in how we interact with computers. “Photoshop is a tool,” Harrison says. “I'm using it as a very powerful tool with maybe a sprinkling of AI features. But computer scientists have been really thinking about this for the past half-century: When is this change going to happen from computers as tools to computers as partners, and it's a really seminal shift in how we think about computing.”