Toward provably private insights into AI use

Source: https://research.google/blog/toward-provably-private-insights-into-ai-use/
Summary:
Google researchers have announced a first-of-its-kind "provably private insights" system. By combining large language models, differential privacy, and trusted execution environments, it analyzes how generative AI tools are used without exposing users' raw data, setting a new bar for privacy-preserving technology.
The technology is already deployed in the Recorder app on Pixel phones. When users opt in through the "Improve for everyone" setting, the system uses an open-source Gemma model to analyze uploaded, encrypted transcripts inside a trusted execution environment, classify them into content categories (such as "business meetings" or "notes to self"), and then produce statistical histograms with differential privacy. Throughout the process, the raw data remains under the encrypted protection of the trusted execution environment, and only anonymized, aggregated results are released.
At its core is an upgraded "confidential federated analytics" framework: open-source analysis software runs inside trusted execution environments, a key management service strictly constrains how the data may be processed, and a public transparency log lets outsiders verify what the system is running. The code has been open-sourced on GitHub so that third parties can fully verify the privacy claims.
The technology gives generative AI applications that handle sensitive data a verifiable privacy-protection scheme: developers can tune model performance through automated rating while individual user data remains uninspectable. As support is added for high-performance hardware such as tensor processing units, the system is expected to handle more complex privacy-preserving analyses.
Original English text:
Toward provably private insights into AI use
October 30, 2025
Artem Lagzdin, Software Engineer, and Daniel Ramage, Research Director, Google Research
We detail how confidential federated analytics technology is leveraged to understand on-device generative AI features, ensuring strong transparency in user data handling and analysis.
Generative AI (GenAI) enables personalized experiences and powers the creation of unstructured data, including summaries, transcriptions, and more. Insights into real-world AI use [1, 2] can help GenAI developers enhance their tools by understanding common applications and identifying failure modes. And especially when those tools are applied to on-device data, our goal is to offer increasingly robust privacy guarantees during the insight generation process. This post introduces provably private insights (PPI), a new north star for generating dynamic insights into how people use LLMs and GenAI tools while guaranteeing that individual data is not inspectable and that aggregate insights are anonymous.
Today we announce a first-of-its-kind PPI system that leverages the power of large language models (LLMs), differential privacy (DP), and trusted execution environments (TEEs) to analyze unstructured GenAI data. This system proves that server-side processing is limited to privacy-preserving computations and can be fully externally inspected. With our system, GenAI tool developers can analyze interactions using a “data expert” LLM, tasked with answering questions like “what topic is being discussed?” or “is the user frustrated?” The LLM’s answers are aggregated with DP to provide a comprehensive view of GenAI feature usage across the user population without exposing unaggregated data. The “data expert” LLM itself resides within the TEE. PPI is enabled by confidential federated analytics (CFA), a technique first deployed in Gboard, where open source analysis software runs in TEEs, offering complete transparency into the mechanisms and privacy properties of server-side data processing. Our deployment of PPI in the Recorder application for Pixel leverages Google’s latest open-source Gemma models as the “data expert” to offer insights into Recorder usage.
To encourage the external community to verify our claims, we’ve open-sourced LLM-powered privacy preserving insights as part of confidential federated analytics in Google Parfait, along with the rest of our TEE-hosted confidential federated analytics stack.
How provably private insights are possible
Google’s CFA leverages confidential computing to protect unaggregated user data during processing, and only releases outputs with a formal (user-level) DP guarantee. CFA provides strong data isolation and anonymization guarantees regardless of what query an analyst runs.
In this technique, user devices first decide what data should be uploaded for analysis. Devices encrypt and upload this data, along with the processing steps that the server is authorized to use for decryption. Uploaded data is protected with encryption keys managed by a TEE-hosted key management service, which releases decryption keys only to device-approved processing steps. Devices can verify that the key management service is the expected open source code (included in a public, tamper-resistant transparency log, Rekor), and that the code is running in a properly configured TEE that is inaccessible to Google. The key management service in turn verifies that the approved, public processing steps are running in TEEs. No other analyses can be performed on the data and no human can access data from individual devices.
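To make this flow concrete, here is a minimal Python sketch of the device-side steps. The helper and field names (KeystoreAttestation, hybrid_encrypt, device_upload) and the step identifiers are illustrative assumptions, not the actual Google Parfait or Project Oak interfaces: the device checks the keystore's measurement against the transparency log, declares which processing steps it authorizes, and encrypts its payload to the attested key.

```python
# Hypothetical sketch of the device-side CFA upload flow; names are illustrative.
from dataclasses import dataclass

@dataclass
class KeystoreAttestation:
    code_digest: bytes   # measurement of the key-management binary
    public_key: bytes    # encryption key served by the TEE-hosted keystore

def hybrid_encrypt(public_key: bytes, plaintext: bytes, associated_data: bytes) -> bytes:
    """Placeholder for HPKE-style hybrid public-key encryption."""
    return b"<ciphertext bound to access policy>"

def device_upload(transcript: str,
                  attestation: KeystoreAttestation,
                  logged_digests: set) -> dict:
    # 1. Verify the keystore binary is a published, reviewable build by checking
    #    its measurement against the tamper-resistant transparency log (Rekor).
    if attestation.code_digest not in logged_digests:
        raise RuntimeError("keystore code is not a logged open-source build")

    # 2. Name the only processing steps the server may use this data for.
    access_policy = {"allowed_steps": ["gemma3_structured_summarization",
                                       "dp_histogram_aggregation"]}

    # 3. Encrypt to the attested keystore key, binding the policy so decryption
    #    keys are released only to approved, attested processing steps.
    ciphertext = hybrid_encrypt(attestation.public_key,
                                transcript.encode(),
                                repr(access_policy).encode())
    return {"ciphertext": ciphertext, "access_policy": access_policy}
```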
Private insights are derived by passing the data through a well-defined set of processing steps. First, unstructured raw data is analyzed by an LLM tasked with extracting the answer to a specific question, such as the category or topic of the input (“structured summarization”). Processing begins by using an open-source Gemma 3 model to classify transcripts into categories of interest. These classes are then summed to compute a histogram of topics with differentially private noise, guaranteeing that the output histogram cannot be strongly influenced by any one user. The LLM’s prompt can be changed frequently, because the DP guarantee applies to the aggregation algorithm regardless of the LLM prompt. Even if the developer asked a question designed to single out one user, the differentially private statistics would not reveal it.
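The aggregation step can be illustrated with a minimal differentially private histogram in Python. This is a sketch of the general technique under the assumption of one label per user, not the auto-tuned aggregator used in production:

```python
import collections
import numpy as np

def dp_topic_histogram(label_per_user: dict, categories: list, epsilon: float = 1.0) -> dict:
    """Sketch of a user-level epsilon-DP histogram.

    Each user contributes exactly one label, so adding or removing any single
    user changes one bucket by at most 1 (L1 sensitivity of 1). Laplace noise
    with scale 1/epsilon on every bucket therefore yields an epsilon-DP release.
    """
    counts = collections.Counter(label_per_user.values())
    rng = np.random.default_rng()
    return {c: counts.get(c, 0) + rng.laplace(scale=1.0 / epsilon) for c in categories}

# Example with hypothetical labels emitted by the "data expert" LLM.
labels = {"user_a": "Business meeting", "user_b": "Note to self", "user_c": "Business meeting"}
print(dp_topic_histogram(labels, ["Business meeting", "Note to self", "Reminder"], epsilon=1.0))
```

Because the noise is calibrated to the aggregation algorithm rather than to any particular prompt, the same release mechanism covers whatever question the "data expert" LLM is asked.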
All privacy-relevant parts of our system are open source and reproducibly buildable — from the private aggregation algorithm to the TEE stack — and the LLM itself is also open source. The signatures of the workflows used to analyze the data are also public. When combined with TEEs' ability to attest to the state of the system running the software, every part of the data processing pipeline can be verifiably linked to published code. This provides external parties the ability to verify our privacy claims. This commitment to end-to-end verifiability is how the system makes progress toward being provable — we anchor on this capability, allowing third parties to inspect the open-source code and confirm that it is exactly the code we claim to run, thereby proving to clients that this is the only code their data will be processed with, subject to known weaknesses in current-generation TEEs.
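As a rough illustration of what an external verifier might do with these artifacts, the sketch below checks that every attested workload measurement appears both in the public transparency log and among binaries rebuilt from the published sources. The function names are hypothetical, not the Oak or Rekor client APIs:

```python
import hashlib

def rebuild_digest(published_sources: bytes) -> bytes:
    # Stand-in for a reproducible build: in practice the verifier rebuilds the
    # published sources and hashes the resulting binary image.
    return hashlib.sha256(published_sources).digest()

def verify_pipeline(attested_digests: list, rekor_entries: set, rebuilt_digests: set) -> bool:
    # Every workload attested by the TEEs must (a) appear in the public
    # transparency log and (b) match a binary rebuilt from the open-source code.
    return all(d in rekor_entries and d in rebuilt_digests for d in attested_digests)
```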
In short, provably private insights can be generated by an LLM-powered structured summarization workflow in confidential federated analytics. The combination of structured summarization with differentially private histogram generation enables deeper understanding into how the GenAI tools are used in the real world, all while guaranteeing privacy. Technical details of the system can be found in the whitepaper.
How provably private insights are used in Recorder
Google’s Recorder app on Pixel offers powerful AI features, such as transcription, summarization, and speaker labeling. A key challenge for the application developers is to understand how users interact with these features. For instance, are users creating "Notes to self," "Reminders," or recording "Business meetings"? Traditional count-based analytics are insufficient to analyze such data without the help of structured summarization or another form of classification. In a traditional setting, a system would log these transcripts to a central server for classification, and then run (differentially) private count queries on the results. PPI operates in a similar way but without the risk of data being used for any other purpose.
In the Recorder application, a subset of transcripts (from users who have enabled “Improve for everyone” in settings) are encrypted with a public key managed by a central TEE-hosted keystore protected via Google’s Project Oak attestation stack running on AMD Secure Encrypted Virtualization-Secure Nested Paging (SEV-SNP) CPUs. The keystore ensures that the uploaded data can be decrypted only by pre-approved processing steps, which are themselves attested to be running the expected code in TEEs. A Gemma 3 4B model running within the AMD SEV-SNP TEE classifies the transcripts into topics, which are then aggregated with differential privacy. External parties can verify that raw transcripts never leave the secure environment of the TEE, and only private sums of the summarized output categories are released to Google.
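A sketch of what the classification step could look like is shown below. The category list, prompt wording, and the injected generate callable are illustrative assumptions; the deployed system runs Gemma 3 4B inside the TEE with its own configuration:

```python
CATEGORIES = ["Business meeting", "Interview", "Lecture",
              "Note to self", "Reminder", "Other"]

def build_prompt(transcript: str) -> str:
    # Ask the "data expert" model one narrow question and constrain the answer
    # to a fixed label set so the output can be aggregated directly.
    return (
        "You are classifying a voice-recording transcript.\n"
        f"Choose exactly one category from: {', '.join(CATEGORIES)}.\n"
        "Answer with the category name only.\n\n"
        f"Transcript:\n{transcript}\n\nCategory:"
    )

def classify(transcript: str, generate) -> str:
    # `generate` is whatever text-generation call the TEE-hosted Gemma model
    # exposes; it is injected here because the inference API is deployment-specific.
    answer = generate(build_prompt(transcript)).strip()
    return answer if answer in CATEGORIES else "Other"
```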
PPI can also help evaluate the performance of on-device GenAI features, such as the accuracy of summaries generated by Recorder. Instead of relying solely on synthetic data, which may not accurately represent real-world use, CFA can run an LLM auto-rater as a part of the structured summarization component. This auto-rater LLM also resides within the TEE and can assess the results of the on-device model, ensuring a more accurate and privacy-preserving evaluation. This allows developers to fine-tune the on-device model based on real user interactions without compromising individual privacy.
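The auto-rater can be sketched the same way: a second prompt asks the in-TEE model to score the on-device summary against the transcript, and the scores are then released only through the same DP aggregation. The rubric and the 1-to-5 scale below are assumptions for illustration:

```python
def build_rating_prompt(transcript: str, on_device_summary: str) -> str:
    return (
        "Rate how faithfully the summary reflects the transcript on a scale of "
        "1 (misleading) to 5 (fully accurate). Answer with a single digit.\n\n"
        f"Transcript:\n{transcript}\n\nSummary:\n{on_device_summary}\n\nScore:"
    )

def rate_summary(transcript: str, summary: str, generate) -> int:
    reply = generate(build_rating_prompt(transcript, summary)).strip()
    score = int(reply[0]) if reply[:1].isdigit() else 3  # midpoint fallback
    # Per-user scores are bucketed and released only through the same
    # differentially private histogram aggregator as the topic labels.
    return min(max(score, 1), 5)
```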
The configuration we’re running in Recorder is available in our GitHub repository, which can be connected to the specific code paths and privacy guarantees by following these instructions. The Recorder configuration guarantees that whatever LLM query is run, it is passed through the auto-tuned DP histogram aggregator with strict privacy guarantees (user-level ε = 1 for the Recorder deployment).
What’s next?
This work demonstrates that provably private insights are possible: real-world GenAI tool use is analyzed with LLMs and then aggregated into differentially private statistics, all with full transparency into the server-side processing steps. Every step of the insight generation process has been designed to offer state-of-the-art data isolation and anonymization, and external verifiers can check the source code of the methods and the proof that we run them.
Moreover, we’ve shared LLM-powered structured summarization as a first application. We expect others, including differentially private clustering and synthetic data generation, to follow, all with the same level of verifiability and confidentiality. And with future work to enable confidential use of higher-throughput accelerators such as Google TPUs, richer analyses will become possible, including detailed transcript analysis and auto-rating. Insight generation is now possible without exposing sensitive user data outside of the confidential computation boundary, and with strong user-level DP guarantees for generated insights. We are excited that the technology for provably private insights is maturing just as GenAI tools are beginning to be applied to on-device and sensitive-data experiences.
Acknowledgements
We thank the teams at Google that helped with algorithm design, infrastructure implementation, and production maintenance of this system, in particular teams led by Marco Gruteser, Peter Kairouz, and Timon Van Overveldt, with product manager Prem Eruvbetine, including: Albert Cheu, Brett McLarnon, Chunxiang (Jake) Zheng, Edo Roth, Emily Glanz, Grace Ni, James Bell-Clark, Katharine Daly, Krzysztof Ostrowski, Maya Spivak, Mira Holford, Nova Fallen, Rakshita Tandon, Ren Yi, Stanislav Chiknavaryan, Stefan Dierauf, Steve He, and Zoe Gong. We also thank close partners who supported this system through technologies and the Recorder integration, including: Allen Su, Austin Hsu, Console Chen, Daniel Minare Ho, Dennis Cheng, Jan Wassenberg, Kristi Bradford, Ling Li, Mina Askari, Miranda Huang, Tam Le, Yao-Nan Chen, and Zhimin Yao. This work was supported by Corinna Cortes, Jay Yagnik, Ramanan Rajeswaran, Seang Chau, and Yossi Matias. We additionally thank Peter Kairouz, Marco Gruteser, Mark Simborg, and Kimberly Schwede for feedback and contributions to the writing of this post.