
VaultGemma: The world's most capable differentially private LLM


Source: https://research.google/blog/vaultgemma-the-worlds-most-capable-differentially-private-llm/

Summary:

On September 12, 2025, Google Research released VaultGemma, the most capable large language model to date trained from scratch with differential privacy. At one billion parameters, it is the largest and most capable open differentially private language model currently available, and its weights and technical report have been published on Hugging Face and Kaggle.

In the accompanying study "Scaling Laws for Differentially Private Language Models", conducted with Google DeepMind, the team systematically characterizes the trade-offs between compute, privacy, and model utility. The work shows that differential privacy changes conventional scaling behavior: keeping training stable requires substantially larger batch sizes and higher compute costs. The team introduces the "noise-batch ratio" as the central quantity and uses it to derive models that accurately predict training loss.

VaultGemma is built on the Gemma 2 architecture and trained with a sequence-level differential privacy guarantee (ε ≤ 2.0, δ ≤ 1.1e-10). On standard academic benchmarks, its performance is comparable to that of non-private models from roughly five years ago (such as GPT-2). Empirical tests found no memorization of training data, confirming the effectiveness of DP training.

Although a utility gap remains between today's DP-trained models and their non-private counterparts, the research provides an important technical roadmap for privacy-preserving AI. The Google team says it will continue to narrow this gap systematically through research on mechanism design, working toward a next generation of AI that is both powerful and privacy-respecting.

English source:

VaultGemma: The world's most capable differentially private LLM
September 12, 2025
Amer Sinha, Software Engineer, and Ryan McKenna, Research Scientist, Google Research
We introduce VaultGemma, the most capable model trained from scratch with differential privacy.
As AI becomes more integrated into our lives, building it with privacy at its core is a critical frontier for the field. Differential privacy (DP) offers a mathematically robust solution by adding calibrated noise to prevent memorization. However, applying DP to LLMs introduces trade-offs. Understanding these trade-offs is crucial. Applying DP noise alters traditional scaling laws — rules describing performance dynamics — by reducing training stability (the model's ability to learn consistently without experiencing catastrophic events like loss spikes or divergence) and significantly increasing batch size (a collection of input prompts sent to the model simultaneously for processing) and computation costs.
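For concreteness, here is a minimal NumPy sketch of the DP-SGD step that adds this calibrated noise: per-example gradients are clipped to a fixed L2 norm, and Gaussian noise scaled to that norm is added before averaging. This is a generic illustration of the mechanism, not VaultGemma's training code, and all names and shapes below are made up for the example.

```python
import numpy as np

def dp_sgd_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """per_example_grads: array of shape (batch_size, num_params)."""
    # Clip each example's gradient so no single example can dominate the update.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    # Add Gaussian noise calibrated to the clipping norm, then average over the batch.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=per_example_grads.shape[1])
    return (clipped.sum(axis=0) + noise) / per_example_grads.shape[0]

rng = np.random.default_rng(0)
toy_grads = rng.normal(size=(32, 8))  # a toy batch: 32 examples, 8 parameters
print(dp_sgd_gradient(toy_grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng))
```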
Our new research, “Scaling Laws for Differentially Private Language Models”, conducted in partnership with Google DeepMind, establishes laws that accurately model these intricacies, providing a complete picture of the compute-privacy-utility trade-offs. Guided by this research, we’re excited to introduce VaultGemma, the largest (1B-parameters), open model trained from scratch with differential privacy. We are releasing the weights on Hugging Face and Kaggle, alongside a technical report, to advance the development of the next generation of private AI.
Understanding the scaling laws
With a carefully thought-out experimental methodology, we aimed to quantify the benefit of increasing model sizes, batch sizes, and iterations in the context of DP training. Our work required making some simplifying assumptions to overcome the exponential number of combinations one might consider trying. We assumed that how well the model learns depends mostly on the "noise-batch ratio” which compares the amount of random noise we add for privacy to the size of the data groups (batches) we use for training. This assumption works because the privacy noise we add is much greater than any natural randomness that comes from sampling the data.
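As a rough illustration of that quantity, the sketch below assumes a simplified definition of the noise-batch ratio as the DP noise level divided by the batch size; the numbers are arbitrary and only meant to show how the ratio moves.

```python
def noise_batch_ratio(noise_multiplier: float, batch_size: int) -> float:
    # Assumed simplification for illustration: noise level over batch size.
    return noise_multiplier / batch_size

# At a fixed noise level, growing the batch shrinks the ratio, and with it the
# effective noise the model sees, which is why large batches matter in DP training.
for batch_size in (1_024, 16_384, 262_144):
    print(batch_size, noise_batch_ratio(1.0, batch_size))
```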
To establish a DP scaling law, we conducted a comprehensive set of experiments to evaluate performance across a variety of model sizes and noise-batch ratios. The resulting empirical data, together with known deterministic relationships between other variables, allows us to answer a variety of interesting scaling-laws–style queries, such as, “For a given compute budget, privacy budget, and data budget, what is the optimal training configuration to achieve the lowest possible training loss?”
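To make this kind of query concrete, the sketch below grid-searches candidate configurations under fixed compute, privacy, and data budgets. Both the loss surface (`fitted_loss`) and the noise calibration (`calibrated_noise`) are hypothetical stand-ins, not the fitted functions or the accountant used in the paper.

```python
import math

def fitted_loss(params: float, noise_batch_ratio: float) -> float:
    # Hypothetical fitted surface: bigger models and smaller noise-batch ratios lower the loss.
    return 2.0 + 1e3 / params ** 0.3 + 40.0 * noise_batch_ratio ** 0.5

def calibrated_noise(epsilon: float, steps: int, sampling_rate: float) -> float:
    # Placeholder calibration; a real run would use a DP accountant to pick sigma.
    return 0.4 * sampling_rate * math.sqrt(steps * math.log(1e10)) / epsilon + 0.4

def best_config(flops: float, epsilon: float, tokens: float, seq_len: int = 1024):
    best = None
    for params in (1e8, 2.5e8, 5e8, 1e9, 2e9):
        for batch in (2 ** k for k in range(10, 21)):
            steps = int(flops / (6 * params * batch * seq_len))   # FLOPs ~ 6 * N * D
            if steps < 1 or steps * batch * seq_len > tokens:
                continue                                          # violates the data budget
            sigma = calibrated_noise(epsilon, steps, batch * seq_len / tokens)
            loss = fitted_loss(params, sigma / batch)
            if best is None or loss < best[0]:
                best = (loss, params, batch, steps)
    return best  # (predicted loss, model size, batch size, iterations)

print(best_config(flops=1e21, epsilon=2.0, tokens=1e11))
```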
Key findings: A powerful synergy
Before diving into the full scaling laws, it’s useful to understand the dynamics and synergies between the compute budget, privacy budget, and data budget from a privacy accounting perspective — i.e., understand how these factors influence the noise-batch ratio for a fixed model size and number of iterations. This analysis is significantly cheaper to do as it does not require any model training, yet it yields a number of useful insights. For instance, increasing the privacy budget in isolation leads to diminishing returns, unless coupled with a corresponding increase in either the compute budget (FLOPs) or data budget (tokens).
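A sweep of this kind needs nothing but a privacy accountant. The sketch below assumes Google's open-source `dp-accounting` package and its RDP accountant API; the sampling rate, step count, and noise levels are arbitrary illustrative values.

```python
import dp_accounting
from dp_accounting import rdp

def epsilon_for(noise_multiplier: float, sampling_rate: float,
                steps: int, delta: float = 1.1e-10) -> float:
    """Privacy spent by DP-SGD with Poisson sampling, via an RDP accountant."""
    accountant = rdp.RdpAccountant()
    event = dp_accounting.SelfComposedDpEvent(
        dp_accounting.PoissonSampledDpEvent(
            sampling_rate, dp_accounting.GaussianDpEvent(noise_multiplier)),
        steps)
    accountant.compose(event)
    return accountant.get_epsilon(delta)

# Fixed batch size (sampling rate) and iteration count; sweep only the noise level
# to see how the privacy budget trades off against the noise-batch ratio.
for sigma in (0.6, 0.8, 1.0, 1.5, 2.0):
    print(f"sigma={sigma:.1f}  epsilon={epsilon_for(sigma, 1e-3, 10_000):.2f}")
```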
To explore this synergy further, the visualization below shows how the optimal training configuration changes based on different constraints. As the privacy and compute budgets change, notice how the recommendation shifts between investing in a larger model versus training with larger batch sizes or more iterations.
This data provides a wealth of useful insights for practitioners. While all the insights are reported in the paper, a key finding is that one should train a much smaller model with a much larger batch size than would be used without DP. This general insight should be unsurprising to a DP expert given the importance of large batch sizes. While this general insight holds across many settings, the optimal training configurations do change with the privacy and data budgets. Understanding the exact trade-off is crucial to ensure that both the compute and privacy budgets are used judiciously in real training scenarios. The above visualizations also reveal that there is often wiggle room in the training configurations — i.e., a range of model sizes might provide very similar utility if paired with the correct number of iterations and/or batch size.
Applying the scaling laws to build VaultGemma
The Gemma models are designed with responsibility and safety at their core. This makes them a natural foundation for developing a production-quality, DP-trained model like VaultGemma.
Algorithmic advancements: Training at scale
The scaling laws we derived above represent an important first step towards training a useful Gemma model with DP. We used the scaling laws to determine both how much compute we needed to train a compute-optimal 1B parameter Gemma 2-based model with DP, and how to allocate that compute among batch size, iterations, and sequence length to achieve the best utility.
One prominent gap between the research underlying the scaling laws and the actual training of VaultGemma was our handling of Poisson sampling, which is a central component of DP-SGD. We initially used a straightforward method of loading data in uniform batches but then switched to Poisson sampling to get the best privacy guarantees with the least amount of noise. This method posed two main challenges: it created batches of different sizes, and it required a specific, randomized order for processing the data. We solved this by using our recent work on Scalable DP-SGD, which allows us to process data in fixed-size batches — either by adding extra padding or trimming them — while still maintaining strong privacy protections.
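The sketch below illustrates only the fixed-size-batch idea (Poisson-sample a batch, then pad or trim it to a constant size); it is not the Scalable DP-SGD implementation and omits the weighting and privacy analysis that make the real method sound.

```python
import numpy as np

PAD = -1  # marker for dummy examples that must receive zero weight in the update

def poisson_fixed_size_batch(example_ids: np.ndarray, q: float,
                             target_size: int, rng: np.random.Generator) -> np.ndarray:
    # Poisson sampling: every example joins the batch independently with probability q.
    sampled = example_ids[rng.random(example_ids.shape[0]) < q]
    sampled = rng.permutation(sampled)
    if sampled.shape[0] >= target_size:
        return sampled[:target_size]                  # trim oversized batches
    pad = np.full(target_size - sampled.shape[0], PAD)
    return np.concatenate([sampled, pad])             # pad undersized batches

rng = np.random.default_rng(0)
ids = np.arange(100_000)
batch = poisson_fixed_size_batch(ids, q=0.01, target_size=1_024, rng=rng)
print(batch.shape, int((batch == PAD).sum()), "padded slots")
```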
Results
Armed with our new scaling laws and advanced training algorithms, we built VaultGemma, to date the largest (1B-parameters) open model fully pre-trained with differential privacy, using an approach that can yield high-utility models.
From training VaultGemma, we found our scaling laws to be highly accurate. The final training loss of VaultGemma was remarkably close to what our equations predicted, validating our research and providing the community with a reliable roadmap for future private model development.
We also compare downstream performance of our model against its non-private counterpart across a range of standard academic benchmarks (i.e., HellaSwag, BoolQ, PIQA, SocialIQA, TriviaQA, ARC-C, ARC-E). To put this performance in perspective and quantify the current resource investment required for privacy, we also include a comparison to an older similar-sized GPT-2 model, which performs similarly on these benchmarks. This comparison illustrates that today’s private training methods produce models with utility comparable to that of non-private models from roughly 5 years ago, highlighting the important gap our work will help the community systematically close.
Finally, the model comes with strong theoretical and empirical privacy protections.
Formal privacy guarantee
In general, both the privacy parameters (ε, δ) and the privacy unit are important considerations when doing DP training, as these together determine what the trained model can learn. VaultGemma was trained with a sequence-level DP guarantee of (ε ≤ 2.0, δ ≤ 1.1e-10), where a sequence consists of 1024 consecutive tokens extracted from heterogeneous data sources. Specifically, we used the same training mixture that was used to train the Gemma 2 model, consisting of a number of documents of varying lengths. During pre-processing, long documents are split up and tokenized into multiple sequences, and shorter documents are packed together into a single sequence. While the sequence-level privacy unit was a natural choice for our training mixture, in situations where there is a clear mapping between data and users, user-level differential privacy would be a better choice.
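A simplified sketch of this kind of pre-processing is shown below: tokenized documents are split into chunks of at most 1024 tokens and short pieces are greedily packed together. Real pipelines also handle padding, attention masks, and document boundaries, which are omitted here.

```python
from typing import Iterable, List

def pack_into_sequences(tokenized_docs: Iterable[List[int]], seq_len: int = 1024) -> List[List[int]]:
    sequences, current = [], []
    for doc in tokenized_docs:
        # Long documents are split across multiple sequences.
        for start in range(0, len(doc), seq_len):
            chunk = doc[start:start + seq_len]
            if len(current) + len(chunk) > seq_len:
                sequences.append(current)
                current = []
            current.extend(chunk)  # short pieces get packed into a shared sequence
    if current:
        sequences.append(current)
    return sequences

docs = [[1] * 2500, [2] * 300, [3] * 600, [4] * 200]   # toy token ids
print([len(s) for s in pack_into_sequences(docs)])
```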
What does this mean in practice? Informally speaking, because we provide protection at the sequence level, if information relating to any (potentially private) fact or inference occurs in a single sequence, then VaultGemma essentially does not know that fact: the response to any query will be statistically similar to the result from a model that never trained on the sequence in question. However, if many training sequences contain information relevant to a particular fact, then in general VaultGemma will be able to provide that information.
Empirical memorization
Sequence-level DP provably bounds the influence of any single training sequence (example) on the final model. We prompted the model with a 50-token prefix from a training document to see if it would generate the corresponding 50-token suffix. VaultGemma 1B shows no detectable memorization of its training data and successfully demonstrates the efficacy of DP training.
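Schematically, the probe looks like the sketch below: prompt with the first 50 tokens of a training snippet and check whether greedy decoding reproduces the next 50. The `generate_greedy` callable is a placeholder for whatever inference API is in use, not a real VaultGemma interface.

```python
from typing import Callable, List, Sequence

def memorization_rate(snippets: Sequence[List[int]],
                      generate_greedy: Callable[[List[int], int], List[int]],
                      prefix_len: int = 50, suffix_len: int = 50) -> float:
    hits = 0
    for tokens in snippets:
        prefix = tokens[:prefix_len]
        true_suffix = tokens[prefix_len:prefix_len + suffix_len]
        continuation = generate_greedy(prefix, suffix_len)
        hits += int(continuation == true_suffix)      # exact 50-token reproduction
    return hits / len(snippets)

# Toy usage with a stub "model" that never reproduces its training data.
snippets = [list(range(i, i + 100)) for i in range(0, 1_000, 100)]
stub = lambda prefix, n: [0] * n
print(memorization_rate(snippets, stub))   # 0.0 means no detected memorization
```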
Conclusion
VaultGemma represents a significant step forward in the journey toward building AI that is both powerful and private by design. By developing and applying a new, robust understanding of the scaling laws for DP, we have successfully trained and released the largest open, DP-trained language model to date.
While a utility gap still exists between DP-trained and non-DP–trained models, we believe this gap can be systematically narrowed with more research on mechanism design for DP training. We hope that VaultGemma and our accompanying research will empower the community to build the next generation of safe, responsible, and private AI for everyone.
Acknowledgements
We'd like to thank the entire Gemma and Google Privacy teams for their contributions and support throughout this project, in particular, Peter Kairouz, Brendan McMahan and Dan Ramage for feedback on the blog post, Mark Simborg and Kimberly Schwede for help with visualizations, and the teams at Google that helped with algorithm design, infrastructure implementation, and production maintenance. The following people directly contributed to the work presented here (ordered alphabetically): Borja Balle, Zachary Charles, Christopher A. Choquette-Choo, Lynn Chua, Prem Eruvbetine, Badih Ghazi, Steve He, Yangsibo Huang, Armand Joulin, George Kaissis, Pritish Kamath, Ravi Kumar, Daogao Liu, Ruibo Liu, Pasin Manurangsi, Thomas Mesnard, Andreas Terzis, Tris Warkentin, Da Yu, and Chiyuan Zhang.
