IBM Releases Its Smallest AI Model Yet


Source: https://aibusiness.com/foundation-models/ibm-releases-smallest-model-nano

Summary:

IBM Releases Granite 4.0 Nano, Its Smallest AI Models Yet, Bringing Edge Intelligence with Under 2 Billion Parameters

IBM has officially launched the Granite 4.0 Nano series, its most lightweight AI models to date, ranging from 350 million to just under 2 billion parameters. The family spans four model sizes built on both a hybrid state-space model (SSM) architecture and a pure transformer architecture, and is now available on Hugging Face under the Apache 2.0 open-source license.

In contrast to the industry trend toward ever-larger parameter counts, IBM's compact design reflects its position that a model's intelligence does not scale directly with its size. In the announcement, the research team stressed that the models are built for edge and on-device scenarios and can be deployed locally on a laptop, with no dependence on cloud infrastructure.

In practical terms, the models handle low- to moderate-complexity tasks such as document summarization, information extraction, classification, and lightweight retrieval-augmented generation (RAG). They are not intended for highly complex workloads, but they can reliably process real-time workloads in production environments. Benchmark results show them outperforming several competing models on general knowledge, mathematical reasoning, code generation, and safety.

Notably, all of the models carry IBM's ISO 42001 certification for responsible AI development and are natively compatible with mainstream inference frameworks such as vLLM and llama.cpp. The company also disclosed that larger Granite 4.0 models are in training, signaling continued investment in the compact-model line.


Original article:

Granite 4.0 Nano is designed for edge and on-device use cases.
IBM has confirmed the arrival of Granite 4.0 Nano, its smallest AI model yet.
The release is IBM's latest effort to demonstrate that model size does not necessarily equate to greater intelligence, and that sheer scale alone is not a dominating factor. Granite 4.0 Nano tops out at only about 1 billion parameters, dwarfed by offerings from the likes of OpenAI and Google.
Confirming the launch of Granite 4.0 Nano, IBM's Kate Soule and Rameswar Panda said on Hugging Face that the models were designed for edge and on-device applications, adding that they represent "IBM's continued commitment to develop powerful, useful, models that don't require hundreds of billions of parameters to get the job done."
The family comprises models in two sizes, 350 million and about 1.5 billion parameters, each available in a hybrid state-space model (SSM) architecture as well as in a pure transformer variant.
IBM's line-up available on Hugging Face includes four instruct models and their base-model counterparts. They are as follows:
Granite 4.0 H 1B (about 1.5 billion parameters): a dense large language model (LLM) featuring a hybrid SSM-based architecture.
Granite 4.0 H 350M (about 350 million parameters): a dense LLM featuring a hybrid SSM-based architecture.
Granite 4.0 1B: a transformer-based variant, designed to enable workloads where hybrid architectures may not yet have optimized support, such as llama.cpp.
Granite 4.0 350M: a transformer-based variant, as with the 1B model.
Although IBM subsequently said on Reddit that the non-hybrid 1 billion variant is closer to 2 billion parameters, it "opted to keep the naming aligned to the hybrid variant to make the connection easily visible."
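For a sense of how these models would be used in practice, below is a minimal local-inference sketch using the Hugging Face transformers library. This is not IBM's documented quick-start: the repository ID is an assumption pieced together from the names above (the actual Hugging Face IDs may differ), and an instruct variant with a chat template is presumed.

```python
# Minimal local-inference sketch with Hugging Face transformers.
# The repository ID below is assumed from IBM's naming and may differ.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-4.0-350m"  # assumed repo name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Small enough to run on a laptop, per IBM's claims.
messages = [{"role": "user", "content": "Summarize in one sentence: IBM has "
             "released Granite 4.0 Nano, its smallest model family yet."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```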
The models are natively compatible with vLLM, llama.cpp, and MLX, and are released under the Apache 2.0 license, making them usable by independent developers and enterprises alike and suitable for commercial deployment.
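As a concrete illustration of that compatibility, here is a hedged sketch of offline batch inference through vLLM's Python API. The model ID is again an assumption based on IBM's naming, not a confirmed repository.

```python
# Hedged sketch: offline batch inference through vLLM's Python API.
# The model ID is assumed from IBM's Hugging Face naming.
from vllm import LLM, SamplingParams

llm = LLM(model="ibm-granite/granite-4.0-h-1b")  # assumed repo name
params = SamplingParams(temperature=0.2, max_tokens=128)

prompts = ["Classify the sentiment of: 'The Nano release looks promising.'"]
for out in llm.generate(prompts, params):
    print(out.outputs[0].text)
```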
The models can be run locally -- the smallest ones on a laptop -- and none of them rely on cloud architecture.
IBM added on Reddit: "We developed the Nano models specifically for the edge, on-device applications, and latency-sensitive use cases. Within that bucket, the models will perform well for tasks like document summarization/extraction, classification, lightweight RAG, and function/tool calling.
"While they aren't intended for highly complex tasks, they can comfortably handle real-time, moderate-complexity workloads in production environments."
They also come with IBM's ISO 42001 certification for responsible AI development.
IBM said initial performance testing showed the models stack up strongly against competitors including Alibaba (Qwen), LiquidAI (LFM) and Google (Gemma).
The company claimed "a significant increase in capabilities that can be achieved with a minimal parameter footprint" across benchmarks encompassing the General Knowledge, Math, Code and Safety domains.
And it seems there is more to come, with IBM noting that a larger model is being trained as part of the Granite 4.0 family.