Google open-sources diffusion language model DiffusionGemma-26B-A4B-it, with generation speed exceeding 500 tokens/second
Summary
Google has released DiffusionGemma-26B-A4B-it under the Apache 2.0 license. The model builds on the company's earlier experimental Gemini Diffusion. It is openly available on Hugging Face, and NVIDIA offers free access through its NIM cloud API, where generation speed exceeds 500 tokens per second. In one test, the model generated 2,409 tokens in 4.4 seconds, which works out to about 547.5 tokens per second.
Key points
Google releases DiffusionGemma-26B-A4B-it, an open-weight diffusion language model, under the Apache 2.0 license.
The model builds on Google's earlier experimental Gemini Diffusion and is now publicly available.
NVIDIA offers free access via its NIM cloud API, with generation speed exceeding 500 tokens per second.
One test generated 2,409 tokens in 4.4 seconds, demonstrating high-speed text generation.
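
The reported test figures can be checked against the "over 500 tokens per second" claim with simple arithmetic. The sketch below is only an illustration: the helper name `tokens_per_second` is hypothetical, and the inputs are the two numbers quoted above (2,409 tokens, 4.4 seconds), not a new measurement.

```python
def tokens_per_second(token_count: int, elapsed_seconds: float) -> float:
    """Return average generation throughput in tokens per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return token_count / elapsed_seconds


# Figures quoted in the summary: 2,409 tokens generated in 4.4 seconds.
rate = tokens_per_second(2409, 4.4)
print(f"{rate:.1f} tokens/second")

# Check the headline claim of more than 500 tokens per second.
print("exceeds 500 tokens/second:", rate > 500)
```

This is an end-to-end average over a single run, so it says nothing about time to first token or variation between requests.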