Releases | Source: DEEPMIND BLOG | June 9, 2026 | Importance: 4/5

Introducing Gemma 4 12B: a unified, encoder-free multimodal model

English summary

Google DeepMind released Gemma 4 12B, a 12-billion-parameter open multimodal model. Its unified architecture handles text and images without a separate vision encoder. It is part of the Gemma family of open models. The announcement highlights the encoder-free design but gives no further performance or capability details.

Key points

Gemma 4 12B is a 12-billion-parameter multimodal model from Google DeepMind.
It uses an encoder-free architecture: text and images are processed by one unified model, with no separate vision encoder.
It is released as part of the Gemma family of open models.
