Tutorials | Source: MARKTECHPOST | June 10, 2026 | Importance: 4/5

Google Releases Gemini 3.5 Live Translate, a Streaming Speech-to-Speech Audio Model Covering 70+ Languages Across Meet, Translate, and the Live API

English summary

Google announced Gemini 3.5 Live Translate, a dedicated speech-to-speech audio model that continuously translates spoken audio into 70+ languages while preserving the speaker's intonation and pacing. Unlike turn-based agents, it processes audio as a stream and produces translated speech a few seconds behind the speaker. Developers configure it through the Gemini Live API by supplying a translationConfig with a BCP-47 target language code; the model accepts only raw 16-bit, 16 kHz PCM audio as input and outputs 24 kHz audio. It is rolling out in public preview on the Live API and Google AI Studio and in private preview in Google Meet (expanding from 5 to 70+ languages), and it will launch in the Google Translate app on Android and iOS. All generated audio is watermarked with SynthID so that it can be detected.
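The input format above (raw 16-bit, 16 kHz PCM) can be illustrated with a minimal sketch that converts floating-point samples to PCM bytes and splits them into fixed-duration chunks for streaming. The little-endian byte order, the single (mono) channel, and the 100 ms chunk size are assumptions for illustration; the summary does not specify them, and the helper names are hypothetical rather than part of any Google SDK.

```python
import struct

SAMPLE_RATE_HZ = 16_000   # input sample rate stated in the summary
BYTES_PER_SAMPLE = 2      # 16-bit PCM
CHUNK_MS = 100            # assumption: chunk duration is an arbitrary choice


def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to signed 16-bit PCM bytes.

    Assumes mono audio and little-endian byte order. Values outside
    the range are clipped before scaling.
    """
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    return struct.pack(f"<{len(ints)}h", *ints)


def chunk_pcm(pcm, chunk_ms=CHUNK_MS):
    """Split PCM bytes into fixed-duration chunks; the last may be shorter."""
    chunk_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * chunk_ms // 1000
    return [pcm[i:i + chunk_bytes] for i in range(0, len(pcm), chunk_bytes)]
```

At 16 kHz and 2 bytes per sample, one second of audio is 32,000 bytes, so a 100 ms chunk holds 3,200 bytes. Audio captured at another rate or bit depth would need resampling or conversion first, since the model is described as accepting only this format.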

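Chinese summary (translated)

Google has released Gemini 3.5 Live Translate, a dedicated speech-to-speech audio model that translates spoken language into more than 70 languages in real time while preserving the speaker's intonation, pacing, and pitch. It uses continuous stream processing with a translation delay of only a few seconds, unlike turn-based interaction. Developers can configure a translationConfig through the Gemini Live API, specifying a BCP-47 target language code; input is 16 kHz, 16-bit mono PCM audio, and output is 24 kHz audio. The model is in public preview on the Live API and AI Studio, Google Meet is running an enterprise private preview (with language support growing from 5 to more than 70), and it will come to the Google Translate app on Android and iOS. All generated audio carries an embedded, imperceptible SynthID watermark.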

Key points

New streaming translation model: Gemini 3.5 Live Translate (gemini-3.5-live-translate-preview) is a dedicated audio model, not a chat assistant, optimized for continuous speech-to-speech translation.
Supports 70+ languages automatically; the generated speech mirrors the speaker's prosody and stays only a few seconds behind the speaker in real time.
Available through the Gemini Live API (with a translationConfig block) and a Google Meet private preview (expanding from 5 to 70+ languages); a launch in the Google Translate app is planned.
Technical constraints: audio only, with 16-bit 16 kHz PCM input and 24 kHz output; translation mode does not support text inputs, tool use, or system instructions.
All output audio carries an imperceptible SynthID watermark; integration partners include Agora, LiveKit, and Pipecat.
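The configuration and constraints listed above can be sketched as a plain dictionary builder plus a check for unsupported features. Only the model ID, the translationConfig name, and the 16 kHz / 24 kHz rates come from the article; every other key name here (targetLanguageCode, inputAudio, outputAudio, systemInstruction, tools, textInput) is a hypothetical placeholder, not the actual Live API schema, and the language-tag check is a simplified shape test rather than a full BCP-47 validator.

```python
import re

MODEL_ID = "gemini-3.5-live-translate-preview"  # model ID given in the article

# Simplified BCP-47 shape: language, optional script, optional region.
# This is an illustrative check, not a complete BCP-47 validator.
_BCP47_RE = re.compile(
    r"^[A-Za-z]{2,3}(-[A-Za-z]{4})?(-([A-Za-z]{2}|[0-9]{3}))?$"
)

# Hypothetical key names for features the article says are unsupported
# in translation mode (system instructions, tool use, text inputs).
UNSUPPORTED_KEYS = ("systemInstruction", "tools", "textInput")


def build_translate_setup(target_language):
    """Build an illustrative setup dict for a translation session."""
    if not _BCP47_RE.match(target_language):
        raise ValueError(f"not a BCP-47-shaped tag: {target_language!r}")
    return {
        "model": MODEL_ID,
        # "translationConfig" is named in the article; the inner key is assumed.
        "translationConfig": {"targetLanguageCode": target_language},
        # Hypothetical fields recording the audio formats the article states.
        "inputAudio": {"encoding": "pcm16", "sampleRateHz": 16000},
        "outputAudio": {"sampleRateHz": 24000},
    }


def check_translation_mode(setup):
    """Raise if the setup contains features unsupported in translation mode."""
    bad = [key for key in UNSUPPORTED_KEYS if key in setup]
    if bad:
        raise ValueError(f"unsupported in translation mode: {bad}")
    return setup
```

A call such as `check_translation_mode(build_translate_setup("es-419"))` returns the dictionary unchanged, while adding a `tools` entry to it would raise a `ValueError`. The real request shape should be taken from the official Live API documentation.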
