OpenAI WebRTC Audio Session adds GPT-Realtime-2 model and document context input
English summary
Simon Willison has updated his browser-based audio conversation tool, originally built in December 2024 to test the OpenAI WebRTC realtime audio API. It now supports the GPT-Realtime-2 model, which OpenAI promotes as its first voice model with GPT-5-class reasoning; the model has a knowledge cutoff of September 30, 2024. A new feature lets users paste in document context and then hold an interactive voice Q&A about that content. The update offers a way to experiment with the newer model, which has not yet appeared in the ChatGPT iPhone app.
Chinese summary
Simon Willison 基于浏览器的音频对话工具最初于 2024 年 12 月构建,用于测试 OpenAI WebRTC 实时音频 API,现已更新。现在支持 GPT‑Realtime‑2 模型,该模型被 OpenAI 宣传为首个具备 GPT‑5 级别推理能力的语音模型,知识截止日期为 2024 年 9 月 30 日。新增功能允许用户粘贴文档上下文,从而针对提供的内容进行交互式语音问答。此次更新在该模型尚未出现在 ChatGPT iPhone 应用之际,让用户能够实验该新模型。
Key points
The tool was first built in December 2024 to experiment with OpenAI's realtime WebRTC API.
该工具最初于 2024 年 12 月构建,用于实验 OpenAI 的实时 WebRTC API。
The GPT-Realtime-2 model, with GPT-5-class reasoning and a September 30, 2024 knowledge cutoff, is now selectable.
现在可以选择 GPT‑Realtime‑2 模型,该模型具备 GPT‑5 级别推理能力,知识截止至 2024 年 9 月 30 日。
Users can paste document context to have an interactive audio conversation about that content.
用户可以粘贴文档上下文,从而针对该内容进行交互式音频对话。
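The document-context feature can be pictured as folding the pasted text into the session's instructions before the conversation starts. The sketch below is a minimal illustration under stated assumptions, not the tool's actual code: the source does not say how the context is passed to the model. The helper names (`build_instructions`, `build_session_update`), the `<document>` markers, and the character cap are hypothetical, and the event shape follows the Realtime API's `session.update` event as generally documented.

```python
import json

# Hypothetical cap on pasted text; the real tool's limit, if any, is not stated.
MAX_CONTEXT_CHARS = 20_000

BASE_PROMPT = "You are a helpful voice assistant."


def build_instructions(document: str, max_chars: int = MAX_CONTEXT_CHARS) -> str:
    """Combine a base prompt with pasted document text (illustrative only)."""
    text = document.strip()
    if not text:
        # No document pasted: fall back to the plain prompt.
        return BASE_PROMPT
    # Truncate overly long documents so the instructions stay bounded.
    text = text[:max_chars]
    return (
        BASE_PROMPT
        + " Answer questions about the document between the markers below.\n"
        + "<document>\n"
        + text
        + "\n</document>"
    )


def build_session_update(document: str) -> str:
    """Serialize a session.update event carrying the instructions.

    Assumption: the context travels in the session's `instructions` field.
    """
    event = {
        "type": "session.update",
        "session": {"instructions": build_instructions(document)},
    }
    return json.dumps(event)
```

In a browser client, a payload like this would typically be sent over the WebRTC data channel once the connection opens, after which spoken questions are answered against the pasted text.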