
OpenAI WebRTC Audio Session, now with document context

12th June 2026 - Link Blog

OpenAI WebRTC Audio Session, now with document context. I built the first version of this tool in December 2024 to try out the then-new OpenAI WebRTC API for interacting with their realtime audio models.

Last month OpenAI introduced a brand new model to that API called GPT‑Realtime‑2, which they promoted as "our first voice model with GPT‑5‑class reasoning" - with a Sep 30, 2024 knowledge cut-off.

I've been waiting for that model to show up in the ChatGPT iPhone app but it still hasn't, so I revisited my old playground.

You can now pick the better model, and you can also paste in a big chunk of document context so you can have an audio conversation in your browser about whatever information you think would be useful to explore in a conversational way.
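The post doesn't say how the pasted document reaches the model, so here is a minimal sketch of one plausible approach: folding the text into the session instructions and wrapping it in a `session.update` client event. The helper names (`buildInstructions`, `buildSessionUpdate`), the base prompt, the `<document>` delimiters and the character cap are all hypothetical and are not taken from the tool's source.

```typescript
// Illustrative sketch only: names, prompt wording and the cap are assumptions,
// not the tool's actual implementation.
const MAX_CONTEXT_CHARS = 200_000; // arbitrary cap for illustration, not an API limit

interface SessionUpdateEvent {
  type: "session.update";
  session: { instructions: string };
}

// Combine a base prompt with the optional pasted document.
function buildInstructions(documentContext: string): string {
  const base = "You are a helpful voice assistant.";
  const doc = documentContext.trim();
  if (doc.length === 0) {
    // No document pasted: fall back to the plain prompt.
    return base;
  }
  const clipped = doc.slice(0, MAX_CONTEXT_CHARS);
  return [
    base,
    "The user has pasted the document below. Use it to answer their questions.",
    "<document>",
    clipped,
    "</document>",
  ].join("\n");
}

// Wrap the instructions in a session.update event payload.
function buildSessionUpdate(documentContext: string): SessionUpdateEvent {
  return {
    type: "session.update",
    session: { instructions: buildInstructions(documentContext) },
  };
}
```

In a browser, an event like this would typically be serialized with `JSON.stringify` and sent over the WebRTC data channel once the session is established. The exact session fields the current API expects may differ from this sketch, so check the Realtime API reference before relying on it.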

Screenshot of a web interface titled "OpenAI WebRTC Audio Session" with a gray status dot. Form fields: "OpenAI API Token" showing a masked password of dots, "Voice" dropdown set to "Coral", "Model" dropdown set to "gpt-realtime-2". A collapsible section labeled "▼ Document context (optional — paste text to talk about)" with bold instruction "Paste a document here before starting the session and the model will be able to discuss it with you" above a textarea containing a pasted Markdown document about whether DuckDB can run untrusted SQL as safely as Datasette runs SQLite. Below are a blue "Start Session" button and a gray disabled "Mute Mic" button, then a green success message "Session established successfully!" At the bottom, a dark panel headed "Last transcript" reads: "DuckDB can be made about as safe as SQLite for running untrusted SELECT queries, but only if you lock it down properly. Using read only true by itself is not enough, because SQL can still" (text cut off).