OpenAI Releases GPT-realtime, a Dedicated Model for Voice AI Agents
OpenAI has released GPT-realtime, a multimodal speech model designed specifically for voice AI agents. It generates natural, fluent speech that closely mimics human intonation, emotion, and pacing, and it supports image understanding alongside voice or text interaction. The update introduces two new voices, Marin and Cedar, and enhances the eight existing voices. With advanced intelligence, reasoning, and comprehension abilities, GPT-realtime can pick up non-verbal cues, switch languages, and adjust its tone seamlessly.