OpenAI Releases GPT-realtime, a Dedicated Model for Voice AI Agents

AI Daily News updated 2m ago dongdong

63 0

OpenAI has released the speech model GPT-realtime, a multimodal model designed specifically for voice AI agents. It can generate natural and fluent speech, accurately mimicking human intonation, emotions, and pace, while also supporting image understanding combined with voice or text interactions. The update introduces two new voices—Marin and Cedar—and enhances the existing eight voices. Equipped with advanced intelligence, reasoning, and comprehension abilities, GPT-realtime can capture non-verbal cues, switch languages, and adjust tone seamlessly.

© Copyright Notice

The copyright of the article belongs to the author. Please do not reprint without permission.

Related Posts

The new AI model HRM surpasses large language models

The new AI model HRM surpasses large language models

3m ago

01280

The GPT-4o native image generation model, which is far superior to DALL·E, has finally opened its API to the public.

The GPT-4o native image generation model, which is far superior to DALL·E, has finally opened its API to the public.

6m ago

01100

The Genspark company has launched the innovative Genspark AI Browser

The Genspark company has launched the innovative Genspark AI Browser

5m ago

01110

OpenAI’s “Hotline” Feature Update: Send a text to 1-800-242-8478 to generate images

OpenAI’s “Hotline” Feature Update: Send a text to 1-800-242-8478 to generate images

4m ago

0840

No comments yet...

none

No comments yet...