OpenAI has launched three major audio models, taking speech interaction technology to a new level.

AI Daily News updated 7m ago admin


OpenAI has launched a new generation of audio models covering both speech-to-text and text-to-speech. gpt-4o-transcribe achieves a significantly lower word error rate than the existing Whisper model. gpt-4o-mini-transcribe is a smaller version of the same model that trades some capacity for faster speed and higher efficiency. gpt-4o-mini-tts introduces "steerability" for the first time: developers can instruct the model not only on what to say but also on how to say it, which lets them control the voice style.
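As a rough illustration of what controlling the voice style might look like in practice, the following minimal sketch assembles a text-to-speech request for gpt-4o-mini-tts. It assumes the official OpenAI Python SDK and its `instructions` parameter on the speech endpoint; the helper names `build_speech_request` and `synthesize`, the voice name "coral", and the output file name are illustrative choices that do not come from the article.

```python
from typing import Any


def build_speech_request(text: str, style: str, voice: str = "coral") -> dict[str, Any]:
    """Assemble keyword arguments for a text-to-speech request.

    Hypothetical helper: `style` is a free-form instruction describing how
    the text should be spoken (tone, pace, emotion).
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    return {
        "model": "gpt-4o-mini-tts",
        "voice": voice,          # assumed voice name, not from the article
        "input": text,           # what to say
        "instructions": style,   # how to say it
    }


def synthesize(text: str, style: str, out_path: str = "speech.mp3") -> None:
    """Send the request and write the audio to disk (needs network and an API key)."""
    # Imported lazily so the request builder above works without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    request = build_speech_request(text, style)
    with client.audio.speech.with_streaming_response.create(**request) as response:
        response.stream_to_file(out_path)
```

Calling `synthesize("Your order has shipped.", "Speak like a warm, upbeat support agent.")` would, under these assumptions, produce the same sentence in a different delivery than a neutral instruction would. Keeping the request builder separate from the network call makes the style instruction easy to inspect before anything is sent.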

© Copyright Notice

The copyright of the article belongs to the author. Please do not reprint without permission.

