Kimi open-sources Kimi-Audio, an audio foundation model that tops multiple benchmarks

AI Daily News updated 7m ago dongdong


The Kimi team has released Kimi-Audio, an open-source audio foundation model. It performs strongly across tasks including speech recognition, audio understanding, audio-to-text conversion, and voice dialogue, and it ranks first in overall performance across more than ten benchmarks.

On the LibriSpeech automatic speech recognition (ASR) test, Kimi-Audio achieves a word error rate (WER) of only 1.28%, significantly lower than that of other models.
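
For context, word error rate is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the model's output, divided by the number of words in the reference. The sketch below is a minimal illustration of that standard definition, not Kimi-Audio's evaluation code. The function name `word_error_rate` and the sample sentences are made up for illustration, and real evaluations usually normalize casing and punctuation before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = word-level edit distance / number of reference words.

    Illustrative helper only; splits on whitespace and applies no text
    normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (reference word missing)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substitution ("sat" -> "sit") and one deletion
# (the second "the") against a six-word reference.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"WER: {example_wer:.2%}")
```

By this definition, a WER of 1.28% corresponds to roughly 1.28 word errors per 100 reference words.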

