Kimi open-sources Kimi-Audio, an audio foundation model that tops multiple benchmarks
The Kimi team has released Kimi-Audio, an open-source audio foundation model. It performs strongly across tasks including speech recognition, audio understanding, audio-to-text conversion, and voice dialogue, and it ranks first in overall performance across more than ten benchmarks.
On the LibriSpeech automatic speech recognition (ASR) benchmark, Kimi-Audio achieves a word error rate (WER) of only 1.28%, significantly lower than that of other models.
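For readers unfamiliar with the metric: word error rate is conventionally defined as the number of word substitutions, deletions, and insertions needed to turn the model's transcript into the reference transcript, divided by the number of words in the reference. A WER of 1.28% therefore corresponds to roughly 1.28 word errors per 100 reference words. The following is a minimal sketch of that standard definition, not the evaluation code used for Kimi-Audio; the function name `word_error_rate` is hypothetical, and the whitespace tokenization is a simplifying assumption (real evaluations typically normalize casing and punctuation first).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative only: tokenizes on whitespace and applies no text
    normalization, which is an assumption of this sketch.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
    print(f"WER = {wer:.2%}")
```

Because the denominator is the reference length, WER can exceed 100% when the hypothesis contains many inserted words.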