Mistral Releases Voxtral, Its First Open-Source Speech Model, Said to Surpass Whisper
Mistral AI has released Voxtral, its first open-source speech model, in 24B and 3B parameter versions. Both are released under the Apache 2.0 license and are also available through an API. Voxtral supports eight major languages and can handle up to 30 minutes of audio for transcription or up to 40 minutes for semantic understanding tasks. It reportedly outperforms Whisper across the board, performs strongly on multilingual benchmarks, ranks first in speech translation, and matches GPT-4o-mini in speech understanding.