Xiaomi open-sources its first native end-to-end large speech model, Xiaomi-MiMo-Audio

AI Daily News updated 7d ago dongdong

Xiaomi has open-sourced Xiaomi-MiMo-Audio, its first native end-to-end large speech model. Built on a new pre-training architecture and trained on more than 100 million hours of data, the model is the first in the speech domain to achieve few-shot generalization through in-context learning (ICL), and it also demonstrates cross-modal alignment capabilities. Across multiple benchmark evaluations, Xiaomi-MiMo-Audio outperforms open-source models of comparable parameter size as well as closed-source models from Google and OpenAI.
