Xiaomi open-sources its first native end-to-end large speech model, Xiaomi-MiMo-Audio

AI Daily News updated 2m ago dongdong

Xiaomi has open-sourced Xiaomi-MiMo-Audio, its first native end-to-end large speech model. Built on a new pre-training architecture and trained on more than 100 million hours of data, the model achieves, for the first time in the speech domain, few-shot generalization through in-context learning (ICL), and it also demonstrates cross-modal alignment capabilities. In multiple benchmark evaluations, Xiaomi-MiMo-Audio outperforms open-source models of comparable parameter size as well as closed-source models from Google and OpenAI.
