Xiaomi MiMo – Xiaomi’s first open-source reasoning large language model

AI Tools updated 6h ago dongdong

What is Xiaomi MiMo?

Xiaomi MiMo is Xiaomi’s first open-source reasoning large language model (LLM), designed to enhance performance in complex reasoning tasks. The model adopts a joint pretraining and post-training approach, leveraging large-scale reasoning-rich corpora and innovative reinforcement learning algorithms to significantly boost its capabilities in mathematical reasoning and code generation.

Xiaomi reports that, despite having only 7 billion parameters, MiMo surpasses larger models such as OpenAI’s o1-mini and Alibaba’s Qwen QwQ-32B-Preview on public mathematical-reasoning and coding benchmarks.

Xiaomi MiMo includes four model variants, all open-sourced on Hugging Face:

  • MiMo-7B-Base (Pretrained model)

  • MiMo-7B-SFT (Supervised fine-tuned model)

  • MiMo-7B-RL (Reinforcement learning model, trained from the SFT model)

  • MiMo-7B-RL-Zero (Reinforcement learning model, trained directly from the base model without an SFT step)

These models provide developers with powerful tools for advanced reasoning tasks.
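As a rough illustration of how one of these checkpoints might be loaded, here is a minimal sketch using the Hugging Face transformers library. The organization name "XiaomiMiMo", the exact repository naming pattern, and the need for trust_remote_code=True are assumptions not stated in this article; check the model cards before relying on them. The helper names repo_id and generate are hypothetical.

```python
# Assumption: checkpoints are published under the "XiaomiMiMo" organization
# and follow the "MiMo-7B-<variant>" naming used in the list above.
HF_ORG = "XiaomiMiMo"
VARIANTS = ("Base", "SFT", "RL", "RL-Zero")


def repo_id(variant: str) -> str:
    """Build the (assumed) Hugging Face repository id for a MiMo variant."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}; expected one of {VARIANTS}")
    return f"{HF_ORG}/MiMo-7B-{variant}"


def generate(prompt: str, variant: str = "RL", max_new_tokens: int = 256) -> str:
    """Download the chosen checkpoint and generate a completion for the prompt."""
    # Imported lazily so repo_id() works even without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = repo_id(variant)
    tokenizer = AutoTokenizer.from_pretrained(name)
    # Assumption: the checkpoint ships custom model code, hence trust_remote_code.
    model = AutoModelForCausalLM.from_pretrained(name, trust_remote_code=True)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling generate("Prove that the sum of two even numbers is even.") would download several gigabytes of weights on first use, so it is best tried on a machine with a suitable GPU.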


Key Features of Xiaomi MiMo

  • Powerful Mathematical Reasoning
    Solves complex math problems with accurate reasoning paths and solutions.

  • Efficient Code Generation
    Produces high-quality code suitable for various programming tasks.

  • Optimized Reasoning Performance
    Achieves high reasoning efficiency through joint pretraining and post-training, outperforming larger models with just 7B parameters.


Technical Principles of Xiaomi MiMo

  • Pretraining Stage

    • Focuses on mining reasoning-rich corpora.

    • Synthesizes around 200 billion tokens of reasoning data.

    • Trains using a three-phase curriculum learning strategy with a total of 25 trillion tokens, gradually increasing task difficulty to improve model capability.

  • Post-Training Stage

    • Reinforcement Learning Algorithm: Introduces the Test Difficulty Driven Reward algorithm to mitigate sparse reward issues in hard tasks, improving performance on complex problems.

    • Data Resampling Strategy: Implements an Easy Data Re-Sampling strategy to stabilize the RL training process.

    • Efficient Training Framework: Designs a Seamless Rollout system to accelerate RL training (by 2.29×) and evaluation (by 1.96×), improving training efficiency.

  • Model Architecture Optimization
    The model is tailored for reasoning tasks, ensuring high performance with a compact parameter size.
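The two post-training ideas above can be illustrated with a small sketch. This is not Xiaomi's implementation: the article does not specify the reward formula or the sampling rule, so the code assumes that test cases are grouped by difficulty with a hypothetical weight per group, and that "easy" problems are kept in a separate pool that is sampled with a fixed probability. All function names, parameters, and weights are illustrative.

```python
import random
from typing import Dict, List, Sequence


def difficulty_weighted_reward(
    results: Dict[str, Sequence[bool]],
    weights: Dict[str, float],
) -> float:
    """Return a reward in [0, 1] that gives partial credit per difficulty group.

    results maps a difficulty label to the pass/fail outcomes of its test cases.
    weights maps the same labels to hypothetical scores (harder groups weigh more).
    """
    total = 0.0
    earned = 0.0
    for level, outcomes in results.items():
        if not outcomes:
            continue  # skip empty groups so they do not dilute the reward
        total += weights[level]
        earned += weights[level] * sum(outcomes) / len(outcomes)
    return earned / total if total > 0 else 0.0


def sample_batch(
    hard_pool: List[str],
    easy_pool: List[str],
    batch_size: int,
    easy_prob: float,
    rng: random.Random,
) -> List[str]:
    """Draw a training batch, taking each item from the easy pool with
    probability easy_prob (if that pool is non-empty), else from the hard pool."""
    batch = []
    for _ in range(batch_size):
        use_easy = bool(easy_pool) and rng.random() < easy_prob
        batch.append(rng.choice(easy_pool if use_easy else hard_pool))
    return batch


# A solution that passes all easy tests and half of the medium ones still
# earns a non-zero reward, whereas an all-or-nothing reward would give 0.
example_results = {"easy": [True, True], "medium": [True, False], "hard": [False, False]}
example_weights = {"easy": 1.0, "medium": 2.0, "hard": 3.0}
example_reward = difficulty_weighted_reward(example_results, example_weights)

# Mostly hard problems, with an occasional easy one mixed back in.
example_batch = sample_batch(["h1", "h2", "h3"], ["e1", "e2"], 8, 0.1, random.Random(0))
```

The design point is that graded credit gives the policy a learning signal on problems it cannot yet solve fully, while occasionally replaying easy problems keeps batches from consisting only of failures.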


Xiaomi MiMo Project Resources


Application Scenarios of Xiaomi MiMo

  • Education: Assists with math problem-solving and programming learning, offering solution steps and code examples.

  • Scientific Research: Supports logical reasoning and algorithm development, helping verify hypotheses and design experiments.

  • Software Development: Generates and optimizes code, assists with debugging and problem-solving.

  • Intelligent Customer Service: Answers complex queries and improves the efficiency of Q&A systems.

  • Gaming and Entertainment: Suggests strategies and helps solve puzzles, making games more enjoyable.
