Kimina-Prover – A mathematical theorem-proving model jointly launched by Moonshot AI and Numina

What is Kimina-Prover?

Kimina-Prover is a large mathematical theorem-proving model jointly developed by Moonshot AI and Numina. Trained with large-scale reinforcement learning, it reasons in a human-like way to produce rigorous, machine-checkable proofs in Lean 4. Using what its authors call a formal reasoning pattern, the model interleaves informal reasoning with Lean 4 code snippets during the proof process, imitating how a person works through a problem. Kimina-Prover achieved an 80.7% success rate (pass@8192) on the miniF2F benchmark, surpassing the previous state-of-the-art by 10.6% and setting a new record. Its performance improves with both model size and compute budget, and it reaches strong results with comparatively few samples. Distilled 1.5B and 7B parameter versions of the model have been open-sourced.

Main features of Kimina-Prover

Based on Reinforcement Learning: Kimina-Prover is described by its authors as the first large-scale formal reasoning model trained through extensive reinforcement learning. It reasons in a human-like manner and rigorously proves mathematical theorems in the Lean 4 language.
Efficient Reasoning Mode: The model follows a structured approach called the “formal reasoning pattern,” which intersperses informal reasoning with the relevant Lean 4 code snippets as the proof develops. This lets the model imitate human problem-solving strategies more closely.
High Sample Efficiency: Kimina-Prover achieves good results with few sampling attempts, and its success rate keeps improving as more samples (that is, more compute) are allowed.
Model Size and Performance Correlation: Unlike previous neural theorem provers, Kimina-Prover shows clear performance gains as the model size increases.
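
Sample efficiency is usually reported as pass@k: the probability that at least one of k sampled proofs is accepted. The sketch below uses the unbiased pass@k estimator that is common in code-generation evaluation; whether Kimina-Prover's reported numbers were computed this way is an assumption, and the function name pass_at_k is chosen here for illustration only.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n sampled proofs of which c were verified.

    Returns 1 - C(n - c, k) / C(n, k): the probability that a random
    subset of k samples (out of the n drawn) contains at least one
    verified proof.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        # Fewer than k failures exist, so every subset of size k succeeds.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# Illustrative numbers only: 32 samples for one problem, 2 of them verified.
estimate_at_1 = pass_at_k(n=32, c=2, k=1)
estimate_at_8 = pass_at_k(n=32, c=2, k=8)
```

With these made-up counts, the estimate at k = 8 is higher than at k = 1, which is the sense in which results improve as the sampling budget grows.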

Technical principles of Kimina-Prover

Automatic Formalization: To construct a diverse problem set, the researchers trained a model to automatically translate natural-language problem statements into Lean 4 code, with each statement ending in a placeholder proof (sorry in Lean 4).
Reinforcement Learning Training: After the supervised fine-tuning (SFT) phase, reinforcement learning further improves the model's ability to produce formal proofs. In each iteration, a batch of problems is sampled from the problem set, the model generates multiple candidate solutions for each problem, and the Lean compiler checks each candidate for correctness.
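
The two steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' training code: the example statement, the helper names fill_proof and score_candidates, the binary 0/1 reward, and the stub verifier are all hypothetical. In a real pipeline the verifier would call the Lean compiler instead of the stub.

```python
from typing import Callable, List

# Hypothetical output of the auto-formalization step: a Lean 4 theorem
# statement whose proof is the placeholder `sorry`.
STATEMENT = (
    "theorem add_comm_example (a b : Nat) : a + b = b + a := by\n"
    "  sorry"
)


def fill_proof(statement: str, proof_body: str) -> str:
    """Replace the trailing `sorry` placeholder with a candidate proof."""
    head, sep, _ = statement.rpartition("sorry")
    if not sep:
        raise ValueError("statement has no `sorry` placeholder")
    return head + proof_body


def score_candidates(
    statement: str,
    candidates: List[str],
    verify: Callable[[str], bool],
) -> List[float]:
    """Assign reward 1.0 to candidates the verifier accepts, else 0.0.

    The binary reward is an assumption made for this sketch.
    """
    return [
        1.0 if verify(fill_proof(statement, body)) else 0.0
        for body in candidates
    ]


def stub_verify(source: str) -> bool:
    """Stand-in for the Lean compiler: accepts one known-good proof only."""
    return source.endswith("exact Nat.add_comm a b")


# Pretend these are proofs sampled from the model for one problem.
candidates = ["exact Nat.mul_comm a b", "exact Nat.add_comm a b", "sorry"]
rewards = score_candidates(STATEMENT, candidates, stub_verify)
```

Passing the verifier in as a function keeps the scoring logic independent of how Lean is actually invoked, so the stub can be swapped for a real compiler call without touching the rest.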

Performance of Kimina-Prover

Benchmark Test Results: On the miniF2F benchmark, Kimina-Prover achieved a score of 80.7% (pass@8192), surpassing the previous state-of-the-art (SOTA) model by 10.6% and setting a new record.
Comparison with General-Purpose Large Models: On miniF2F and its subsets (such as IMO and AIME), Kimina-Prover significantly outperformed general-purpose reasoning models like OpenAI’s o3 and Gemini 2.5 Pro.

Project links for Kimina-Prover

GitHub Repository: https://github.com/MoonshotAI/Kimina-Prover-Preview/tree/master
Hugging Face Model Hub: https://huggingface.co/collections/AI-MO/kimina-prover-preview
arXiv Technical Paper: https://arxiv.org/pdf/2504.11354

Application scenarios of Kimina-Prover

Scientific Research Assistance: In mathematical research, Kimina-Prover can help mathematicians and researchers check theorems that have been stated in Lean 4 and can supply rigorous, machine-checkable proofs for them.
Software Verification: During software development, Kimina-Prover can be used to verify the logical correctness of software. Once the algorithms and logic of a program are expressed as mathematical theorems, the model can attempt to prove those theorems, which supports the reliability and stability of the software.
Algorithm Verification: In artificial intelligence and machine learning, Kimina-Prover can be used to check the theoretical correctness of algorithms.
Risk Assessment: In finance, Kimina-Prover can be used to verify the mathematical foundations of risk assessment models, which helps establish their accuracy and reliability.
Engineering Design Verification: In engineering design, Kimina-Prover can be used to verify the mathematical models and formulas behind a design. In fields such as structural and mechanical design, this can support the analysis of a design's stability and safety.
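
To make the verification scenarios above concrete, here is a small Lean 4 sketch of what "expressing program logic as a theorem" can look like. The function double and the theorem name are invented for illustration, and the proof is hand-written rather than produced by Kimina-Prover. It assumes a recent Lean 4 toolchain in which the omega tactic is available.

```lean
-- A tiny "program": doubling a natural number by adding it to itself.
def double (n : Nat) : Nat := n + n

-- The specification, stated as a theorem: `double n` equals `2 * n`.
theorem double_eq_two_mul (n : Nat) : double n = 2 * n := by
  unfold double  -- the goal becomes `n + n = 2 * n`
  omega          -- closed by linear arithmetic over the naturals
```

In a workflow built around a prover model, the theorem statement would be written with a sorry placeholder, and the model's task would be to replace that placeholder with a proof that Lean accepts.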