OpenReasoning-Nemotron: NVIDIA’s High-Performance Reasoning Model Family Surpassing o3 to Become the New Open-Source Benchmark

AI Tools updated 6d ago dongdong
14 0

What is OpenReasoning-Nemotron?

OpenReasoning-Nemotron is a series of high-performance large language models (LLMs) launched by NVIDIA, based on the Qwen 2.5 architecture. Originating from the DeepSeek R1 0528 671B model, these models are distilled from large-scale data and come in four parameter sizes: 1.5B, 7B, 14B, and 32B. They focus on reasoning tasks in mathematics, science, and programming. Available on the Hugging Face platform, these models aim to provide researchers with a stronger reasoning baseline and to advance technologies such as reinforcement learning for reasoning (RL4R).

OpenReasoning-Nemotron: NVIDIA’s High-Performance Reasoning Model Family Surpassing o3 to Become the New Open-Source Benchmark


Key Features and Advantages

  • High-Quality Distilled Dataset: Utilizing the DeepSeek R1 0528 model, NVIDIA generated 5 million high-quality reasoning samples covering mathematics, programming, and science domains. This dataset will be released in the coming months to further boost model reasoning capabilities.

  • Multiple Model Sizes: To accommodate different computing resources and research needs, OpenReasoning-Nemotron offers models with 1.5B, 7B, 14B, and 32B parameters, suitable for a range of applications from lightweight to high-performance.

  • Outstanding Reasoning Performance: The models demonstrate excellent results on various reasoning benchmarks. Particularly, models with 7B parameters and above show significant score improvements, achieving a top score of 78.2, surpassing existing models such as o3 to become the new leader in open-source models.

  • Robust Research Foundation: Training and evaluation utilize NVIDIA’s NeMo-Skills toolkit, ensuring high quality and reproducibility. Related code and data generation tools are also open-sourced to facilitate secondary development and customized training by researchers.


Technical Principles and Architecture

OpenReasoning-Nemotron models are based on the Qwen 2.5 architecture and leverage large-scale data distillation techniques to extract knowledge from the DeepSeek R1 0528 671B model, producing strong reasoning capabilities. The training process employs NVIDIA’s NeMo-Skills toolchain, which covers data generation, preprocessing, model conversion, training, and evaluation to guarantee model quality and reproducibility. Model evaluation uses the pass@1 metric to comprehensively measure performance across reasoning tasks.


Project Links


Application Scenarios

  • Mathematical Reasoning: The models excel at solving complex math problems, achieving outstanding results on benchmarks such as AIME24 and AIME25.

  • Scientific Reasoning: They understand and reason about scientific concepts, assisting researchers in scientific exploration.

  • Programming Assistance: The models comprehend code logic, offering programming suggestions and code generation to enhance development efficiency.

  • Reinforcement Learning Reasoning Research: Their high performance provides a strong baseline for advanced research in reinforcement learning for reasoning (RL4R), fostering progress in this field.

© Copyright Notice

Related Posts

No comments yet...

none
No comments yet...