DeepSeek, in collaboration with Tsinghua University, has released the DeepSeek-GRM model, which significantly improves scalability during inference.

AI Daily News updated 7m ago dongdong

DeepSeek, in collaboration with Tsinghua University, has released the DeepSeek-GRM model. The model uses pointwise generative reward modeling (GRM) and is trained with a method called “Self-Principled Critique Tuning” (SPCT), which lets the quality of its reward judgments improve as more compute is spent at inference time. In experiments, DeepSeek-GRM-27B, when scaled to 32 samples at inference, achieves performance comparable to that of a 671B-parameter model, highlighting the advantage of its inference-time scalability.
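To make "scaled to 32 samples during inference" concrete, here is a minimal sketch of one common way to scale a pointwise reward model at inference: draw several independent judgments for the same set of candidate responses and sum the per-response scores. This is an illustration under stated assumptions, not DeepSeek's implementation. The function names, the 1 to 10 score range, and the noisy toy sampler standing in for a real reward model are all hypothetical.

```python
import random
from typing import Callable, List, Sequence


def scaled_reward_vote(
    sample_scores: Callable[[], Sequence[int]],
    num_responses: int,
    k: int = 32,
) -> List[int]:
    """Sum pointwise scores over k independently sampled judgments.

    sample_scores is a hypothetical stand-in for one sampled pass of a
    generative reward model; it returns one score per candidate response.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    totals = [0] * num_responses
    for _ in range(k):
        scores = sample_scores()
        if len(scores) != num_responses:
            raise ValueError("expected one score per response")
        for i, score in enumerate(scores):
            totals[i] += score
    return totals


def best_response(totals: Sequence[int]) -> int:
    """Return the index of the response with the highest summed score."""
    return max(range(len(totals)), key=lambda i: totals[i])


def make_noisy_sampler(
    true_scores: Sequence[float], noise: float, seed: int
) -> Callable[[], List[int]]:
    """Toy sampler (assumption): true quality plus Gaussian noise, clamped to 1..10."""
    rng = random.Random(seed)

    def sample() -> List[int]:
        return [
            min(10, max(1, round(s + rng.gauss(0.0, noise))))
            for s in true_scores
        ]

    return sample


if __name__ == "__main__":
    sampler = make_noisy_sampler([6.0, 6.5, 5.0], noise=2.0, seed=0)
    for k in (1, 8, 32):
        totals = scaled_reward_vote(sampler, num_responses=3, k=k)
        print(k, totals, best_response(totals))
```

The intuition is that a single sampled judgment can be noisy, while summing many samples averages the noise out, so the ranking of candidate responses becomes more reliable as k grows.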
