Grok 4 – the latest reasoning model released by xAI

AI Tools updated 2d ago dongdong

What is Grok 4?

Grok 4 is the latest AI reasoning model developed by xAI, which describes its reasoning capabilities as a tenfold improvement over its predecessor. It achieves near-perfect scores on difficult exams such as the SAT and GRE and outperforms many leading models across a range of benchmarks. It supports multimodal input, understands subjective concepts, generates code and visual content, and brings significant improvements to voice interaction. Grok 4 comes in two versions: the single-agent Grok 4, and Grok 4 Heavy, a multi-agent version that runs up to four agents in parallel. The context window is up to 256k tokens.



Grok 4 – Key Features

  • Exceptional Reasoning Abilities: Achieves near-perfect scores on exams such as the SAT and GRE, according to xAI.

  • Multimodal Understanding: Capable of interpreting subjective concepts, conducting image analysis, and performing complex visual searches.

  • Information Aggregation & Summarization: Gathers information from social media and other sources, extracts key events, and presents them chronologically.

  • Code & Visual Generation: Can generate complex animations and scientific simulations (e.g., black hole collisions) based on textual prompts.

  • Enhanced Voice Interaction: Supports five new voice profiles with smoother dialogue and more natural emotional expression.

  • Complex Task Handling: Excels in simulation and strategy-based tasks, showcasing strong planning and execution capabilities.

  • Multi-Agent Collaboration: Grok 4 Heavy, available through the SuperGrok Heavy plan, runs multiple agents in parallel to solve complex problems.


Grok 4 – Test Performance

Official Benchmarks:

  • Humanity’s Last Exam: Features 2,500 interdisciplinary expert-level questions. Grok 4 Heavy scores 44.4% with tools, and 50.7% on the text-only subset.

  • AIME25 (Math Competition): Grok 4 Heavy scores a perfect 100%, outperforming all competitors.

  • GPQA (Graduate-Level QA): Scores 88.9%, ahead of Gemini 2.5 Pro (86.4%) and Claude Opus 4 (79.6%).

  • HMMT25 (High School Math Competition): Scores 96.7%, far surpassing Gemini 2.5 Pro (82.5%).

  • USAMO25 (USA Math Olympiad): Scores 61.9%, well ahead of Gemini Deep Think (49.4%) and Gemini 2.5 Pro (34.5%).

  • ARC-AGI-2 (Abstract Reasoning): Scores 15.9%, nearly double the previous best score from a commercial model.

  • Vending-Bench (Simulated Business Task): Grok 4 generates a net profit of $4,694, far exceeding Claude Opus 4 ($2,077) and human players ($844).


Third-Party Evaluation (Artificial Analysis):

  • Artificial Analysis Intelligence Index: Grok 4 scores 73, ahead of OpenAI o3 (70), Gemini 2.5 Pro (70), DeepSeek R1 0528 (68), and Claude 4 Opus (64).

  • Coding & Math Indices: Grok 4 ranks first in both categories.

  • GPQA Diamond Score: Achieves a record-high 88%, surpassing Gemini 2.5 Pro (84%).

  • Humanity’s Last Exam Score: Reaches 24%, topping Gemini 2.5 Pro (21%).

  • Speed: Generates output at 75 tokens/second, slower than o3 (188 t/s) and Gemini 2.5 Pro (142 t/s) but faster than Claude 4 Opus Thinking (66 t/s).
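
To put the throughput figures above in perspective, the following minimal sketch estimates how long each model would take to stream a response of a given length. The 2,000-token response size and the function name are illustrative assumptions; the tokens-per-second values are the ones quoted above, and the estimate ignores time-to-first-token and any hidden reasoning time.

```python
# Output speeds (tokens/second) as quoted in the figures above.
OUTPUT_SPEED_TPS = {
    "o3": 188,
    "Gemini 2.5 Pro": 142,
    "Grok 4": 75,
    "Claude 4 Opus Thinking": 66,
}


def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time to stream `output_tokens` at a steady rate (ignores startup latency)."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    response_tokens = 2_000  # hypothetical response length, not from the article
    # List the fastest model first.
    for model, tps in sorted(OUTPUT_SPEED_TPS.items(), key=lambda kv: -kv[1]):
        seconds = generation_seconds(response_tokens, tps)
        print(f"{model:<24} {seconds:6.1f} s")
```

Under these assumptions, a response of that length takes Grok 4 roughly two and a half times as long as o3 to stream.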



Grok 4 – Pricing

Subscription Plans:

  • SuperGrok: $30/month or $300/year

  • SuperGrok Heavy: $300/month or $3,000/year
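
The yearly prices above can be compared with twelve monthly payments using a short calculation. This is only an arithmetic sketch: the helper name `annual_savings` is hypothetical, and the prices are the ones listed above.

```python
def annual_savings(monthly_price: float, yearly_price: float) -> tuple[float, float]:
    """Return (dollars saved per year, fraction saved) when paying yearly."""
    twelve_months = monthly_price * 12
    saved = twelve_months - yearly_price
    return saved, saved / twelve_months


# (monthly price, yearly price) in USD, as listed above.
PLANS = {
    "SuperGrok": (30, 300),
    "SuperGrok Heavy": (300, 3000),
}

if __name__ == "__main__":
    for name, (monthly, yearly) in PLANS.items():
        saved, fraction = annual_savings(monthly, yearly)
        print(f"{name}: save ${saved:,.0f} per year ({fraction:.1%})")
```

For both plans the yearly price equals ten monthly payments, so annual billing works out to about a 16.7% discount.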

API Pricing:

  • Input: $3 per million tokens

  • Output: $15 per million tokens
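
As a rough illustration of how these per-token rates translate into the cost of a request, the sketch below multiplies token counts by the listed prices. The function name and the example token counts are assumptions made for illustration, and the calculation covers only the two rates listed above, not any other billing details.

```python
# API rates in USD per million tokens, as listed above.
INPUT_USD_PER_MTOK = 3.0
OUTPUT_USD_PER_MTOK = 15.0


def api_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its input and output token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 10,000 prompt tokens and 2,000 generated tokens.
    cost = api_cost_usd(input_tokens=10_000, output_tokens=2_000)
    print(f"Estimated cost: ${cost:.4f}")  # prints "Estimated cost: $0.0600"
```

Because output tokens cost five times as much as input tokens, long generated responses tend to dominate the bill even when prompts are large.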



Grok 4 – Official Website


Grok 4 – Application Scenarios

  • Educational Tutoring: Offers personalized learning plans, answers complex academic questions, and enhances students’ understanding.

  • Scientific Research: Analyzes large datasets, predicts scientific trends, and aids in discovering new theories and technologies.

  • Business & Finance: Performs market analysis and forecasting to support strategic decisions and optimize operations.

  • Content Creation: Assists with idea generation and script writing in advertising, film, and gaming, enhancing creative productivity.

  • Intelligent Assistant: Functions as a multimodal voice assistant to help users manage daily tasks and improve life convenience.
