WebShaper – an AI training data synthesis system developed by Alibaba Tongyi

What is WebShaper?

WebShaper is an AI training data synthesis system developed by Alibaba’s Tongyi Lab. It combines formalized task modeling with an agent-based expansion mechanism to generate high-quality, scalable datasets for training information-seeking AI agents. Its central concept is Knowledge Projection (KP), a set-theoretic approach that uses operations such as intersection, union, and recursive composition to construct complex problem structures, giving fine-grained control over reasoning paths and task difficulty. A built-in Expander Agent grows simple “seed questions” into elaborate reasoning tasks, in effect letting the AI “generate its own training problems.” Models trained on this data with supervised fine-tuning (SFT) and GRPO reinforcement learning perform significantly better on complex information retrieval tasks.

Key Features of WebShaper

Formalized Modeling
According to its authors, WebShaper is the first system to propose a formal, set-theoretic modeling method for information-seeking (IS) tasks. Each Knowledge Projection (KP) is a set of entities defined by a specific relationship, and a complex task is decomposed into a series of operations over KPs, such as union, intersection, and recursive composition. These operations make it possible to construct reasoning structures precisely and to control task difficulty.

Agent-Based Expansion Mechanism
A key innovation of WebShaper is its ability to let AI “design its own tasks.” Through the Expander Agent, the system starts from a simple seed question and gradually expands it into a more complex reasoning task. The agent uses tools like search, summarization, and verification to build logical, multi-step problems while ensuring the correctness of answers and clarity of reasoning chains.

High-Quality Data Generation
Thanks to formal modeling and agent-based task generation, WebShaper produces training data that is controllable, interpretable, and scalable rather than randomly assembled. Because questions are not limited to information retrieved in advance, the approach supports broader task diversity, exercises a wider range of agent skills, and covers more knowledge, while reducing errors and noise in the synthetic data.

Training Strategy for AI Agents
WebShaper uses a hybrid training approach that combines Supervised Fine-Tuning (SFT) with GRPO reinforcement learning. This allows agents to progressively learn to reason and retrieve information in ambiguous or multi-hop scenarios. Training starts from high-quality trajectories and continues with reward-guided optimization, which discourages shortcut learning and guesswork.

Technical Principles of WebShaper

Formalism-Driven Framework
WebShaper applies set theory to formalize information-seeking tasks. The core concept is Knowledge Projection (KP)—a set of entities defined by specific relationships.

Knowledge Projection Operations:

R-Union: Handles uncertain conditions (e.g., “players who participated between 2000 and 2010”).
Intersection: Models multi-constraint queries (e.g., “players who played in 2000 and were born in the 1990s”).
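
The two operations above can be illustrated with ordinary Python sets. This is a minimal sketch, not WebShaper's implementation: the toy fact table, the entity names, and the helper names kp, r_union, and intersection are all assumptions made for illustration.

```python
from typing import Iterable, Set, Tuple

# Toy knowledge base of (subject, relation, object) facts.
# Every name and value here is invented for illustration.
Fact = Tuple[str, str, str]
FACTS: Set[Fact] = {
    ("alice", "played_in", "2000"),
    ("alice", "born_in", "1992"),
    ("bob", "played_in", "2000"),
    ("bob", "born_in", "1985"),
    ("carol", "played_in", "2005"),
    ("carol", "born_in", "1994"),
}


def kp(relation: str, values: Iterable[str]) -> Set[str]:
    """Knowledge Projection: entities linked by `relation` to any of `values`."""
    targets = set(values)
    return {s for (s, r, o) in FACTS if r == relation and o in targets}


def r_union(relation: str, *value_sets: Iterable[str]) -> Set[str]:
    """R-Union: union of the projections over several candidate value sets."""
    result: Set[str] = set()
    for values in value_sets:
        result |= kp(relation, values)
    return result


def intersection(*projections: Set[str]) -> Set[str]:
    """Intersection: entities that satisfy every constraint at once."""
    if not projections:
        return set()
    return set.intersection(*projections)


# "Players who participated between 2000 and 2010" (uncertain year).
played_2000_to_2010 = r_union("played_in", *[[str(y)] for y in range(2000, 2011)])

# "Players who played in 2000 and were born in the 1990s" (two constraints).
born_1990s = kp("born_in", [str(y) for y in range(1990, 2000)])
answer = intersection(kp("played_in", ["2000"]), born_1990s)

print(sorted(played_2000_to_2010))  # ['alice', 'bob', 'carol']
print(sorted(answer))               # ['alice']
```

Widening the year range in the R-Union makes a clue vaguer, while adding another projection to the intersection narrows the answer set, which is one way to read the claim that KP operations control task difficulty.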

Task Expansion Mechanism
Starting from a “seed task”, the Expander Agent uses retrieval and verification tools to iteratively expand the problem’s complexity within the formal framework—ensuring logical coherence and challenging reasoning chains.
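
The expansion loop described above can be sketched as a propose-then-verify cycle. This is a simplified illustration under stated assumptions, not the actual Expander Agent: the Task class, the DESCRIPTIONS lookup table (a stand-in for a search tool), and the propose, verify, and expand functions are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Task:
    question: str
    answer: str
    depth: int = 0  # number of expansion steps applied so far


# Hypothetical stand-in for a search tool: maps an entity to an indirect
# description that a real agent would retrieve and summarize.
DESCRIPTIONS = {
    "Paris": "the capital of France",
    "France": "the country that hosted the 1998 FIFA World Cup",
}


def propose(task: Task) -> Optional[Task]:
    """Replace one explicit entity with an indirect description of it."""
    for entity, description in DESCRIPTIONS.items():
        if entity in task.question:
            harder = task.question.replace(entity, description, 1)
            return Task(harder, task.answer, task.depth + 1)
    return None  # nothing left to expand


def verify(candidate: Task, seed: Task) -> bool:
    """Accept only if the answer is unchanged and not leaked in the question."""
    return candidate.answer == seed.answer and seed.answer not in candidate.question


def expand(
    seed: Task,
    propose_fn: Callable[[Task], Optional[Task]],
    verify_fn: Callable[[Task, Task], bool],
    max_steps: int = 5,
) -> Task:
    """Iteratively grow the seed task, keeping only verified expansions."""
    task = seed
    for _ in range(max_steps):
        candidate = propose_fn(task)
        if candidate is None or not verify_fn(candidate, seed):
            break
        task = candidate
    return task


seed = Task("Which river flows through Paris?", "Seine")
final = expand(seed, propose, verify)
print(final.depth)     # 2
print(final.question)
```

Each accepted step adds one reasoning hop while the verified answer stays fixed, so the depth counter gives a rough handle on difficulty.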

Data Synthesis and Training
The expanded problems are transformed into training data and used for SFT and reinforcement learning (e.g., via the GRPO algorithm), significantly boosting the model’s reasoning abilities in complex IS tasks.
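
For readers unfamiliar with GRPO, its core idea is to score each sampled answer relative to the other answers sampled for the same question, rather than against a learned value function. The sketch below shows only that group-relative advantage step. The binary correctness reward and the function name are assumptions, since the text above does not specify WebShaper's reward design, and implementations differ on whether they use the population or sample standard deviation.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize each reward against the mean and std of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an implementation choice
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one question; 1.0 = correct, 0.0 = incorrect
# (an assumed reward scheme for illustration).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print([round(a, 3) for a in advantages])  # [1.0, -1.0, -1.0, 1.0]
```

When every answer in a group receives the same reward, all advantages are zero and that question contributes no learning signal, which is one reason questions of well-calibrated difficulty matter for this kind of training.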

Project Resources

GitHub Repository: https://github.com/Alibaba-NLP/WebAgent
HuggingFace Dataset: https://huggingface.co/datasets/Alibaba-NLP/WebShaper
arXiv Technical Paper: https://arxiv.org/pdf/2507.15061

Application Scenarios

Literature Review & Analysis
Helps researchers quickly gather and organize relevant papers, enabling interdisciplinary knowledge discovery.

Market Research
Supports automated industry data collection, trend analysis, and competitive intelligence for market analysts and investors.

Intelligent Learning Assistant
Acts as a smart assistant for students, supporting in-depth learning and research-based education.

Everyday Decision-Making
Useful for trip planning, health inquiries, and other life decisions, offering personalized and context-aware information retrieval.

Medical Information Retrieval
Assists users in searching for healthcare information, providing expert-level insights and health recommendations.