Don’t Build an Agent That Does Everything

Barry Zhang, an engineer at Anthropic, gave a talk at the AI Engineer Workshop titled “How to Build Effective Agents.” One of its most memorable points was: “Don’t build agents for everything.” In other words, don’t try to create a single agent that can do everything; that’s what large models are for! He offered three key principles for building effective agents:

1. Make wise choices about application scenarios; not all tasks require an Agent.
2. After identifying suitable use cases, keep the system as simple as possible for as long as possible.
3. While iterating, try to think from the Agent’s perspective, understand its limitations, and give it the help it needs.

Barry works mainly on agentic systems. His talk is based on a blog post he co-authored with Erik Schluntz. Below is a detailed summary of their core views, along with reflections on the evolution and future of Agent systems.

The Evolution of Agent Systems

Simple Functions: Initially, there were simple tasks such as summarization, classification, and extraction, which seemed magical a few years ago but have now become foundational.
Workflows: As models and products matured, the orchestration of multiple model calls began, forming predefined control flows. This approach sacrifices cost and latency in exchange for better performance and is considered a precursor to Agent systems.
Agent: At the current stage, with stronger model capabilities, domain-specific Agents are starting to emerge. Unlike workflows, Agents can autonomously decide their action paths based on environmental feedback, operating almost independently.
Future (Speculation): The future may involve more generalized single Agents or multi-Agent collaboration. The trend is to grant systems more autonomy, making them more powerful and useful. However, this also comes with higher costs, latency, and potentially greater consequences of errors.
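To make the workflow stage concrete, here is a minimal Python sketch of a predefined control flow: the code, not the model, decides which call comes next. The `call_model` function and the ticket-handling steps are hypothetical stand-ins chosen for illustration; they are not from the talk.

```python
from typing import Callable, Dict

# Hypothetical stand-in for a real model call. It only tags the prompt so the
# example runs without any API access.
def call_model(prompt: str) -> str:
    return f"<model output for: {prompt}>"

def support_workflow(ticket: str, model: Callable[[str], str] = call_model) -> Dict[str, str]:
    """A fixed sequence of model calls: classify, then extract, then summarize."""
    category = model(f"Classify this ticket: {ticket}")
    details = model(f"Extract the key details: {ticket}")
    summary = model(f"Summarize for a {category} handler: {details}")
    return {"category": category, "details": details, "summary": summary}
```

Every run makes exactly three calls in the same order, which is what keeps cost and latency predictable compared with an Agent that chooses its own path.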

Core Viewpoint 1
Not all scenarios are suitable for building Agents (Don’t build agents for everything).

Agents are a way to scale complex and valuable tasks. However, they are costly and have high latency, so they should not be treated as a drop-in upgrade for every use case. For tasks whose decision tree can be mapped out clearly, explicitly constructing a workflow is more cost-effective and controllable.

• Checklist for When to Build an Agent:

Task Complexity: Agents excel at handling ambiguous problem spaces. If the decision path is clear, a workflow should be prioritized instead.
Task Value: Agent exploration consumes a significant number of tokens, so the value of the task must justify the cost. For budget-constrained (e.g., $0.10 per task) or high-volume (e.g., customer service) scenarios, workflows may be more suitable.
Feasibility of Key Capabilities: Ensure that the agent does not face critical bottlenecks in key areas (e.g., writing code, debugging, and error recovery for coding agents). Otherwise, costs and latency will increase significantly. If bottlenecks exist, consider narrowing the scope of the task.
Error Cost and Detection Difficulty: If the cost of errors is high and errors are difficult to detect, it becomes challenging to trust the agent to act autonomously. Mitigation strategies, such as limiting the scope (e.g., read-only permissions or increased human intervention), can be applied, but these may also limit scalability.
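The checklist above can be read as a small decision procedure. The following Python sketch is one possible encoding, not a rule given in the talk: the field names, the order of the checks, and the returned labels are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class TaskProfile:
    # Each field mirrors one checklist item; the names are hypothetical.
    ambiguous_decision_path: bool           # Task Complexity
    value_justifies_token_cost: bool        # Task Value
    key_capabilities_feasible: bool         # Feasibility of Key Capabilities
    errors_costly_and_hard_to_detect: bool  # Error Cost and Detection Difficulty

def recommend(task: TaskProfile) -> str:
    """Return a rough recommendation; the check order is an assumption."""
    if not task.ambiguous_decision_path:
        return "workflow"                  # clear decision path: map it out explicitly
    if not task.value_justifies_token_cost:
        return "workflow"                  # exploration would cost more than it returns
    if not task.key_capabilities_feasible:
        return "narrow the scope"          # a bottleneck would inflate cost and latency
    if task.errors_costly_and_hard_to_detect:
        return "agent with limited scope"  # e.g. read-only access or a human in the loop
    return "agent"

coding = TaskProfile(True, True, True, False)
high_volume_support = TaskProfile(False, False, True, False)
```

Under these assumptions, `recommend(coding)` yields `"agent"` and `recommend(high_volume_support)` yields `"workflow"`.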

Coding is a great use case for Agents because it involves complex tasks (ranging from design documents to PRs), delivers high value, and existing models (such as Claude) perform well in many stages. Moreover, the results are easy to verify, such as through unit testing and CI.

Core Viewpoint 2
Keep it Simple

The core structure of an Agent: a model using tools in a loop, operating within an environment.

• Three key components:
1. Environment: The system in which the Agent operates;
2. Tool set: The interface through which the Agent takes actions and receives feedback;
3. System prompt: Defines the Agent’s goals, constraints, and desired behaviors.
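The structure described here (a model using tools in a loop, inside an environment) fits in a few lines of Python. This is a minimal sketch under stated assumptions: `run_agent`, the `(tool name, argument)` action format, the `"done"` convention, and the scripted stand-in model are all hypothetical and exist only so the example runs without a real model.

```python
from typing import Callable, Dict, List, Tuple

# Assumed action format: (tool name, argument). ("done", answer) ends the loop.
Action = Tuple[str, str]

def run_agent(
    model: Callable[[List[str]], Action],
    tools: Dict[str, Callable[[str], str]],
    system_prompt: str,
    task: str,
    max_steps: int = 10,
) -> str:
    context = [system_prompt, task]          # everything the model gets to see
    for _ in range(max_steps):
        name, arg = model(context)           # the model picks the next action
        if name == "done":
            return arg
        if name in tools:
            observation = tools[name](arg)   # act on the environment
        else:
            observation = f"error: unknown tool {name!r}"
        context.append(f"{name}({arg}) -> {observation}")  # feed the result back
    return "stopped: step limit reached"

# Toy environment: an in-memory "file system" exposed through a single tool.
files = {"notes.txt": "ship on Friday"}
tools = {"read_file": lambda path: files.get(path, "error: no such file")}

def scripted_model(context: List[str]) -> Action:
    # Hypothetical stand-in for a real model: read the file once, then answer.
    if len(context) == 2:
        return ("read_file", "notes.txt")
    return ("done", context[-1].split(" -> ", 1)[1])

result = run_agent(scripted_model, tools, "You answer questions about files.", "When do we ship?")
```

The loop itself stays the same across applications; only the environment, the tool set, and the system prompt change.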

• Iteration Method:
Prioritize building and iterating these three basic components to achieve the highest return on investment. Avoid over-complicating things from the start, as this can stifle iteration speed. Optimization (such as caching trajectories, parallelizing tool calls, and improving the user interface to enhance trust) should be carried out after the basic behaviors are established.

• Consistency: Although different Agent applications (coding, search, computer use) look different at the product level in scope and capabilities, they share almost the same simple backend architecture.

Core Viewpoint 3
Think like your agents

Problem: Developers often start from their own perspective and therefore struggle to understand why Agents make mistakes that seem strange.

Solution: Place yourself within the “context window” of the Agent. The Agent makes decisions at each step based on limited contextual information (e.g., 10k-20k tokens).

Perspective-taking exercise: Try to complete the task from the perspective of the Agent and experience its limitations (e.g., it can only view static screenshots and operates “blindly” during reasoning and tool execution). This helps identify what information the Agent truly needs (such as screen resolution, recommended actions, and constraints) to avoid unnecessary exploration.

Leverage the model itself: You can directly ask the model (e.g., Claude): Is the instruction ambiguous? Does it understand the tool description? Why was a certain decision made? How can it be helped to make better decisions? This helps bridge the understanding gap between developers and Agents.
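One way to put these questions to the model is to bundle the system prompt, the tool descriptions, and the recorded trajectory into a single review prompt. The Python sketch below only builds that prompt text; the function name, section labels, and exact question wording are assumptions, and sending the prompt to a model is left out.

```python
from typing import Dict, List

# Paraphrased from the four questions above.
REVIEW_QUESTIONS = [
    "Is any part of the instruction ambiguous?",
    "Do you understand each tool description?",
    "Why did you make the decision at the last step?",
    "What would help you make better decisions?",
]

def build_review_prompt(
    system_prompt: str,
    tool_descriptions: Dict[str, str],
    trajectory: List[str],
) -> str:
    """Assemble everything the Agent saw, followed by the review questions."""
    tool_lines = [f"- {name}: {desc}" for name, desc in tool_descriptions.items()]
    step_lines = [f"{i}. {step}" for i, step in enumerate(trajectory, start=1)]
    question_lines = [f"- {q}" for q in REVIEW_QUESTIONS]
    sections = [
        "SYSTEM PROMPT:\n" + system_prompt,
        "TOOLS:\n" + "\n".join(tool_lines),
        "TRAJECTORY:\n" + "\n".join(step_lines),
        "QUESTIONS:\n" + "\n".join(question_lines),
    ]
    return "\n\n".join(sections)
```

Reading the assembled prompt yourself is also a quick version of the perspective-taking exercise, since it shows exactly what the Agent had to work with.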

Personal thoughts

Budget-aware Agents: There is a need to better control the cost and latency of agents. Define and enforce budgets for time, money, and tokens to enable broader deployment in production environments.
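As a rough illustration of what enforcing such budgets could look like, here is a minimal Python sketch. The `Budget` class, its field names, and the `BudgetExceeded` exception are hypothetical; the talk does not describe an implementation.

```python
import time
from dataclasses import dataclass, field

class BudgetExceeded(Exception):
    """Raised when the agent goes over any of its limits."""

@dataclass
class Budget:
    max_seconds: float
    max_dollars: float
    max_tokens: int
    spent_dollars: float = 0.0
    spent_tokens: int = 0
    started_at: float = field(default_factory=time.monotonic)

    def charge(self, tokens: int, dollars: float) -> None:
        # Record the cost of one model or tool call, then enforce the limits.
        self.spent_tokens += tokens
        self.spent_dollars += dollars
        self.check()

    def check(self) -> None:
        if self.spent_tokens > self.max_tokens:
            raise BudgetExceeded("token budget exceeded")
        if self.spent_dollars > self.max_dollars:
            raise BudgetExceeded("money budget exceeded")
        if time.monotonic() - self.started_at > self.max_seconds:
            raise BudgetExceeded("time budget exceeded")
```

An agent loop would call `budget.charge(...)` after each step and stop, or hand off to a human, when `BudgetExceeded` is raised.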

Self-evolving Tools: Agents may be able to design and improve their own tools (meta-tools), making them more general-purpose and adaptable to the needs of different use cases.

Multi-agent Collaboration: Many more multi-agent systems are expected to reach production by the end of this year. Their advantages include parallelization, separation of concerns, and protection of the main agent’s context window. The key challenge is how agents communicate with each other, specifically how to achieve asynchronous communication and move beyond the current user-assistant turn-taking model.
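To give a feel for asynchronous communication between agents, here is a small Python sketch using `asyncio`. It is only an illustration under stated assumptions: the sub-agents are hypothetical placeholders that return canned strings, and a shared queue stands in for whatever messaging mechanism a real system would use.

```python
import asyncio
from typing import Dict

async def sub_agent(name: str, task: str, outbox: asyncio.Queue) -> None:
    # Hypothetical worker: a real sub-agent would run its own model loop here,
    # inside its own context window.
    await asyncio.sleep(0)
    await outbox.put((name, f"result for {task!r}"))

async def main_agent(tasks: Dict[str, str]) -> Dict[str, str]:
    inbox: asyncio.Queue = asyncio.Queue()
    # Start all sub-agents at once so they run concurrently.
    workers = [asyncio.create_task(sub_agent(n, t, inbox)) for n, t in tasks.items()]
    results: Dict[str, str] = {}
    for _ in tasks:
        # Messages are handled in arrival order, not in strict turn-taking order.
        name, result = await inbox.get()
        results[name] = result
    await asyncio.gather(*workers)
    return results

results = asyncio.run(main_agent({"searcher": "find docs", "coder": "write patch"}))
```

Only the short result messages reach the main agent, which is one way the main agent’s context window stays protected.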