AgentPrune – A Multi-Agent Communication Optimization Framework Launched by Tongji University Jointly with the Chinese University of Hong Kong and Other Institutions

What is AgentPrune?

AgentPrune is a communication optimization framework for multi-agent systems driven by large language models (LLMs), jointly proposed by Tongji University, The Chinese University of Hong Kong, and other institutions. It prunes redundant or harmful communication between agents, which lowers communication cost and improves system performance. AgentPrune models the multi-agent system as a spatiotemporal graph, optimizes the communication connections with a low-rank, sparse graph mask, and produces an efficient communication topology through one-shot pruning. Across multiple benchmarks, it achieves results comparable to state-of-the-art topologies at a cost of only $5.6, compared with $43.7 for those methods. It can be integrated into existing multi-agent frameworks such as AutoGen and GPTSwarm, where it reduces token consumption by 28.1% to 72.8%.

The main functions of AgentPrune

Communication Redundancy Identification and Pruning: AgentPrune is the first to identify and define the problem of communication redundancy in LLM multi-agent systems. Through a one-time pruning technique, it eliminates redundant and harmful communication content.
Spatiotemporal Graph Modeling and Optimization: The multi-agent system is modeled as a spatiotemporal graph, which includes spatial edges (communication within the same round of dialogue) and temporal edges (communication across multiple rounds of dialogue), optimized via a parameterized graph mask.
Application of Low-Rank Sparse Graph Masks: A low-rank, sparse graph mask pushes the communication structure toward sparsity, which filters out redundant, noisy, and malicious messages and makes the system more robust to adversarial attacks.
Cost and Performance Optimization: Across multiple benchmarks, AgentPrune achieves results comparable to state-of-the-art topologies at a significantly lower cost (about $5.6, compared with $43.7 for those methods). When integrated into existing multi-agent frameworks, it reduces token consumption by 28.1% to 72.8%.
Defense Against Adversarial Attacks: AgentPrune successfully defends against two types of agent-based adversarial attacks, improving performance by 3.5% to 10.8%.
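
The spatial and temporal edges described above can be pictured with a small data structure. The following is a minimal Python sketch, not AgentPrune's actual code: the `SpatioTemporalGraph` class, its field names, and the three agent names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SpatioTemporalGraph:
    """Hypothetical container: agents are nodes, edges are (sender, receiver) pairs."""

    agents: list[str]
    # Spatial edges: messages passed within the same dialogue round.
    spatial_edges: set[tuple[str, str]] = field(default_factory=set)
    # Temporal edges: messages carried from the previous round into the current one.
    temporal_edges: set[tuple[str, str]] = field(default_factory=set)

    def inputs_for(self, receiver: str) -> dict[str, list[str]]:
        """List which agents this receiver hears from, split by edge type."""
        return {
            "same_round": sorted(s for s, r in self.spatial_edges if r == receiver),
            "previous_round": sorted(s for s, r in self.temporal_edges if r == receiver),
        }


# Illustrative three-agent setup (agent names are made up).
graph = SpatioTemporalGraph(
    agents=["planner", "coder", "reviewer"],
    spatial_edges={("planner", "coder"), ("coder", "reviewer")},
    temporal_edges={("reviewer", "planner"), ("reviewer", "coder")},
)

print(graph.inputs_for("coder"))
```

In this toy setup the coder reads the planner's message from the current round and the reviewer's message from the previous round. Pruning, in this picture, amounts to deleting pairs from the two edge sets.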

The Technical Principle of AgentPrune

Spatiotemporal Graph Modeling: AgentPrune models the communication structure of a multi-agent system as a spatiotemporal graph, where nodes represent agents, and edges represent communication connections. The edges are divided into spatial edges (communication within the same round of dialogue) and temporal edges (communication across different rounds of dialogue).
Parameterized Graph Mask: AgentPrune optimizes communication connections through a parameterized graph mask. The goal of the graph mask is to reflect the importance of communication connections via distribution approximation and low-rank sparsity. Distribution approximation maximizes the system’s utility through policy gradient methods while minimizing communication redundancy; low-rank sparsity, on the other hand, enforces a sparser communication structure through low-rank constraints, eliminating redundancy, noise, and even malicious messages.
One-time Pruning: Early in training, AgentPrune runs a limited number of optimization steps on the graph mask and then removes unimportant communication connections in a single pruning pass. Specifically, it ranks connections by the magnitude of their mask values and keeps a fixed proportion of the most important ones, yielding a sparse communication graph.
Optimized Communication Graph: In subsequent communication processes, the multi-agent system strictly follows this optimized communication graph for message passing, reducing communication costs while maintaining high efficiency.
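
The pruning step described above can be sketched in a few lines of Python with NumPy. This is an illustration under stated assumptions, not the authors' implementation: the function names `one_shot_prune` and `low_rank_penalty`, the `keep_ratio` parameter, and the example mask values are hypothetical. The nuclear norm is used here as a common convex surrogate for matrix rank; whether it matches the regularizer in the AgentPrune code is an assumption.

```python
import numpy as np


def low_rank_penalty(mask: np.ndarray) -> float:
    """Nuclear norm (sum of singular values), a standard convex proxy for rank.

    Assumption: shown only to illustrate what a low-rank constraint can look like.
    """
    return float(np.linalg.norm(mask, ord="nuc"))


def one_shot_prune(mask: np.ndarray, keep_ratio: float) -> np.ndarray:
    """Keep the top `keep_ratio` fraction of off-diagonal edges by mask magnitude.

    Returns a binary adjacency matrix where entry [i, j] == 1 means
    agent i may still send messages to agent j.
    """
    n = mask.shape[0]
    off_diag = ~np.eye(n, dtype=bool)           # ignore self-loops
    scores = np.abs(mask)[off_diag]
    k = max(1, int(round(keep_ratio * scores.size)))
    threshold = np.sort(scores)[-k]             # k-th largest magnitude
    # Ties at the threshold are all kept, so slightly more than k edges may survive.
    return ((np.abs(mask) >= threshold) & off_diag).astype(int)


# Hypothetical learned mask over three agents (values are made up).
mask = np.array([
    [0.0, 0.9, 0.1],
    [0.2, 0.0, 0.8],
    [0.7, 0.3, 0.0],
])

pruned = one_shot_prune(mask, keep_ratio=0.5)
print(pruned)
```

With `keep_ratio=0.5`, three of the six possible directed edges survive. After this single pass, the binary matrix is fixed and agents exchange messages only along the remaining edges, which is where the token savings come from.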

Project links for AgentPrune

GitHub Repository: https://github.com/yanweiyue/AgentPrune
arXiv Technical Paper: https://arxiv.org/pdf/2410.02506

Application scenarios of AgentPrune

Multi-Agent System Optimization: AgentPrune can be seamlessly integrated into existing multi-agent frameworks, such as AutoGen and GPTSwarm, significantly reducing communication costs while maintaining or enhancing system performance.
Cost-Effective Communication Topology: In multi-agent systems, AgentPrune generates a sparse communication topology through a one-time pruning technique, significantly reducing token consumption.
Complex Task Collaboration: AgentPrune is suitable for complex tasks that require collaboration among multiple agents, such as mathematical reasoning, code generation, and commonsense question answering. By optimizing the communication structure, AgentPrune improves task completion efficiency and reduces economic costs.
Industrial and Enterprise-Level Applications: In industrial automation and enterprise-level applications, AgentPrune can optimize communication between agents, reduce resource waste, and improve overall system efficiency.