GELab-Zero is a GUI agent model open-sourced by StepFun (Jieyue Xingchen)
What is GELab-Zero?
GELab-Zero is an open-source GUI agent model developed by StepFun for automated interaction and task execution on mobile devices. It supports local deployment: the 4B model runs on consumer-grade hardware, which keeps latency low and keeps user data on the device. One-click multi-device deployment handles environment dependencies and device management automatically, and distributed task orchestration and multi-modal agent modes allow complex tasks to be handled flexibly. GELab-Zero reports strong results across multiple open-source benchmarks, most notably a static accuracy of 73.4% on the AndroidDaily benchmark, reported as significantly higher than the other models compared. It addresses the fragmentation of the mobile ecosystem by operating apps through their GUI, so app developers do not need to perform any additional adaptation. Enterprise users can reuse this infrastructure directly to integrate MCP capabilities into their products.

Key Features of GELab-Zero
Local Deployment
Runs entirely on local devices without relying on the cloud, ensuring privacy and low-latency responses.
Lightweight Inference
Optimized to run efficiently on consumer-grade hardware, balancing performance and resource usage.
One-Click Multi-Device Deployment
Includes a unified deployment pipeline that automatically handles environment dependencies and device management.
Distributed Task Orchestration
Supports distributing tasks across multiple devices and records interaction trajectories for observation and reproducibility.
Multi-Modal Agent Modes
Includes various operation modes such as ReAct closed-loop, multi-agent collaboration, and scheduled tasks.
High Performance
Reports leading accuracy on multiple open-source benchmarks, including a static accuracy of 73.4% on AndroidDaily.
Fragmentation-Resistant Compatibility
Offers strong general compatibility across diverse mobile app ecosystems, requiring no additional adaptation from app developers.
Enterprise-Ready Infrastructure
Enterprises can directly build on top of GELab-Zero to integrate GUI Agent capabilities into real business applications.
Open-Source Code and Infrastructure
Provides complete inference infrastructure and pretrained models for rapid deployment and execution.
Technical Principles of GELab-Zero
Local Deployment & Privacy Protection
Processes all data locally to avoid cloud interaction, ensuring privacy and fast response times.
Lightweight Model Design
A streamlined architecture enables efficient performance on consumer-grade hardware with reduced resource consumption.
Plug-and-Play Engineering Infrastructure
Offers a complete inference pipeline that automatically handles device connections, dependency installation, and permission setups.
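As a rough illustration of the device-connection step, the sketch below filters the text printed by `adb devices` down to the devices that are ready to use. It assumes the pipeline talks to Android devices through adb, which the description above does not state; `parse_adb_devices` and the sample output are hypothetical and are not part of GELab-Zero.

```python
from typing import List


def parse_adb_devices(output: str) -> List[str]:
    """Return the serials of devices that `adb devices` lists in the 'device' state."""
    serials = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines, the header line, and adb daemon status lines.
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        # Each device line is "<serial> <state>"; only "device" means ready.
        if len(parts) >= 2 and parts[1] == "device":
            serials.append(parts[0])
    return serials


# Hypothetical sample of `adb devices` output: one ready emulator and
# one phone that has not yet authorized USB debugging.
SAMPLE = "List of devices attached\nemulator-5554\tdevice\nR58M123ABC\tunauthorized\n\n"
ready = parse_adb_devices(SAMPLE)
print(ready)
```

In a real setup the input string would come from running `adb devices` (for example via `subprocess.run`), and devices in the `unauthorized` or `offline` state would need user action before an agent could drive them.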
Multi-Modal Interaction Capabilities
Supports advanced interaction patterns including ReAct loops, multi-agent collaboration, and scheduled workflows.
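The ReAct closed loop mentioned above alternates reasoning, acting, and observing until the task is finished. The following is a minimal sketch of that general pattern, not GELab-Zero's implementation: `react_loop`, `toy_policy`, `toy_execute`, and the action names are all hypothetical, and a real agent would replace the toy policy with a model call and the toy executor with actual device control.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Step:
    thought: str      # the agent's reasoning for this step
    action: str       # the action it chose
    observation: str  # the screen state seen after the action


def react_loop(
    task: str,
    policy: Callable[[str, str, List[Step]], Tuple[str, str]],
    execute: Callable[[str], str],
    max_steps: int = 10,
) -> List[Step]:
    """Run think -> act -> observe until the policy emits DONE or steps run out."""
    history: List[Step] = []
    observation = "initial screen"
    for _ in range(max_steps):
        thought, action = policy(task, observation, history)
        if action == "DONE":
            history.append(Step(thought, action, observation))
            break
        observation = execute(action)  # feedback closes the loop
        history.append(Step(thought, action, observation))
    return history


def toy_policy(task: str, observation: str, history: List[Step]) -> Tuple[str, str]:
    """Hypothetical stand-in for the model: picks an action from the current screen."""
    if observation == "initial screen":
        return "The app is not open yet.", "open_app"
    if observation == "app home":
        return "The search box is visible.", "tap_search"
    return "The task looks complete.", "DONE"


def toy_execute(action: str) -> str:
    """Hypothetical stand-in for the device: maps an action to the next screen."""
    return {"open_app": "app home", "tap_search": "search page"}[action]


trace = react_loop("search for a movie", toy_policy, toy_execute)
for step in trace:
    print(step.action, "->", step.observation)
```

The `max_steps` cap is a design choice worth keeping in any such loop: it bounds the run when the policy never decides the task is done.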
Dynamic Task Orchestration & Replay
Uses distributed task scheduling to assign tasks across devices and record interaction trajectories for review and reproduction.
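One simple way to picture task assignment and trajectory recording is sketched below. It assumes round-robin assignment and a JSON Lines log, neither of which the description above specifies; the function names, device names, and action names are hypothetical.

```python
import json
from itertools import cycle
from typing import Dict, List


def assign_round_robin(tasks: List[str], devices: List[str]) -> Dict[str, List[str]]:
    """Spread tasks over devices in turn (an assumed policy, chosen for simplicity)."""
    if not devices:
        raise ValueError("at least one device is required")
    plan: Dict[str, List[str]] = {device: [] for device in devices}
    for task, device in zip(tasks, cycle(devices)):
        plan[device].append(task)
    return plan


def record_step(trajectory: List[dict], device: str, task: str, action: str) -> None:
    """Append one interaction step to the trajectory log."""
    trajectory.append(
        {"step": len(trajectory), "device": device, "task": task, "action": action}
    )


def dump_trajectory(trajectory: List[dict]) -> str:
    """Serialize the trajectory as JSON Lines so it can be stored and reviewed."""
    return "\n".join(json.dumps(step, sort_keys=True) for step in trajectory)


def load_actions(jsonl: str) -> List[str]:
    """Read the recorded actions back in order, as a replay would need them."""
    return [json.loads(line)["action"] for line in jsonl.splitlines() if line]


plan = assign_round_robin(
    ["order coffee", "check traffic", "book a ride"], ["device-a", "device-b"]
)
trajectory: List[dict] = []
for device, tasks in plan.items():
    for task in tasks:
        record_step(trajectory, device, task, action="launch_app")
        record_step(trajectory, device, task, action="finish")

replayed = load_actions(dump_trajectory(trajectory))
print(plan)
print(replayed)
```

Writing each step as its own JSON line keeps the log append-only, so a run that is interrupted partway still leaves a readable trajectory.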
Reinforcement Learning & Adaptive Reasoning
Uses reinforcement learning to dynamically adjust strategies based on environment feedback, improving efficiency and task success rates.
General GUI Understanding & Operation
Strong GUI comprehension enables the model to recognize and operate various mobile app interfaces without additional developer-side adaptation.
GELab-Zero Project Resources
- Official Website: https://opengelab.github.io/
- GitHub Repository: github.com/stepfun-ai/gelab-zero
- Hugging Face Model Hub: huggingface.co/stepfun-ai/GELab-Zero-4B-preview
Use Cases for GELab-Zero
Mobile Device Task Automation
Automatically performs actions on mobile devices such as app control, information querying, and workflow execution.
Enterprise Application Integration
Enterprises can integrate GUI Agent capabilities to enhance automation and operational efficiency.
Complex Task Handling
Executes multi-step and condition-heavy workflows like online shopping, task sequences, and complex searches.
Personal & Home Assistant
Helps users complete everyday tasks such as finding movie recommendations or checking traffic information.
Education & Learning Assistance
Assists with tasks in educational apps, such as following online courses and submitting homework.
Lifestyle Service Automation
Automates interactions in service apps such as food delivery, ride-hailing, and utility services.