Skywork-SWE-32B – Kunlun Wanwei’s Open-Source Autonomous Code Intelligent Agent Base Model

What is Skywork-SWE-32B?

Skywork-SWE-32B is an open-source, 32-billion-parameter code agent model for software engineering (SWE) tasks, developed by Kunlun Wanwei. It targets repository-level code repair and is designed for scenarios that involve multi-turn interaction and long-context processing. To train it, the team built a dataset of more than 10,000 verifiable GitHub repository task instances, which the project describes as the largest verifiable repository-level code repair dataset to date. On the SWE-bench Verified benchmark, the model reaches a pass@1 accuracy of 38.0%, reported as state of the art among models of the same parameter scale. With test-time scaling, accuracy rises to 47.0%, surpassing existing open-source models below 32B parameters and approaching, or in some cases exceeding, the performance of some closed-source models.

Main Features of Skywork-SWE-32B

Repository-level Code Repair: Locates code issues (such as bugs) within GitHub repositories, generates a fix, and verifies that the fix works, closing the loop from understanding the problem to delivering a solution.
Multi-turn Interaction Capability: Supports more than 50 rounds of interaction, simulating the repeated debug-and-repair cycles of real development to resolve issues step by step.
Long Text Processing: Handles contexts exceeding 32k tokens, which is needed for large code files and multi-file dependencies.
Automated Verification: Checks generated fixes by running unit tests in dedicated runtime environments, so that a patch is validated by execution rather than by inspection alone.
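
The multi-turn repair cycle described above can be sketched as a loop that proposes a patch, runs the tests, and feeds the failure log back into the next attempt. This is a minimal illustration only: `repair_loop`, `toy_propose`, and `toy_run_tests` are hypothetical names, and the stand-in functions replace the model and the repository so that the example runs on its own. The 50-turn default simply mirrors the interaction budget mentioned in the feature list.

```python
from typing import Callable, List, Optional, Tuple


def repair_loop(
    propose_patch: Callable[[str, List[str]], str],
    run_tests: Callable[[str], Tuple[bool, str]],
    issue: str,
    max_turns: int = 50,
) -> Tuple[Optional[str], int]:
    """Iterate propose -> test -> feedback until tests pass or turns run out."""
    feedback: List[str] = []
    for turn in range(1, max_turns + 1):
        patch = propose_patch(issue, feedback)
        passed, log = run_tests(patch)
        if passed:
            return patch, turn
        # Keep the failing log so the next proposal can react to it.
        feedback.append(log)
    return None, max_turns


# Hypothetical stand-ins so the sketch runs without a model or a repository.
def toy_propose(issue: str, feedback: List[str]) -> str:
    # Pretend the agent needs two rounds of failing-test feedback.
    return "fix-v%d" % len(feedback)


def toy_run_tests(patch: str) -> Tuple[bool, str]:
    return patch == "fix-v2", "tests failed for " + patch


patch, turns = repair_loop(toy_propose, toy_run_tests, "IndexError in parser")
print(patch, turns)
```

In a real agent the feedback list would be part of the prompt, which is one reason long-context support and a generous turn budget matter together.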

Technical Principles of Skywork-SWE-32B

Large-scale Dataset Construction

Automated Data Collection and Verification: A three-stage automated pipeline (data collection and pre-filtering, execution-based verification, and agent trajectory generation) produced a dataset of 10,169 real Python task instances drawn from 2,531 different GitHub repositories.
Runtime Environment Support: Each task instance ships with a dedicated Docker runtime image that supports automated unit test verification, so generated fixes can be checked by actually running them.
High-quality Training Trajectories: The multi-turn interaction trajectories recorded while the agent solves tasks are used as high-quality training samples for fine-tuning the model.
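
To make execution-based verification concrete, here is a small sketch of one plausible acceptance rule. The source does not spell out the exact criterion, so this is an assumption modeled on the fail-to-pass and pass-to-pass convention used by SWE-bench-style datasets: an instance is kept only if the reference fix turns at least one failing test into a passing one and breaks nothing that passed before. The names `classify_instance` and `is_verifiable` are hypothetical, and the test results are plain dictionaries rather than output from a real Docker run.

```python
from typing import Dict, List


def classify_instance(
    before: Dict[str, bool], after: Dict[str, bool]
) -> Dict[str, List[str]]:
    """Compare per-test results before and after applying the reference fix."""
    # A test missing from `before` is treated as failing (e.g. added with the fix).
    fail_to_pass = sorted(
        t for t, ok in after.items() if ok and not before.get(t, False)
    )
    # A test missing from `after` is treated as failing (it no longer runs).
    pass_to_fail = sorted(
        t for t, ok in before.items() if ok and not after.get(t, False)
    )
    return {"fail_to_pass": fail_to_pass, "pass_to_fail": pass_to_fail}


def is_verifiable(before: Dict[str, bool], after: Dict[str, bool]) -> bool:
    """Assumed rule: at least one test is fixed and no passing test regresses."""
    result = classify_instance(before, after)
    return bool(result["fail_to_pass"]) and not result["pass_to_fail"]


before = {"test_parse": False, "test_format": True}
fixed = {"test_parse": True, "test_format": True}
regressed = {"test_parse": True, "test_format": False}

print(is_verifiable(before, fixed))      # fix works, nothing breaks
print(is_verifiable(before, regressed))  # fix works but breaks another test
```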

Model Training and Optimization

Based on the OpenHands Framework: The model runs inside the OpenHands code agent framework, which supports multi-turn interaction and long-context processing and simulates realistic code repair workflows.
Data Scaling Law: Systematic experiments showed that model performance kept improving as the amount of training data increased, supporting the applicability of a data scaling law to software engineering tasks.
Test-Time Scaling (TTS) Technology: At inference time, running more independent rollouts per task (e.g., N=8) further improves performance by spending additional compute on each problem.
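
One common way to use several independent rollouts is best-of-N selection: generate N candidate fixes and keep the one a scoring function rates highest. The sketch below illustrates that idea only. The source does not describe how the final candidate is chosen, so the scoring function is an assumption, and `best_of_n`, `toy_rollout`, and `toy_score` are hypothetical names with toy behavior.

```python
from typing import Callable, Tuple


def best_of_n(
    rollout: Callable[[int], str],
    score: Callable[[str], float],
    n: int = 8,
) -> Tuple[str, float]:
    """Run n independent rollouts and return the highest-scoring candidate."""
    candidates = [rollout(seed) for seed in range(n)]
    scored = [(score(c), c) for c in candidates]
    # max() keeps the first candidate when several share the top score.
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


# Hypothetical stand-ins: each seed yields a different candidate patch,
# and the scorer happens to prefer the candidate produced by seed 5.
def toy_rollout(seed: int) -> str:
    return "patch-%d" % seed


def toy_score(patch: str) -> float:
    return {"patch-5": 0.9}.get(patch, 0.1)


print(best_of_n(toy_rollout, toy_score, n=8))
print(best_of_n(toy_rollout, toy_score, n=4))
```

With N=4 the strong candidate is never generated, while N=8 finds it, which is the intuition behind spending more rollouts at inference time.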

Project Links of Skywork-SWE-32B

HuggingFace Model Hub: https://huggingface.co/Skywork/Skywork-SWE-32B
Technical Paper: https://huggingface.co/Skywork/Skywork-SWE-32B/resolve/main/assets/Report.pdf

Application Scenarios of Skywork-SWE-32B

Code Quality Optimization: The model can analyze potential problems in code and provide optimization suggestions to help developers improve code quality and maintainability.
Unit Test Automation: By building dedicated runtime environments and unit test verification mechanisms, Skywork-SWE-32B can automatically execute test cases to verify the effectiveness of generated repair code.
Teaching Assistance: In software engineering and programming courses, Skywork-SWE-32B can serve as a teaching tool to help students understand the process of solving code issues and enhance programming skills.
Research Support: Provides researchers with a powerful experimental platform to explore the application of large language models in software engineering tasks and verify theories such as the data scaling law.
Internal Development Tools: Enterprises can integrate Skywork-SWE-32B into internal development tools to automate handling code issues, reduce manual intervention, and improve development efficiency and code quality.
