Label Studio is a free, open-source data annotation tool from HumanSignal (formerly Heartex). The project has nearly 14,000 stars on GitHub and helps developers prepare training data, fine-tune large language models, and validate AI models.
The main features of Label Studio
- Labels many types of data, including images, audio, text, time series, multi-domain data, and video.
- Flexible and configurable. Configurable layouts and templates can be combined with your own datasets and workflows.
- Machine learning-assisted labeling. Predictions from an integrated ML backend pre-label tasks, which saves annotation time.
- Multiple projects and users. Supports multiple projects, use cases, and data types on one platform.
- Integrates with your ML/AI pipeline. Authentication, project creation, task import, and model prediction management can be handled through webhooks, the Python SDK, and the API.
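As a rough illustration of the API integration in the last feature, the Python sketch below builds a request that lists the projects on a running instance. The /api/projects/ path and the "Authorization: Token ..." header are assumptions based on Label Studio's legacy access tokens; newer releases may expect a different token scheme. The helper names are hypothetical and are not part of the Python SDK.

```python
import json
import urllib.request


def build_projects_request(base_url, api_token):
    """Build (but do not send) a GET request that lists projects.

    Assumption: the instance accepts a legacy access token in an
    "Authorization: Token <token>" header.
    """
    url = base_url.rstrip("/") + "/api/projects/"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Token {api_token}"},
        method="GET",
    )


def list_projects(base_url, api_token):
    """Send the request and return the decoded JSON response."""
    request = build_projects_request(base_url, api_token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling list_projects("http://localhost:8080", "<your token>") only works while Label Studio is running; the token is normally available from the account settings page in the UI.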
How to start using Label Studio?
- First, make sure that the libpq-dev and python3-dev dependencies are installed on your computer.
- Then install Label Studio with the command: pip install label-studio
- Start Label Studio from the terminal or command line with the command: label-studio start
- Open the Label Studio UI at http://localhost:8080
- Sign up with an email address and a password of your choice.
- Click Create to create a project and start labeling data.
- Name the project. Optionally, enter a project description and select a color.
- Click Data Import and upload the data file you want to use. If you want to use data from a local directory, cloud storage, or a database, you can skip this step for now.
- Click Labeling Setup, select a template, and customize the label names for your use case.
- Click Save to save your project.
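To make the import and labeling setup steps more concrete, here is a minimal Python sketch that prepares both inputs outside the UI: a JSON task file for upload and a labeling configuration for a text classification project. The sentiment labels, the field name "text", and the helper names are illustrative assumptions, not values prescribed by Label Studio.

```python
import json
import xml.etree.ElementTree as ET


def build_label_config(labels):
    """Return a labeling config (XML) for single-choice text classification."""
    view = ET.Element("View")
    # "$text" refers to the "text" field inside each task's "data" object.
    ET.SubElement(view, "Text", name="text", value="$text")
    choices = ET.SubElement(
        view, "Choices", name="sentiment", toName="text", choice="single"
    )
    for label in labels:
        ET.SubElement(choices, "Choice", value=label)
    return ET.tostring(view, encoding="unicode")


def build_tasks(texts):
    """Wrap raw strings as tasks, one {"data": {...}} object per item."""
    return [{"data": {"text": text}} for text in texts]


def write_tasks(path, texts):
    """Write the tasks to a JSON file that can be uploaded during import."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(build_tasks(texts), handle, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    # Hypothetical example labels and texts.
    print(build_label_config(["Positive", "Negative", "Neutral"]))
    write_tasks("tasks.json", ["Great product!", "Arrived broken."])
```

The printed XML can be pasted into the code view of the labeling setup in place of a template, and tasks.json can be uploaded in the data import step.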