UniRig – A General Automatic Skeleton Rigging Framework Open-Sourced by Tsinghua in Collaboration with VAST

What is UniRig?

UniRig is an automatic skeletal rigging framework jointly introduced by the Department of Computer Science at Tsinghua University and VAST. It is designed to handle complex and diverse 3D models. Using a large autoregressive model and a bone-point cross-attention mechanism, it generates high-quality skeletal structures and skinning weights. The framework is trained and evaluated on the Rig-XL dataset, which contains over 14,000 3D models across various categories. According to its authors, UniRig significantly outperforms existing academic and commercial methods in both rigging accuracy and motion accuracy. It applies to a wide range of object categories, from anime characters to complex organic and inorganic structures, which can greatly improve the efficiency of animation production.

Main Functions of UniRig

Automatic Skeleton Generation: Generate a topologically valid skeleton tree for a variety of 3D models, such as humans, animals, and fictional characters.
Skin Weight Prediction: Predict how strongly each bone influences each mesh vertex, so that the mesh deforms naturally under skeletal animation.
Support for Diverse Models: Compatible with various types of 3D models, including complex organic and inorganic structures.
Efficient Animation Production: Improve the efficiency of animation production by reducing manual operations and workload.
Dynamic Effect Support: Generate skeletal attributes that support physics simulation (e.g., spring bones).
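To make the role of skin weights concrete, here is a minimal sketch of linear blend skinning, a common way such weights are applied when a skeleton moves. The source does not state which skinning model UniRig targets, so this is an illustrative assumption. The example is 2D for brevity, and the helper names (`rotation_about`, `skin_vertex`) are hypothetical, not part of UniRig.

```python
import math

def identity(point):
    """A bone that does not move: returns the point unchanged."""
    return (point[0], point[1])

def rotation_about(pivot, angle_rad):
    """Build a bone transform that rotates 2D points about `pivot`."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    px, py = pivot

    def apply(point):
        x, y = point[0] - px, point[1] - py
        return (px + c * x - s * y, py + s * x + c * y)

    return apply

def skin_vertex(vertex, weights, bone_transforms):
    """Linear blend skinning: weighted sum of the vertex as moved by each bone."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("skin weights for a vertex must sum to 1")
    x = y = 0.0
    for weight, transform in zip(weights, bone_transforms):
        tx, ty = transform(vertex)
        x += weight * tx
        y += weight * ty
    return (x, y)

# Two bones: the upper arm stays still, the forearm bends 90 degrees
# about an elbow located at (1, 0).
bones = [identity, rotation_about((1.0, 0.0), math.pi / 2)]

# A vertex fully bound to the forearm follows it rigidly: about (1.0, 1.0).
rigid = skin_vertex((2.0, 0.0), [0.0, 1.0], bones)

# A vertex shared equally by both bones lands halfway: about (1.5, 0.5).
blended = skin_vertex((2.0, 0.0), [0.5, 0.5], bones)
```

Poorly chosen weights would make vertices near the elbow follow the wrong bone, which is why predicting them well matters for natural deformation.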

Technical Principles of UniRig

Skeleton Tree Tokenization: Convert the skeleton tree structure into serialized tokens to facilitate efficient processing by autoregressive models. Use special tokens (e.g., `<type>`) to represent bone types (e.g., spring bones, template bones), and employ a Depth-First Search (DFS) algorithm to extract linear bone chains, compactly representing the skeleton structure. Skeleton tree tokenization reduces sequence length and improves the training and inference efficiency of the model.
Autoregressive Model: A Transformer-based autoregressive model (Skeleton Tree GPT) predicts the skeleton tree. It generates tokens one at a time to build the tree, which helps ensure that the resulting skeletal structure is topologically valid. The model takes as input a point cloud sampled from the 3D mesh, plus optional category information, and outputs a sequence of tokens representing the skeleton tree.
Bone-Point Cross-Attention Mechanism: A point cloud encoder and a bone encoder extract features from the point cloud and the skeleton tree, respectively. A cross-attention mechanism then combines these features to predict skinning weights.
Large-Scale Dataset: To train and evaluate UniRig, researchers constructed the Rig-XL dataset, which contains over 14,000 3D models across various categories. The diversity and scale of the dataset enable UniRig to learn different types of skeletal structures and skinning weights, improving the model’s generalization ability.
Physics Simulation-Assisted Training: During training, physics simulation is used to assess how plausible the predicted skinning weights and skeletal attributes are, based on how the simulated bones move under physical forces (e.g., gravity, elastic forces). This indirect supervision guides the model toward realistic skinning weights, improving the realism of the resulting animations.
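The skeleton tree tokenization step described above can be sketched as follows. This is a simplified illustration, not UniRig's actual tokenizer: the token names `<bos>`, `<branch>`, and `<eos>` are assumptions, bone-type tokens such as `<type>` are omitted, and joint names stand in for whatever per-joint values a real tokenizer would emit. The sketch uses DFS to cut the tree into linear bone chains, writes each chain once, and then rebuilds the parent links from the tokens to show that the sequence still encodes a valid tree.

```python
def extract_chains(children, root):
    """Split a skeleton tree into linear bone chains using an iterative DFS.

    Returns a list of (parent_joint, chain) pairs. A chain runs until it
    reaches a leaf or a joint with more than one child.
    """
    chains = []
    stack = [(None, root)]
    while stack:
        parent, joint = stack.pop()
        chain = [joint]
        # Follow the chain while the current joint has exactly one child.
        while len(children.get(chain[-1], [])) == 1:
            chain.append(children[chain[-1]][0])
        chains.append((parent, chain))
        tail = chain[-1]
        # Each child of a branching joint starts its own chain.
        # Reversed so that the first child is visited first.
        for child in reversed(children.get(tail, [])):
            stack.append((tail, child))
    return chains

def tokenize(children, root):
    """Serialize the tree: one run of joints per chain, with hypothetical
    `<branch>` tokens naming the joint each later chain attaches to."""
    tokens = ["<bos>"]
    for parent, chain in extract_chains(children, root):
        if parent is not None:
            tokens += ["<branch>", parent]
        tokens += chain
    tokens.append("<eos>")
    return tokens

def detokenize(tokens):
    """Rebuild a joint -> parent map from the token sequence."""
    parents = {}
    previous = None
    i = 1  # skip <bos>
    while tokens[i] != "<eos>":
        if tokens[i] == "<branch>":
            previous = tokens[i + 1]  # attach the next chain to this joint
            i += 2
            continue
        parents[tokens[i]] = previous
        previous = tokens[i]
        i += 1
    return parents

# A toy skeleton given as a joint -> children map.
skeleton = {
    "hips": ["spine", "leg_l", "leg_r"],
    "spine": ["head"],
    "leg_l": ["foot_l"],
    "leg_r": ["foot_r"],
}
tokens = tokenize(skeleton, "hips")
parents = detokenize(tokens)
```

Within a chain, each joint's parent is implied by its position, so only branch points need an explicit parent reference. The saving grows with chain length, for example along spines, tails, or fingers.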

Project Links for UniRig

Project Website: https://zjp-shadow.github.io/works/UniRig/
GitHub Repository: https://github.com/VAST-AI-Research/UniRig
Hugging Face Model Hub: https://huggingface.co/VAST-AI/UniRig
Technical Paper (PDF): https://zjp-shadow.github.io/works/UniRig/static/supp/UniRig.pdf

Application Scenarios of UniRig

Animation Production: Quickly generate skeletons and skinning weights, reducing manual work and speeding up animation production.
Virtual Characters: Generate natural, smooth rigs for virtual characters (such as VTubers), supporting real-time animation.
Game Development: Rapidly rig characters and objects, with support for dynamic effects that enhance game visuals.
3D Content Creation: Rig a variety of 3D models in fields such as architectural design and industrial design.
Education: Serve as a teaching tool that helps learners quickly grasp the fundamental concepts of skeletal animation.