TaoAvatar – Alibaba’s Real-time High-definition 3D Full-body Conversational Digital Human Technology
What is TaoAvatar?
TaoAvatar is a high-fidelity, lightweight technology for 3D full-body conversational avatars, developed by a research team at Alibaba Group. Built on 3D Gaussian Splatting (3DGS), it generates photorealistic full-body avatars that render at high resolution while keeping storage requirements low. TaoAvatar runs in real time at up to 90 FPS on a range of mobile and AR devices, and it can be driven by multiple input signals (voice, facial expressions, hand gestures, and body pose) so that lip movements, expressions, and gestures stay naturally synchronized.

The main features of TaoAvatar
- High-Fidelity Full-Body Dynamic Virtual Avatar Generation: Capable of generating realistic, topologically consistent 3D full-body virtual avatars from multi-view image sequences, with support for fine control over pose, gestures, and facial expressions.
- Real-Time Rendering with Low Storage Requirements: Runs in real time at up to 90 FPS on a variety of mobile and AR devices, rendering at high resolution while keeping storage demands low.
- Multi-Signal Driving: Can be driven by several signals, including voice, facial expressions, gestures, and body pose, keeping lip movements, expressions, and actions naturally synchronized.
- Lightweight Architecture: Significantly improves runtime efficiency by “baking” complex non-rigid deformations into a lightweight MLP network, with blend shapes compensating for the fine details the small network would otherwise lose.
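The lightweight architecture described in the last item can be pictured with a toy example. The sketch below is not TaoAvatar's actual network: the layer sizes, the random weights, and the names `blend_weights` and `deform` are all hypothetical, chosen only to show how a small MLP could map a pose vector to blend shape coefficients that add per-Gaussian offsets to a base shape.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes, chosen for illustration only (not taken from TaoAvatar).
NUM_GAUSSIANS = 4
POSE_DIM = 6
NUM_BLEND_SHAPES = 3
HIDDEN = 8

# Hypothetical weights of a one-hidden-layer MLP (random stand-ins for trained values).
W1 = rng.normal(scale=0.1, size=(POSE_DIM, HIDDEN))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(scale=0.1, size=(HIDDEN, NUM_BLEND_SHAPES))
b2 = np.zeros(NUM_BLEND_SHAPES)

# Base Gaussian centers and one offset field per blend shape.
base_positions = rng.normal(size=(NUM_GAUSSIANS, 3))
blend_shapes = rng.normal(scale=0.01, size=(NUM_BLEND_SHAPES, NUM_GAUSSIANS, 3))


def blend_weights(pose):
    """Map a pose vector to blend shape coefficients with a small ReLU MLP."""
    hidden = np.maximum(pose @ W1 + b1, 0.0)
    return hidden @ W2 + b2


def deform(pose):
    """Return Gaussian centers after adding the weighted sum of blend shape offsets."""
    weights = blend_weights(pose)                           # shape (K,)
    offsets = np.tensordot(weights, blend_shapes, axes=1)   # shape (N, 3)
    return base_positions + offsets


deformed = deform(rng.normal(size=POSE_DIM))
```

Evaluating such a network costs two small matrix products per frame, which is the kind of workload that fits a mobile or headset budget; a real system would obtain the weights and offset fields by training rather than sampling them at random.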
The Technical Principles of TaoAvatar
- 3D Gaussian Splatting (3DGS) Technology: 3DGS represents a scene as a set of 3D Gaussians that are projected ("splatted") onto the 2D image plane for rendering. Each Gaussian is described by parameters such as position, covariance, color, and opacity. Structure from Motion (SfM) is first used to estimate a 3D point cloud from multi-view images; each point then initializes one Gaussian, and the Gaussian parameters are optimized with gradient-based methods such as stochastic gradient descent so that the rendered images match the captured ones.
- Pose-Dependent Non-Rigid Deformation Handling: TaoAvatar decomposes complex non-rigid deformations into rigid deformations and shape deformations. Using knowledge distillation, the shape deformation is “baked” into a lightweight MLP network. This allows pose-dependent non-rigid deformations to be handled efficiently while preserving the realism and controllability of the avatar.
- Learnable Gaussian Blend Shapes: To further enhance appearance detail, TaoAvatar introduces learnable Gaussian blend shapes. A neural network is trained to predict the blend shape parameters for different poses and expressions, and these parameters are then applied to the avatar's Gaussians. This helps the avatar maintain high fidelity across a wide range of poses and expressions.
- Real-Time Rendering and Optimization: TaoAvatar employs a variety of optimization techniques, such as GPU acceleration, reduction of unnecessary computations, and optimization of model structure and parameters, to achieve high-quality real-time rendering. On high-definition stereoscopic display devices like the Apple Vision Pro, it can maintain smooth operation at 90 frames per second.
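The 3DGS description in the list above (Gaussians with position, covariance, color, and opacity, projected onto the image plane) can be illustrated with a short NumPy sketch. This is a generic simplification of the splatting idea, not TaoAvatar's renderer: it assumes the Gaussian is already expressed in camera coordinates, uses a single hypothetical `focal` length, and the function names are made up for this example.

```python
import numpy as np


def covariance_from_scale_rotation(scale, quat):
    """Build a 3x3 covariance R S S^T R^T from axis scales and a quaternion (w, x, y, z)."""
    w, x, y, z = np.asarray(quat, dtype=float) / np.linalg.norm(quat)
    R = np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])
    S = np.diag(scale)
    return R @ S @ S.T @ R.T


def project_covariance(cov3d, mean_cam, focal):
    """Approximate the 2D image-plane covariance with the Jacobian of perspective projection.

    Assumes cov3d and mean_cam are already in camera space (a simplification).
    """
    x, y, z = mean_cam
    J = np.array([
        [focal / z, 0.0, -focal * x / z**2],
        [0.0, focal / z, -focal * y / z**2],
    ])
    return J @ cov3d @ J.T


def composite(colors, alphas):
    """Front-to-back alpha compositing of depth-sorted contributions at one pixel."""
    pixel = np.zeros(3)
    transmittance = 1.0
    for color, alpha in zip(colors, alphas):
        pixel += transmittance * alpha * np.asarray(color, dtype=float)
        transmittance *= 1.0 - alpha
    return pixel, transmittance


cov3d = covariance_from_scale_rotation([1.0, 2.0, 3.0], [1.0, 0.0, 0.0, 0.0])
cov2d = project_covariance(cov3d, mean_cam=[0.0, 0.0, 2.0], focal=2.0)
pixel, remaining = composite([[1, 0, 0], [0, 0, 1]], [0.5, 0.5])
```

In this sketch each pixel's color is a sum over the depth-sorted Gaussians covering it, so per-frame cost grows with the number of Gaussians and their screen footprint; this is one reason a compact representation helps on mobile and headset hardware.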
Project links for TaoAvatar
- Project official website: https://pixelai-team.github.io/TaoAvatar/
- arXiv technical paper: https://arxiv.org/pdf/2503.17032
Application scenarios of TaoAvatar
- E-commerce live streaming: Create realistic virtual hosts to enhance user experience and reduce labor costs.
- Holographic communication: Generate realistic avatars for remote communication to enhance the sense of immersion.
- Virtual meetings: Participants can communicate through personalized avatars, making meetings more interactive.
- Online education: Use virtual humans to teach online courses and make lessons more engaging.
- Virtual entertainment: Create personalized virtual characters in games and virtual reality applications.