AnimeGamer – An anime life simulation system jointly launched by Tencent and City University of Hong Kong
What is AnimeGamer?
AnimeGamer is an infinite anime life simulation system jointly developed by Tencent PCG and City University of Hong Kong. Built on a Multimodal Large Language Model (MLLM), it lets players step into a dynamic game world as anime characters and steer the game with open-ended language instructions. For example, a player can control Sosuke from Ponyo on the Cliff by the Sea and interact with the game world. The system generates contextually consistent animation shots (videos) and updates character attributes such as stamina, social, and entertainment values in real time. Compared with traditional approaches, AnimeGamer performs better on character consistency, semantic coherence, and motion control, which makes for a more immersive anime gaming experience.

Main functions of AnimeGamer
- Role-Playing and Interaction: Players assume the roles of anime characters, such as Sosuke from Ponyo on the Cliff by the Sea, to interact with the game world, enabling characters from different anime series to meet and engage with one another.
- Dynamic Animation Generation: Generates dynamic animation shots (videos) in real time from player instructions, showing character movements and scene changes. The shots stay consistent with the preceding context while remaining dynamic.
- Character Status Update: Dynamically updates the character’s stamina, social, and entertainment values based on their actions and interactions, reflecting the character’s state changes in the game world.
- Multi-turn Dialogue Interaction: Players can hold multi-turn conversations with the game in natural language. The model generates each new game state from the historical context, so the story stays consistent from turn to turn.
- Customizable Game Content: Allows players to customize their preferred characters and scenes.
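The character status update and multi-turn interaction described above can be pictured with a small sketch. It is illustrative only and is not AnimeGamer's code: the `CharacterState` and `GameSession` names, the 0 to 100 value range, the starting value of 50, and the hand-written deltas are all assumptions. In the actual system the MLLM predicts the updates.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List, Tuple

# Hypothetical value range; the article does not state one.
MIN_VALUE, MAX_VALUE = 0, 100


@dataclass(frozen=True)
class CharacterState:
    """The three attributes named in the article, with assumed defaults."""
    stamina: int = 50
    social: int = 50
    entertainment: int = 50

    def apply(self, delta: Dict[str, int]) -> "CharacterState":
        """Return a new state with each delta added and clamped to the range."""
        updated = {
            name: max(MIN_VALUE, min(MAX_VALUE, getattr(self, name) + change))
            for name, change in delta.items()
        }
        return replace(self, **updated)


@dataclass
class GameSession:
    """Keeps the current state plus the per-turn history used as context."""
    state: CharacterState = field(default_factory=CharacterState)
    history: List[Tuple[str, CharacterState]] = field(default_factory=list)

    def step(self, instruction: str, predicted_delta: Dict[str, int]) -> CharacterState:
        # Stand-in for the model: the caller supplies the update by hand.
        self.state = self.state.apply(predicted_delta)
        self.history.append((instruction, self.state))
        return self.state


session = GameSession()
session.step("Sosuke runs along the beach", {"stamina": -10, "entertainment": 5})
session.step("Sosuke chats with a friend", {"social": 15})
print(session.state)
```

Keeping the state immutable and appending every turn to `history` means earlier turns remain available as context, mirroring the way the article describes game states being generated from historical context.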
Technical principles of AnimeGamer
- Multimodal Large Language Model: AnimeGamer is based on a Multimodal Large Language Model (MLLM), capable of understanding and generating multimodal data that includes both text and visual information.
- Action-Aware Multimodal Representation: The game decomposes animation shots into three components: visual references, action descriptions, and action intensity. These are integrated into a multimodal representation using an encoder. The video diffusion model then decodes this representation into high-quality dynamic videos, ensuring that the generated animation shots maintain contextual consistency and dynamism.
- Video Diffusion Model: The video diffusion model (such as CogVideoX) serves as the decoder for animation shots, transforming the multimodal representation into dynamic videos. By introducing action intensity as an additional condition, the model controls the magnitude of actions in the generated videos, making the animations more natural and realistic.
- Contextual Consistency: By inputting the multimodal representations of historical animation shots as context, the model can predict subsequent game states, ensuring that the generated animation shots remain consistent within the context. This is crucial for maintaining the coherence and immersion of the game.
- Character State Management: The game uses the MLLM to predict updates to characters’ stamina, social, and entertainment values. These updates reflect the characters’ behaviors and interactions within the game world, which makes the game feel more realistic and interactive.
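The data flow in the principles above (action-aware representation, context from past shots, then decoding to video) can be sketched as follows. This is a toy outline under stated assumptions, not the project's implementation: `ShotRepresentation`, `generate_next_shot`, `toy_predictor`, and `toy_decoder` are hypothetical names, the context window of four shots is arbitrary, and the two toy functions merely stand in for the MLLM and the video diffusion decoder.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class ShotRepresentation:
    """The three components the article lists for one animation shot."""
    visual_reference: List[float]  # placeholder for an encoded reference image
    action_description: str        # short text describing the action
    action_intensity: float        # magnitude of motion, assumed to lie in [0, 1]


Predictor = Callable[[str, List[ShotRepresentation]], ShotRepresentation]
Decoder = Callable[[ShotRepresentation], str]


def generate_next_shot(
    instruction: str,
    history: List[ShotRepresentation],
    predictor: Predictor,
    decoder: Decoder,
    max_context: int = 4,  # hypothetical context window size
) -> Tuple[ShotRepresentation, str]:
    """Predict the next representation from recent shots, then decode it."""
    context = history[-max_context:]
    representation = predictor(instruction, context)
    return representation, decoder(representation)


def toy_predictor(instruction: str, context: List[ShotRepresentation]) -> ShotRepresentation:
    # Stand-in for the MLLM: reuse the last visual reference for consistency
    # and guess the intensity from a keyword.
    reference = context[-1].visual_reference if context else [0.0, 0.0]
    intensity = 0.8 if "run" in instruction.lower() else 0.2
    return ShotRepresentation(reference, instruction, intensity)


def toy_decoder(rep: ShotRepresentation) -> str:
    # Stand-in for the video diffusion decoder: returns a label, not a video.
    return f"<video: {rep.action_description} | intensity={rep.action_intensity:.1f}>"


history: List[ShotRepresentation] = []
for instruction in ["Sosuke walks to the shore", "Sosuke runs home"]:
    rep, video = generate_next_shot(instruction, history, toy_predictor, toy_decoder)
    history.append(rep)
    print(video)
```

Passing the predictor and decoder in as plain callables keeps the two stages separate, which matches the article's split between the MLLM that predicts representations and the diffusion model that decodes them.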
Project links for AnimeGamer
- Project official website: https://howe125.github.io/AnimeGamer.github.io/
- GitHub repository: https://github.com/TencentARC/AnimeGamer
- Hugging Face model hub: https://huggingface.co/TencentARC/AnimeGamer
- arXiv technical paper: https://arxiv.org/pdf/2504.01014
Application scenarios of AnimeGamer
- Personalized Entertainment: Players can select their favorite anime characters and scenes and play through their own adventure stories using natural-language instructions.
- Inspiration Generation: Gives creators ideas for character interactions and new plots.
- Educational Assistance: Helps students practice language expression and logical thinking.
- Social Interaction: Players can create and share anime adventure stories together with their friends.
- Game Development: Helps developers generate game content quickly and reduce development costs.