BizGen – An AI Infographic Generation Tool Jointly Launched by Tsinghua University and Microsoft
What is BizGen?
BizGen is an AI-powered infographic generation tool jointly launched by Tsinghua University and Microsoft Research. It specializes in article-level visual text rendering: it converts long articles into professional-grade infographics and slides, and targets the blurry text and messy layouts that existing tools tend to produce on long inputs. It is trained on a large dataset, Infographics-650K, and uses a “layout-guided cross-attention mechanism” that breaks a long text into smaller region-level instructions and injects each one into its own region of the image.

Main features of BizGen
- High-Quality Content Generation: Automatically generates professional-quality infographics and slides from article-length user input, keeping text legible and layouts orderly even when the source text is long.
- Multi-Language and Style Support: Supports ten different languages and can generate infographics in a variety of styles to meet diverse needs.
- Multi-Layer Transparent Infographics: Excels in creating multi-layer transparent infographics, offering more flexible and diverse information presentation.
- High Accuracy and Typographic Quality: Achieves significantly higher text spelling accuracy than the other models it was compared against, and participants in user studies preferred its typographic quality.
- Technical Foundation: Built on the Infographics-650K dataset and a “layout-guided cross-attention mechanism” that gives fine-grained control over each visual element and text region.
Technical principles of BizGen
- High-Quality Dataset: The BizGen team constructed Infographics-650K, a large-scale, high-quality business content dataset. It contains 650,000 business infographics and slides, each accompanied by detailed layout information and descriptions, which gives the model the material it needs to learn complex business designs.
- Layout-Guided Cross-Attention Mechanism: This mechanism decomposes a long article-level prompt into “micro-instructions”, one per region. Following a predefined ultra-dense layout, it injects each instruction into its corresponding region of the image. This gives fine-grained control over each visual element and text region and avoids the confusion and errors that arise when a single global prompt has to drive the whole image.
- Layout-Conditional Controlled Generation: At inference time, BizGen applies a layout-conditional guidance method (the paper calls it layout conditional CFG, i.e. classifier-free guidance). It works like a quality inspector that checks each sub-region during generation and refines it as needed, which reduces local flaws in the final output.
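To make the two inference-side ideas above more concrete, here is a minimal sketch in plain Python. It is not BizGen's code: the `Region` class, the function names, and the toy 2x2 latent grid are all hypothetical, and the per-region guidance step assumes the refinement resembles classifier-free guidance with a region-dependent scale. The first part restricts each image position to the text tokens of the layout region that contains it; the second part applies a separate guidance scale per position.

```python
import math
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical layout entry: a box on the latent grid plus its prompt tokens."""
    x0: int
    y0: int
    x1: int  # exclusive
    y1: int  # exclusive
    token_ids: list[int]  # indices into the shared text-token sequence


def build_attention_mask(height, width, regions, num_tokens):
    """Return mask[p][t], True when image position p may attend to text token t."""
    mask = [[False] * num_tokens for _ in range(height * width)]
    for region in regions:
        for y in range(region.y0, region.y1):
            for x in range(region.x0, region.x1):
                for t in region.token_ids:
                    mask[y * width + x][t] = True
    return mask


def masked_cross_attention(queries, keys, values, mask):
    """Scaled dot-product attention; disallowed (position, token) pairs get zero weight."""
    dim = len(keys[0])
    out_dim = len(values[0])
    outputs = []
    for q, allowed in zip(queries, mask):
        if not any(allowed):
            # Position outside every region: no text is injected here.
            outputs.append([0.0] * out_dim)
            continue
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) if ok else -math.inf
            for k, ok in zip(keys, allowed)
        ]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]  # exp(-inf) == 0.0
        total = sum(weights)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(out_dim)
        ])
    return outputs


def regional_guidance(cond, uncond, position_scales):
    """Assumed form of per-region guidance: uncond + scale * (cond - uncond)."""
    return [
        [u + s * (c - u) for c, u in zip(c_vec, u_vec)]
        for c_vec, u_vec, s in zip(cond, uncond, position_scales)
    ]


if __name__ == "__main__":
    # Toy layout: left column reads token 0, right column reads tokens 1 and 2.
    regions = [Region(0, 0, 1, 2, [0]), Region(1, 0, 2, 2, [1, 2])]
    mask = build_attention_mask(height=2, width=2, regions=regions, num_tokens=3)
    keys = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
    values = [[1.0, 0.0], [0.0, 1.0], [0.0, 3.0]]
    queries = [[1.0, 1.0]] * 4
    for row in masked_cross_attention(queries, keys, values, mask):
        print(row)
```

In this toy setup the left-column positions only ever see the first token's value, however the queries are chosen, which is the property that keeps one region's text from leaking into another. A real model would do the same masking on tensors inside its attention layers rather than with Python lists.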
Project links for BizGen
- Project official website: https://bizgen-msra.github.io/
- GitHub repository: https://github.com/1230young/bizgen
- HuggingFace model hub: https://huggingface.co/PYY2001/BizGen
- arXiv technical paper: https://arxiv.org/pdf/2503.20672
Application scenarios of BizGen
- Business Reporting: Quickly generate high-quality business reports and presentation slides.
- Product Showcase: Create attractive product promotional posters and slideshows.
- Academic Research: Generate academic reports and presentation slides.
- Social Media: Produce engaging social media content.
- Education Sector: Assist teachers in quickly creating teaching courseware.
- Advertising Design: Automatically generate beautifully designed advertisements that align with the theme.