FLUX.2 – Black Forest Labs’ Open-Source AI Image Generation and Editing Model
What is FLUX.2?
FLUX.2 is an AI image generation and editing model released by Black Forest Labs and designed for real-world creative workflows. It accepts up to 10 reference images at once and generates images at up to 4MP resolution, with fine detail and strong text rendering. FLUX.2 comes in several variants: the high-performance FLUX.2 [pro], the parameter-customizable FLUX.2 [flex], the open-weight FLUX.2 [dev], and the upcoming lightweight FLUX.2 [klein].
By combining a vision-language model with a flow transformer, FLUX.2 improves both real-world knowledge understanding and image generation quality. The open-weight release is aimed at open innovation and broad adoption of visual intelligence technology.

Key Features of FLUX.2
• Multi-Image Reference
Supports referencing up to 10 images simultaneously, maintaining consistency in characters, styles, and product appearance.
• High-Resolution Image Generation
Generates and edits images at up to 4MP, suitable for product photography, visualization, and photorealistic creative work.
• Complex Text Rendering
Handles complex typography, infographics, memes, and UI designs, and keeps small text legible.
• Enhanced Instruction Following
Improved ability to follow complex and structured instructions, including multi-part prompts and compositional constraints.
• Real-World Knowledge
Stronger performance in lighting, spatial logic, and scene coherence, producing images that better align with real-world physics and context.
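To make the 4MP limit from the feature list concrete, the following sketch computes the largest width and height for a given aspect ratio under a pixel budget. It assumes that 4MP means 4,000,000 pixels and that both sides should be multiples of 16; neither assumption comes from the FLUX.2 documentation, and the function name is hypothetical.

```python
import math


def fit_to_megapixels(aspect_w: int, aspect_h: int,
                      max_megapixels: float = 4.0,
                      multiple: int = 16) -> tuple[int, int]:
    """Largest (width, height) near the given aspect ratio under the pixel budget.

    Both sides are rounded down to a multiple of `multiple`. That alignment
    is an assumed requirement, not a documented FLUX.2 constraint.
    """
    max_pixels = max_megapixels * 1_000_000
    # Solve width * height <= max_pixels with width / height == aspect_w / aspect_h.
    height = math.sqrt(max_pixels * aspect_h / aspect_w)
    width = height * aspect_w / aspect_h
    # Rounding down keeps the result inside the budget.
    width = int(width) // multiple * multiple
    height = int(height) // multiple * multiple
    return width, height


for ratio in [(1, 1), (16, 9), (3, 2)]:
    w, h = fit_to_megapixels(*ratio)
    print(f"{ratio[0]}:{ratio[1]} -> {w}x{h} ({w * h / 1e6:.2f} MP)")
```

Because both sides are rounded down, the final aspect ratio can differ slightly from the requested one, but the pixel count never exceeds the budget.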
Technical Principles Behind FLUX.2
1. Latent Flow Matching Architecture
FLUX.2 adopts a latent flow matching framework: generation and editing happen in a compressed latent space rather than directly on pixels, which keeps the process efficient while maintaining coherence and consistency. This is particularly useful for multi-image reference and high-resolution tasks.
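As a rough illustration of how flow matching sampling works in general, the sketch below integrates a velocity field from noise (t = 1) to data (t = 0) with Euler steps. It is not FLUX.2's implementation: `toy_velocity` is a hypothetical stand-in for the trained transformer, the "latent" is a three-element list, and straight-line paths between data and noise are assumed.

```python
import random


def toy_velocity(x, t, target):
    """Stand-in for a trained network.

    For straight paths x_t = (1 - t) * data + t * noise and a 'dataset'
    containing only the point `target`, the exact velocity (noise - data)
    at position x and time t is (x - target) / t.
    """
    return [(xi - gi) / t for xi, gi in zip(x, target)]


def sample(target, num_steps=8, seed=0):
    """Integrate from pure noise at t = 1 down to t = 0 with Euler steps."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # initial noise latent
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = (num_steps - i) / num_steps  # t runs from 1 down to dt
        v = toy_velocity(x, t, target)
        # Move against the velocity because we integrate backwards in time.
        x = [xi - dt * vi for xi, vi in zip(x, v)]
    return x


latent = sample(target=[0.5, -1.0, 2.0])
print(latent)
```

In a real system the velocity would come from a network conditioned on the prompt and reference images, and the final latent would be decoded to pixels by the VAE.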
2. Coupling of Vision-Language Model and Flow Transformer
FLUX.2 integrates a 24B-parameter Mistral-3 vision-language model (VLM) with a flow transformer.
- The VLM provides rich real-world knowledge and semantic understanding, enabling the model to comprehend complex prompts and scene logic.
- The flow transformer focuses on spatial relationships, material attributes, and compositional reasoning in images.
This coupling significantly enhances FLUX.2’s ability to generate complex scenes and fine details, especially when handling multi-image references and text-heavy outputs.
3. Improved Variational Autoencoder (VAE)
FLUX.2 introduces a new VAE for latent representation, built on a retrained latent space.
A VAE has to balance three competing goals: learnability (how easily the generative model can learn the latent distribution), image quality after decoding, and compression rate.
The retrained VAE is designed to strike a better balance among the three, yielding better image quality and more efficient generation.
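The compression side of this trade-off is simple arithmetic, sketched below. The downsampling factor and channel counts are hypothetical placeholders chosen for illustration; they are not FLUX.2's published figures, and `latent_stats` is not part of any FLUX.2 code.

```python
def latent_stats(height, width, downsample=8, latent_channels=16, rgb_channels=3):
    """Latent tensor shape and compression ratio for an image of the given size.

    `downsample` and `latent_channels` are illustrative values only.
    """
    if height % downsample or width % downsample:
        raise ValueError("image sides must be divisible by the downsampling factor")
    latent_h, latent_w = height // downsample, width // downsample
    pixel_values = height * width * rgb_channels
    latent_values = latent_h * latent_w * latent_channels
    return {
        "latent_shape": (latent_channels, latent_h, latent_w),
        "compression_ratio": pixel_values / latent_values,
    }


# Doubling the channel count keeps more information per latent position
# but halves the compression ratio.
print(latent_stats(2048, 2048, latent_channels=16))
print(latent_stats(2048, 2048, latent_channels=32))
```

A higher compression ratio means fewer values for the flow transformer to process, while more latent capacity generally makes faithful reconstruction easier; the VAE design has to pick a point between the two.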
4. Multi-Image Reference and Style Consistency
FLUX.2 can condition a single generation on up to 10 reference images, which helps keep style, character identity, and product details consistent across outputs.
This makes it well suited to workflows that require brand consistency or scene coherence, such as advertising, product visualization, and film post-production.
FLUX.2 Project Links
- Official Website: https://bfl.ai/blog/flux-2
- Hugging Face Model Collection: https://huggingface.co/collections/black-forest-labs/flux2
How to Use FLUX.2
• FLUX.2 [pro]
Use via BFL Playground or BFL API.
Suitable for production use, with no local deployment required.
• FLUX.2 [flex]
Available via bfl.ai/play or BFL API.
Exposes adjustable generation parameters, which suits developers who need fine-grained control.
• FLUX.2 [dev]
Open weights are available on Hugging Face.
Download the weights and run them locally with the reference inference code for custom development.
• FLUX.2 [klein] (coming soon)
A lightweight open-source version.
Developers can sign up for the beta to experiment with it locally:
https://docs.google.com/forms/d/e/1FAIpQLScOIvOkHN2fPbD8cFsAf7MQJfqu2bnEmoNb0x1k3ismTLLm-Q/viewform
• FLUX.2 – VAE
A new VAE for latent representation, used across the FLUX.2 model family.
Available through the Hugging Face repository.
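For the locally runnable variants, one possible way to fetch the weights is the `snapshot_download` function from the `huggingface_hub` package, sketched below. The repository id is an assumption based on the collection name; check the Hugging Face collection linked above for the exact id. The repository may also require accepting a license and logging in before the download succeeds.

```python
from pathlib import Path

# Assumed repository id, not confirmed by this article.
ASSUMED_REPO_ID = "black-forest-labs/FLUX.2-dev"


def download_weights(repo_id: str = ASSUMED_REPO_ID,
                     target_dir: str = "flux2-dev") -> Path:
    """Download a model repository from Hugging Face into `target_dir`."""
    # Imported here so the module loads even if huggingface_hub is not installed.
    from huggingface_hub import snapshot_download

    local_path = snapshot_download(repo_id=repo_id, local_dir=target_dir)
    return Path(local_path)


# Usage (needs network access and enough free disk space for the weights):
#     weights_dir = download_weights()
#     print(weights_dir)
```

The download is left as a commented usage line because it needs network access and transfers a large amount of data.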
Application Scenarios of FLUX.2
• Advertising Production
Quickly generates high-quality product ads.
Multi-image reference keeps brand styling consistent, and complex prompts make it possible to explore creative visual concepts.
• UI/UX Design
Supports complex layout and text rendering, useful for UI prototypes and design drafts.
Helps designers rapidly iterate on creative ideas.
• Brand Marketing
Generates and edits high-resolution visual content, maintaining consistent brand identity across different media.
• Film Visual Effects
Creates realistic scenes, props, and characters.
Multi-image reference ensures coherent visual style, reducing time and cost in VFX workflows.
• Animation Production
Generates high-quality animated frames and backgrounds, accelerating the production process while preserving stylistic uniformity.