LBM – An AI Image Conversion Framework for Controllable Shadow Generation
What is LBM?
LBM (Latent Bridge Matching) is an image-to-image translation framework introduced by the Jasper Research team. It translates images quickly and efficiently by applying bridge matching in latent space. LBM supports single-step inference and applies to a range of image translation tasks, including object removal, relighting, and depth and normal estimation.
The model uses a Brownian Bridge in latent space to construct stochastic paths between source and target images, which increases sample diversity. Its conditional framework enables controllable shadow generation and image relighting. LBM matches or exceeds state-of-the-art performance across various tasks, demonstrating strong generalization and efficiency.
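For reference, a Brownian bridge between a source latent and a target latent is commonly written as shown here. The symbols (x_0 for the source latent, x_1 for the target latent, sigma for the noise scale, t for the time along the path) are notation assumed for illustration; the paper linked under Project Links gives the exact formulation LBM uses.

```latex
x_t = (1 - t)\,x_0 + t\,x_1 + \sigma\sqrt{t(1 - t)}\,\epsilon,
\qquad \epsilon \sim \mathcal{N}(0, I), \quad t \in [0, 1]

\mathrm{d}x_t = \frac{x_1 - x_t}{1 - t}\,\mathrm{d}t + \sigma\,\mathrm{d}B_t
```

The first line gives the distribution of an intermediate point: its mean interpolates linearly between the two endpoints, and its variance vanishes at t = 0 and t = 1, so every path starts at the source and ends at the target. The second line is the stochastic differential equation whose drift pulls the path toward the target.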
Key Features of LBM
- Object Removal: Removes specified objects and associated shadows from images while preserving the background seamlessly.
- Image Relighting: Relights foreground elements based on given background or lighting conditions, removing existing shadows and reflections.
- Image Restoration: Restores degraded images to their original quality by translating them into cleaner versions.
- Depth/Normal Map Generation: Converts input images into depth maps or surface normal maps for use in 3D reconstruction tasks.
- Controllable Shadow Generation: Creates realistic shadows based on the position, color, and intensity of light sources to enhance visual realism.
Technical Foundations of LBM
- Latent Space Encoding: Both source and target images are encoded into a low-dimensional latent space, reducing computational cost and improving scalability.
- Brownian Bridge: Constructs a stochastic path in latent space that connects the latent representations of the source and target images. The randomness along this path allows for diverse outputs.
- Stochastic Differential Equations (SDEs): Translates images by solving a stochastic differential equation that carries the latent representation along the Brownian path from source to target.
- Conditional Framework: Supports additional control inputs such as lighting maps, enabling tasks like controllable relighting and shadow synthesis.
- Pixel-Level Loss: Trains the model with losses computed in pixel space, such as the perceptual metric LPIPS, to keep generated images visually consistent with their targets.
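The Brownian Bridge and SDE items above can be illustrated with a small NumPy sketch. It is not the LBM repository's code or API: the function names, the `sigma` value, and the latent shapes are hypothetical, and random arrays stand in for encoded images. In a trained system a neural network would predict the drift; here the exact bridge drift is used as an oracle so the mechanics are visible.

```python
import numpy as np

def sample_bridge_point(z0, z1, t, sigma, rng):
    """Sample z_t from a Brownian bridge pinned at z0 (t=0) and z1 (t=1)."""
    noise = rng.standard_normal(z0.shape)
    mean = (1.0 - t) * z0 + t * z1             # linear interpolation of endpoints
    std = sigma * np.sqrt(t * (1.0 - t))       # zero at both endpoints
    return mean + std * noise

def bridge_drift_target(z_t, z1, t):
    """Drift of the bridge SDE toward z1; a regression target for training."""
    return (z1 - z_t) / (1.0 - t)

def euler_maruyama(z0, drift_fn, sigma, num_steps, rng):
    """Integrate dz = drift(z, t) dt + sigma dW from t=0 to t=1."""
    z = z0.copy()
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        z = z + drift_fn(z, t) * dt
        if k < num_steps - 1:                  # no noise is added on the final step
            z = z + sigma * np.sqrt(dt) * rng.standard_normal(z.shape)
    return z

# Stand-in latents (assumed shape: channels x height x width).
rng = np.random.default_rng(0)
z0 = rng.standard_normal((4, 8, 8))            # "source" latent
z1 = rng.standard_normal((4, 8, 8))            # "target" latent

# Training-style sample: a noisy point on the bridge and its drift target.
z_t = sample_bridge_point(z0, z1, t=0.5, sigma=0.1, rng=rng)
target = bridge_drift_target(z_t, z1, t=0.5)

# Inference-style integration with the oracle drift, in a single step.
oracle = lambda z, t: bridge_drift_target(z, z1, t)
z_hat = euler_maruyama(z0, oracle, sigma=0.1, num_steps=1, rng=rng)
print(np.allclose(z_hat, z1))
```

With the oracle drift, a single Euler step from t = 0 lands on the target latent, which gives some intuition for why single-step inference is plausible once a network approximates that drift well. A real model does not see `z1` at inference time, so its output is only an estimate of the target.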
Project Links
- Official Website: https://gojasper.github.io/latent-bridge-matching/
- GitHub Repository: https://github.com/gojasper/LBM
- arXiv Paper: https://arxiv.org/pdf/2503.07535
- Online Demo: https://huggingface.co/spaces/jasperai/LBM
Application Scenarios for LBM
- General Users: Everyday photo editing tasks such as removing unwanted objects, restoring old photos, or adjusting lighting conditions.
- Photography Enthusiasts: Post-processing to enhance realism by adding or modifying shadows and lighting effects.
- Graphic Designers: Creative workflows that involve generating depth/normal maps and making rapid image corrections and adjustments.
- Video Editors: Frame-by-frame video enhancement, including object lighting and shadow adjustments.
- 3D Modelers: Generating depth or surface normal maps from photos to assist in 3D modeling and reconstruction tasks.