LBM – An Image-to-Image Translation Framework with Controllable Shadow Generation

What is LBM?

LBM (Latent Bridge Matching) is an image-to-image translation framework introduced by the Jasper Research team. It applies bridge matching in latent space to transform images quickly and efficiently. LBM supports single-step inference and is suited to a wide range of image translation tasks, including object removal, relighting, and depth and normal estimation.

The model uses a Brownian bridge in latent space to construct stochastic paths between source and target images, which increases sample diversity. A conditional variant of the framework enables controllable shadow generation and image relighting. The authors report results that match or exceed the state of the art across several tasks, which suggests that the approach generalizes well while remaining efficient.
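
For readers who want the stochastic path in symbols, the standard form of a Brownian bridge pinned at two endpoints is given below. The notation is an assumption made here for illustration (z_0 for the source latent, z_1 for the target latent, sigma for the noise scale); the paper's own notation and parameterization may differ.

```latex
% Marginal of a Brownian bridge between z_0 (at t = 0) and z_1 (at t = 1)
z_t = (1 - t)\, z_0 + t\, z_1 + \sigma \sqrt{t (1 - t)}\; \epsilon,
\qquad \epsilon \sim \mathcal{N}(0, I), \quad t \in [0, 1]

% Equivalent SDE, conditioned on ending at z_1
\mathrm{d}z_t = \frac{z_1 - z_t}{1 - t}\, \mathrm{d}t + \sigma\, \mathrm{d}B_t
```

The noise term vanishes at t = 0 and t = 1, so every path starts exactly at the source latent and ends exactly at the target latent; the randomness only affects the route in between.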

Key Features of LBM

Object Removal: Removes specified objects and associated shadows from images while preserving the background seamlessly.
Image Relighting: Relights foreground elements based on given background or lighting conditions, removing existing shadows and reflections.
Image Restoration: Restores degraded images to their original quality by translating them into cleaner versions.
Depth/Normal Map Generation: Converts input images into depth maps or surface normal maps for use in 3D reconstruction tasks.
Controllable Shadow Generation: Creates realistic shadows based on the position, color, and intensity of light sources to enhance visual realism.

Technical Foundations of LBM

Latent Space Encoding: Both source and target images are encoded into a low-dimensional latent space, reducing computational cost and improving scalability.
Brownian Bridge: Constructs a stochastic path in latent space, a Brownian bridge, that connects the latent representations of the source and target images. The randomness along this path allows for diverse outputs.
Stochastic Differential Equations (SDEs): Translates images by solving a stochastic differential equation whose drift is predicted by the network, which moves the source latent along the bridge toward the target latent.
Conditional Framework: Supports additional control inputs such as lighting maps, enabling tasks like controllable relighting and shadow synthesis.
Pixel-Level Loss: Adds a loss computed in pixel space, such as LPIPS (a learned perceptual similarity metric), so that generated images stay visually consistent with the target images.
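
The bridge construction described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the code from the LBM repository: the function names, the latent shape, and the values of `t` and `sigma` are all hypothetical, and random arrays stand in for latents that a real encoder would produce. In the real system a trained network would predict the drift; here the exact drift target is used in its place to show that a single step recovers the target latent.

```python
import numpy as np


def sample_bridge(z0, z1, t, sigma, rng):
    """Draw z_t from a Brownian bridge pinned at z0 (t=0) and z1 (t=1)."""
    eps = rng.standard_normal(z0.shape)
    mean = (1.0 - t) * z0 + t * z1          # straight-line interpolation
    std = sigma * np.sqrt(t * (1.0 - t))    # noise vanishes at both endpoints
    return mean + std * eps


def drift_target(zt, z1, t):
    """Drift of the bridge at time t; a network would be trained to predict it."""
    return (z1 - zt) / (1.0 - t)


def one_step_estimate(zt, drift, t):
    """Jump from time t directly to t=1 using a (predicted) drift."""
    return zt + (1.0 - t) * drift


rng = np.random.default_rng(0)
z0 = rng.standard_normal((4, 8, 8))  # hypothetical stand-in for the source latent
z1 = rng.standard_normal((4, 8, 8))  # hypothetical stand-in for the target latent
t, sigma = 0.3, 0.1                  # illustrative values, not from the paper

zt = sample_bridge(z0, z1, t, sigma, rng)
v = drift_target(zt, z1, t)          # stand-in for the network's prediction
z1_hat = one_step_estimate(zt, v, t)
```

With the exact drift, `z1_hat` equals `z1` up to floating-point error. Single-step inference corresponds to the case t = 0, where the estimate reduces to the source latent plus the predicted drift.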

Project Links

Official Website: https://gojasper.github.io/latent-bridge-matching/
GitHub Repository: https://github.com/gojasper/LBM
arXiv Paper: https://arxiv.org/pdf/2503.07535
Online Demo: https://huggingface.co/spaces/jasperai/LBM

Application Scenarios for LBM

General Users: Everyday photo editing tasks such as removing unwanted objects, restoring old photos, or adjusting lighting conditions.
Photography Enthusiasts: Post-processing to enhance realism by adding or modifying shadows and lighting effects.
Graphic Designers: Creative workflows that involve generating depth or normal maps and making rapid image corrections and adjustments.
Video Editors: Frame-by-frame video enhancement, including object lighting and shadow adjustments.
3D Modelers: Generating depth or surface normal maps from photos to assist in 3D modeling and reconstruction tasks.