ObjectMover – A Novel Image Editing Model Jointly Developed by the University of Hong Kong and Adobe

What is ObjectMover?

ObjectMover is an image editing model jointly proposed by the University of Hong Kong and Adobe Research. It targets the problems that arise when objects are moved, inserted, or removed in an image: inconsistent lighting and shadows, and distortion of the object itself. The model treats object movement as a special case of two-frame video processing, which lets it reuse the cross-frame consistency learned by pre-trained video generation models; fine-tuning then adapts the model to image editing. ObjectMover uses sequence-to-sequence modeling: it takes the original image, an image of the target object, and an instruction map as input, and outputs a synthesized image in which the object has been moved.

Main functions of ObjectMover

Object Movement: Moves an object to a specified location in the image, automatically adjusting related physical effects such as lighting, shadows, and reflections while preserving the object’s identity.
Object Deletion: Realistically fills in the background behind the removed object without generating unrelated new objects, and also removes the lighting and shadow effects that the object caused.
Object Insertion: Preserves the identity of the inserted object and automatically generates lighting and shadow effects consistent with the environment.

Technical principles of ObjectMover

Video Prior Transfer: ObjectMover treats object movement as a special case of two-frame video processing, drawing on the cross-frame consistency that pre-trained video generation models (such as diffusion models) have learned. Fine-tuning transfers the model from video generation to image editing. This reuses the physical regularities and object correspondences learned during video pre-training, which supports consistent lighting and shadows and preservation of object identity in the edited image.
Sequence-to-Sequence Modeling: The model reframes object movement as a sequence prediction problem. The inputs are the original image, an image of the target object, and an instruction map that annotates the position and direction of the movement; the output is the synthesized image after the object has been moved. This formulation helps the model handle the lighting changes and occlusion relationships that arise when an object appears at a different position.
Synthetic Dataset Construction: Because large-scale real-world data on object movement is lacking, the research team used modern game engines (such as Unreal Engine) to generate high-quality synthetic data pairs. The data covers complex lighting, materials, and occlusions, which increases the diversity of the training data and improves the model’s generalization.
Multi-Task Learning Strategy: ObjectMover integrates four sub-tasks (object movement, removal, insertion, and video data insertion) into a unified framework and trains on both synthetic data and real video data. This improves generalization to real-world scenarios and makes the model more adaptable and robust across image editing tasks.
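
To make the sequence-to-sequence framing more concrete, the following is a minimal sketch of how the three inputs could be flattened into a single token sequence. It is not ObjectMover's actual code: the function names, the 0/1/2 label encoding of the instruction map, and the patch-based tokenization are all assumptions made for illustration, and images are represented as small single-channel grids rather than real tensors.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom and right are exclusive
Grid = List[List[int]]           # toy single-channel image as rows of pixel values


def make_instruction_map(height: int, width: int, source: Box, target: Box) -> Grid:
    """Hypothetical encoding: 0 = untouched, 1 = current object location, 2 = destination."""
    grid = [[0] * width for _ in range(height)]
    # The target is painted last, so it wins wherever the two boxes overlap.
    for label, (top, left, bottom, right) in ((1, source), (2, target)):
        for row in range(top, bottom):
            for col in range(left, right):
                grid[row][col] = label
    return grid


def patchify(image: Grid, patch: int) -> List[List[int]]:
    """Split an image into non-overlapping patch x patch tiles, one flat token per tile."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be a multiple of the patch size")
    tokens = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tokens.append([image[row][col]
                           for row in range(top, top + patch)
                           for col in range(left, left + patch)])
    return tokens


def build_input_sequence(original: Grid, object_image: Grid,
                         instruction_map: Grid, patch: int) -> List[List[int]]:
    """Concatenate the tokens of all three inputs, in a fixed (assumed) order."""
    sequence: List[List[int]] = []
    for image in (original, object_image, instruction_map):
        sequence.extend(patchify(image, patch))
    return sequence


# Toy 4x4 inputs: move the object from the top-left to the bottom-right quadrant.
original = [[9] * 4 for _ in range(4)]       # stand-in for the source photo
object_image = [[5] * 4 for _ in range(4)]   # stand-in for the object crop
instruction = make_instruction_map(4, 4, source=(0, 0, 2, 2), target=(2, 2, 4, 4))
sequence = build_input_sequence(original, object_image, instruction, patch=2)
print(len(sequence))  # 12 tokens: 4 patches from each of the 3 images
```

In this sketch the direction of the move is implied by the two labels (from the region marked 1 to the region marked 2). A real model would operate on learned latent tokens rather than raw pixel patches, and would predict the tokens of the edited image as the output sequence.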

Project links for ObjectMover

Project official website: https://xinyu-andy.github.io/ObjMover/
arXiv technical paper: https://arxiv.org/pdf/2503.08037

Application scenarios of ObjectMover

Special Effects Production: In complex effects shots where objects disappear or appear, ObjectMover can delete or insert objects while keeping the scene realistic.
Virtual Scene Editing: Virtual reality and game development often require flexible adjustment of objects in a virtual scene. ObjectMover can move an object, such as a prop, from one position to another while keeping its lighting and shadows consistent with the environment.
Game Level Design: Developers can use ObjectMover to quickly rearrange objects in a level, making level design more efficient.
Product Display: In product advertising, ObjectMover can place a product in different scenes to show its various usage scenarios.
Space Planning: In architecture and interior design, ObjectMover can move furniture or decorations to different positions so that alternative layouts can be compared.