ReVersion: Diffusion-Based Relation Inversion from Images
https://ziqihuangg.github.io/projects/reversion.html
Relation Inversion aims to capture a relation shared by a set of exemplar images and apply it to synthesize new scenes. Existing inversion methods focus on capturing object appearances; this work instead explores the inversion of object relations. The proposed ReVersion framework learns a relation prompt from exemplar images using a pre-trained text-to-image diffusion model. The key insight is the "preposition prior": real-world relation prompts can be activated with a set of basis prepositional words. The framework uses relation-steering contrastive learning and relation-focal importance sampling so that the learned prompt captures object interactions and stays disentangled from object appearances. The paper also introduces the ReVersion Benchmark, which provides diverse exemplar images covering different relations. Experiments reported in the paper show that ReVersion outperforms existing methods at inverting relations.
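The two training components named above can be illustrated with a small sketch. The summary does not give the exact loss or sampling distribution, so everything here is an assumption made for illustration: the contrastive term is written as a generic InfoNCE-style loss that pulls a relation embedding toward preposition embeddings (positives) and away from other word embeddings (negatives), and the importance sampling is modeled as a cosine-shaped weighting that favors larger, noisier diffusion timesteps. The names `steering_loss`, `timestep_weights`, `sample_timestep`, `temperature`, and `alpha` are hypothetical and do not come from the ReVersion code.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def steering_loss(relation, positives, negatives, temperature=0.07):
    """InfoNCE-style loss (assumed form, not taken from the paper).

    The loss is small when `relation` is more similar to the positive
    embeddings (e.g. prepositions) than to the negative embeddings
    (e.g. appearance-related words).
    """
    pos = sum(math.exp(cosine(relation, p) / temperature) for p in positives)
    neg = sum(math.exp(cosine(relation, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


def timestep_weights(num_steps, alpha=0.5):
    """Normalized sampling weights for timesteps 1..num_steps.

    Assumed cosine-shaped schedule: with 0 <= alpha <= 1 the weight
    grows with t, so noisier timesteps are sampled more often.
    """
    raw = [1.0 - alpha * math.cos(math.pi * t / num_steps)
           for t in range(1, num_steps + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def sample_timestep(num_steps, alpha=0.5, rng=random):
    """Draw one timestep according to `timestep_weights`."""
    weights = timestep_weights(num_steps, alpha)
    return rng.choices(range(1, num_steps + 1), weights=weights, k=1)[0]


# Toy 2-D embeddings, purely illustrative.
prepositions = [[0.9, 0.1]]
other_words = [[0.0, 1.0]]
loss_near = steering_loss([1.0, 0.0], prepositions, other_words)
loss_far = steering_loss([0.0, 1.0], prepositions, other_words)
```

In this toy setup `loss_near` is lower than `loss_far`, because the first relation vector lies close to the preposition embedding. In a real pipeline the embeddings would come from the text encoder of the diffusion model, and the sampled timestep would select the noise level for the denoising loss.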