Inversion-DPO: Precise and Efficient Post-Training for Diffusion Models

Authors: Zejian Li, Yize Li, Chenye Meng, Zhongni Liu, Yang Ling, Shengyuan Zhang, Guang Yang, Changyuan Yang, Zhiyuan Yang, Lingyun Sun

Recent advancements in diffusion models (DMs) have been propelled by alignment methods that post-train models to better conform to human preferences. However, these approaches typically require computation-intensive training of a base model and a reward model, which not only incurs substantial computational overhead but may also compromise model accuracy and training efficiency. To address these limitations, we propose Inversion-DPO, a novel alignment framework that circumvents reward modeling by reformulating Direct Preference Optimization (DPO) with DDIM inversion for DMs. Our method performs the otherwise intractable posterior sampling in Diffusion-DPO via deterministic inversion from winning and losing samples to noise, and thus derives a new post-training paradigm. This paradigm eliminates the need for auxiliary reward models or inaccurate approximation, significantly enhancing both the precision and the efficiency of training. We apply Inversion-DPO to a basic task of text-to-image generation and a challenging task of compositional image generation. Extensive experiments show substantial performance improvements achieved by Inversion-DPO compared to existing post-training methods and highlight the ability of the trained generative models to generate high-fidelity, compositionally coherent images. For the post-training of compositional image generation, we curate a paired dataset consisting of 11,140 images with complex structural annotations and comprehensive scores, designed to enhance the compositional capabilities of generative models. Inversion-DPO explores a new avenue for efficient, high-precision alignment in diffusion models, advancing their applicability to complex realistic generation tasks. Our code is available at https://github.com/MIGHTYEZ/Inversion-DPO
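The abstract does not spell out the inversion formulas, so the following is only a minimal sketch of the standard deterministic DDIM update, run in the data-to-noise direction for a winning and a losing sample. It is not the authors' implementation: `ddim_step`, `ddim_invert`, `toy_eps_model`, the `alpha_bars` schedule, and the sample values are all hypothetical names and numbers chosen for illustration, and the toy noise predictor stands in for a real diffusion network.

```python
import math


def ddim_step(x, eps, alpha_from, alpha_to):
    """Deterministic DDIM update between two cumulative-alpha levels.

    The same formula serves sampling (alpha_to > alpha_from) and
    inversion (alpha_to < alpha_from); only the direction changes.
    """
    out = []
    for xi, ei in zip(x, eps):
        # Clean-sample estimate implied by the current noise prediction.
        x0_pred = (xi - math.sqrt(1.0 - alpha_from) * ei) / math.sqrt(alpha_from)
        # Re-noise that estimate to the target level with the same eps.
        out.append(math.sqrt(alpha_to) * x0_pred + math.sqrt(1.0 - alpha_to) * ei)
    return out


def ddim_invert(x0, eps_model, alpha_bars):
    """Map a clean sample towards noise; returns every intermediate state.

    alpha_bars is assumed to be decreasing, starting close to 1.
    """
    x = list(x0)
    trajectory = [x]
    for i in range(len(alpha_bars) - 1):
        eps = eps_model(x, i)  # noise prediction at the current state
        x = ddim_step(x, eps, alpha_bars[i], alpha_bars[i + 1])
        trajectory.append(x)
    return trajectory


def toy_eps_model(x, step_index):
    """Hypothetical stand-in for a trained noise-prediction network."""
    return [0.1 * xi for xi in x]


# Assumed toy schedule and a toy preference pair (winning, losing).
alpha_bars = [0.999, 0.8, 0.6, 0.4, 0.2, 0.05]
x_win = [0.5, -1.0, 0.25]
x_lose = [-0.3, 0.7, 1.2]

traj_win = ddim_invert(x_win, toy_eps_model, alpha_bars)
traj_lose = ddim_invert(x_lose, toy_eps_model, alpha_bars)
```

Because each step is deterministic, every intermediate state in `traj_win` and `traj_lose` is fixed by the sample and the model, with no random draw involved. With a real network the inversion is only approximately reversible, since the noise prediction is evaluated at the current state rather than at the state the sampler would visit.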

Subjects: Computer Vision and Pattern Recognition, Artificial Intelligence

Publish: 2025-07-14 02:59:28 UTC

2507.11554
