Scaling Multi-Reference Image Generation with Dynamic Reward Optimization

Authors: Wenwang Huang, Yusen Fu, Junjie Wang, Mengfei Huang, Yulin Li, Gan Liu, Jing Cai, Yancheng He, Zhuotao Tian

While personalized image generation has achieved remarkable progress, multi-reference image generation (MRIG) remains a challenging task. Most existing benchmarks fail to adequately evaluate complex MRIG scenarios, hindering further progress in this area. To better assess model performance on complex MRIG tasks, we introduce OmniRef-Bench, a benchmark that covers complex combinations of reference image types and large numbers of reference images. Evaluations on OmniRef-Bench show that mainstream open-source models struggle in complex MRIG scenarios, and that their performance deteriorates significantly as the number of mixed-type reference images increases. To address this issue, we propose DyRef, a two-stage training framework. In the first stage, supervised fine-tuning equips the model with the basic capability to handle complex MRIG tasks. In the second stage, we introduce Difficulty-aware Advantage Reweighting (DAR) and Discriminative Reward Scaling (DRS). DAR dynamically adjusts the optimization objective to improve performance when handling a large number of mixed-type reference images. DRS enlarges intra-group reward differences for more effective policy optimization. Experiments show that DyRef significantly improves the performance of open-source models on both OmniRef-Bench and single-image editing benchmarks, demonstrating the effectiveness and generalization capability of our approach.
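The abstract does not give formulas for DAR or DRS, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a group-relative setup in which several images are sampled per prompt and each is scored by a reward model. The min-max stretch (`stretch_rewards`), the linear weight (`difficulty_weight`), and the `max_refs` parameter are all hypothetical stand-ins chosen for illustration.

```python
from statistics import mean


def stretch_rewards(rewards, eps=1e-8):
    """Hypothetical stand-in for reward scaling: min-max rescale the
    rewards of one sampled group to [0, 1], so that small raw gaps
    between samples become large relative gaps."""
    lo, hi = min(rewards), max(rewards)
    span = hi - lo
    if span < eps:
        # All samples scored (almost) the same: no learning signal.
        return [0.0 for _ in rewards]
    return [(r - lo) / span for r in rewards]


def difficulty_weight(num_refs, max_refs=8):
    """Hypothetical stand-in for difficulty-aware reweighting: the weight
    grows linearly from 1.0 (one reference image) to 2.0 (max_refs or more)."""
    clipped = min(max(num_refs, 1), max_refs)
    return 1.0 + (clipped - 1) / (max_refs - 1)


def group_advantages(rewards, num_refs):
    """Group-relative advantages: stretched reward minus the group mean,
    multiplied by the (assumed) difficulty weight of the prompt."""
    scaled = stretch_rewards(rewards)
    baseline = mean(scaled)
    weight = difficulty_weight(num_refs)
    return [weight * (s - baseline) for s in scaled]


if __name__ == "__main__":
    raw = [0.80, 0.82, 0.81]  # nearly indistinguishable raw rewards
    print(group_advantages(raw, num_refs=1))
    print(group_advantages(raw, num_refs=8))
```

In this sketch, three raw rewards that differ by only 0.01 are spread across the full [0, 1] range before the group mean is subtracted, and the same group contributes twice as much to the update when the prompt carries eight reference images instead of one. The actual scaling and weighting rules used by DyRef may differ.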

Subjects: Computer Vision and Pattern Recognition, Artificial Intelligence

Published: 2026-06-25 12:21:13 UTC

2606.26947
