AdaGRPO: A Capability-Aware Adaptive Enhancement for Flow-based GRPO

Authors: Jiazi Bu, Pengyang Ling, Yujie Zhou, Yibin Wang, Yuhang Zang, Tianyi Wei, Xiaohang Zhan, Jiaqi Wang, Tong Wu, Xingang Pan, Dahua Lin

Group Relative Policy Optimization (GRPO) has demonstrated remarkable success in aligning text-to-image (T2I) flow models with human preferences. However, we find that the learning loop of current flow-based GRPO is fundamentally decoupled from the learner's current capability, with critical blind spots in both prompt selection and advantage estimation: (i) existing methods sample prompts randomly, overlooking the substantial impact of data selection on reinforcement learning (RL) efficacy, a factor proven crucial in GRPO for large language models; (ii) they evaluate sample quality relying solely on intra-group statistics, lacking a global perspective to accurately measure true policy improvement. To address these issues, we propose Adaptive GRPO (AdaGRPO), a novel capability-aware RL algorithm tailored for flow models. Specifically, AdaGRPO consists of two principal components: (i) Online Curriculum Filtering Strategy, which dynamically tracks the model's proficiency and adaptively selects prompts that best match its current learning boundary; (ii) Cross-Level Advantage Fusion, which integrates fine-grained intra-group advantages with macro-level global advantages, providing a comprehensive and unbiased policy evaluation. As a lightweight, plug-and-play module, AdaGRPO can be seamlessly integrated with existing frameworks such as Flow-GRPO, DanceGRPO, and Flow-CPS. Extensive experiments demonstrate that AdaGRPO consistently drives performance gains while significantly stabilizing GRPO training for flow models.
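The abstract names the two components but gives no formulas, so the following is only an illustrative sketch of how such components could look, not the authors' method. Everything in it is an assumption made for illustration: the function names, the convex mixing weight `alpha`, the running global statistics `global_mean` and `global_std`, the exponential-moving-average proficiency tracker, and the `[low, high]` band used to approximate the "learning boundary". Rewards are assumed to be normalized to [0, 1].

```python
from statistics import mean, pstdev

EPS = 1e-6  # guards against division by zero when a group has identical rewards


def intra_group_advantages(rewards):
    """GRPO-style advantage: z-score of each reward within its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + EPS) for r in rewards]


def global_advantages(rewards, global_mean, global_std):
    """Assumed global advantage: z-score against statistics tracked over all prompts."""
    return [(r - global_mean) / (global_std + EPS) for r in rewards]


def fused_advantages(rewards, global_mean, global_std, alpha=0.5):
    """Assumed fusion rule: convex combination of intra-group and global advantages."""
    local = intra_group_advantages(rewards)
    glob = global_advantages(rewards, global_mean, global_std)
    return [alpha * a + (1.0 - alpha) * g for a, g in zip(local, glob)]


def update_proficiency(proficiency, prompt, rewards, momentum=0.9):
    """Assumed tracker: exponential moving average of a prompt's mean group reward."""
    group_mean = mean(rewards)
    previous = proficiency.get(prompt, group_mean)
    proficiency[prompt] = momentum * previous + (1.0 - momentum) * group_mean


def select_prompts(proficiency, low=0.2, high=0.8):
    """Assumed curriculum filter: keep prompts that are neither mastered nor out of reach."""
    return [p for p, score in proficiency.items() if low <= score <= high]
```

In a training loop built on this sketch, `update_proficiency` would be called after each group is scored, `select_prompts` would choose the next batch of prompts, and `fused_advantages` would replace the plain intra-group advantage in the policy-gradient loss. Setting `alpha=1.0` falls back to the intra-group advantage alone.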

Subjects: Computer Vision and Pattern Recognition, Machine Learning

Publish: 2026-06-05 02:07:08 UTC

2606.06828
