

#1 Think Twice, Act Once: A Co-Evolution Framework of LLM and RL for Large-Scale Decision Making

Authors: Xu Wan, Wenyue Xu, Chao Yang, Mingyang Sun

Recent advancements in Large Language Models (LLMs) and Reinforcement Learning (RL) have shown significant promise in decision-making tasks. Nevertheless, for large-scale industrial decision problems, both approaches face distinct challenges: LLMs lack real-time long-sequence decision-making capabilities, while RL struggles with sample efficiency in vast action spaces. To bridge this gap, we propose **A**gents **C**o-**E**volution (ACE), a synergistic framework between LLMs and RL agents for large-scale decision-making scenarios. ACE introduces a dual-role trajectory refinement mechanism in which LLMs act as both Policy Actor and Value Critic during RL training: the Actor refines suboptimal actions via multi-step reasoning and environment validation, while the Critic performs temporal credit assignment through trajectory-level reward shaping. Concurrently, the RL agent enhances LLMs' task-specific decision-making via prioritized experience replay. Through extensive experiments on multiple power grid operation challenges with action spaces exceeding 60K discrete actions, ACE demonstrates superior performance over existing RL and LLM-based methods.
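The abstract describes a training loop in which an LLM refines the RL agent's actions (Policy Actor), reshapes trajectory rewards (Value Critic), and is in turn improved from prioritized replay data. Below is a minimal, hypothetical Python sketch of that loop; the function names (`llm_policy_actor`, `llm_value_critic`), the toy environment, and the choice of shaped return as replay priority are illustrative assumptions, not the authors' actual implementation.

```python
# Hypothetical sketch of the ACE co-evolution loop described in the abstract.
# All names and interfaces here are illustrative assumptions.
import random
from collections import deque


class PrioritizedReplayBuffer:
    """Minimal prioritized experience replay: higher-priority trajectories are
    sampled more often, both for RL updates and as LLM fine-tuning data."""

    def __init__(self, capacity=1000):
        self.buffer = deque(maxlen=capacity)

    def add(self, trajectory, priority):
        self.buffer.append((priority, trajectory))

    def sample(self, k=4):
        # Sample proportionally to priority (simple weighted choice).
        if not self.buffer:
            return []
        priorities = [p for p, _ in self.buffer]
        trajectories = [t for _, t in self.buffer]
        return random.choices(trajectories, weights=priorities,
                              k=min(k, len(trajectories)))


def llm_policy_actor(state, rl_action):
    """LLM as Policy Actor (assumed interface): inspect the RL agent's action
    and, if judged suboptimal, propose a refined action via multi-step
    reasoning. Placeholder: return the original action unchanged."""
    return rl_action


def llm_value_critic(trajectory):
    """LLM as Value Critic (assumed interface): trajectory-level reward shaping
    for temporal credit assignment. Placeholder: pass rewards through."""
    return [step["reward"] for step in trajectory]


def rl_select_action(state, action_space_size=60000):
    """Placeholder RL policy over a large discrete action space."""
    return random.randrange(action_space_size)


def toy_env_step(state, action):
    """Toy stand-in for a power-grid environment: next state and reward."""
    return state + 1, random.random()


def run_episode(env_step, horizon=10):
    trajectory, state = [], 0
    for _ in range(horizon):
        raw_action = rl_select_action(state)
        action = llm_policy_actor(state, raw_action)   # Actor-side refinement
        next_state, reward = env_step(state, action)   # environment validation
        trajectory.append({"state": state, "action": action, "reward": reward})
        state = next_state
    shaped = llm_value_critic(trajectory)              # Critic-side reward shaping
    for step, r in zip(trajectory, shaped):
        step["reward"] = r
    return trajectory


if __name__ == "__main__":
    replay = PrioritizedReplayBuffer()
    for episode in range(3):
        traj = run_episode(toy_env_step)
        shaped_return = sum(step["reward"] for step in traj)
        replay.add(traj, priority=shaped_return)  # priority heuristic (assumption)
    # High-priority trajectories would be reused both to update the RL policy
    # and as task-specific data to improve the LLM (the co-evolution step).
    print(f"sampled {len(replay.sample())} trajectories for joint updates")
```

In this reading, the two directions of the co-evolution are the LLM-side calls inside `run_episode` (refinement and reward shaping during rollout) and the replay-driven feedback at the bottom, where prioritized trajectories supply the LLM with task-specific decision data.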

Subject: ICML.2025 - Poster