
A Reduction Framework for Distributionally Robust Reinforcement Learning under Average Reward

Authors: Zachary Roch, George Atia, Yue Wang

Robust reinforcement learning (RL) under the average reward criterion, which seeks to optimize long-term system performance in uncertain environments, remains largely unexplored. To address this challenge, we propose a reduction-based framework that transforms robust average reward optimization into the more extensively studied robust discounted reward optimization by employing a specific discount factor. Our framework provides two key advantages. **Data Efficiency**: we design a model-based reduction algorithm that achieves near-optimal sample complexity, enabling efficient identification of optimal robust policies. **Scalability**: by bypassing the inherent challenges of scaling up average reward optimization, our framework facilitates the design of scalable, convergent algorithms for robust average reward optimization that leverage function approximation. Our algorithmic design, supported by theoretical and empirical analyses, provides a concrete solution to robust average reward RL with the first guarantees on data efficiency and scalability, highlighting the framework’s potential to optimize long-term performance under model uncertainty in practical problems.
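To make the reduction idea concrete, below is a minimal sketch of its core mechanism: solving a *robust discounted* MDP with a discount factor close to 1 as a surrogate for the robust average-reward objective. This is not the paper's algorithm; the δ-contamination uncertainty set, the function name `robust_discounted_vi`, and the illustrative choice `gamma=0.99` are all assumptions made here for exposition, whereas the paper derives the specific discount factor to use.

```python
import numpy as np

def robust_discounted_vi(P, R, gamma, delta, tol=1e-10, max_iters=100_000):
    """Robust value iteration under a delta-contamination uncertainty set.

    P:     nominal transition kernel, shape (S, A, S)
    R:     reward table, shape (S, A)
    gamma: discount factor (the reduction picks gamma near 1)
    delta: contamination level; the adversary may move a delta fraction
           of probability mass to the worst next state
    """
    S, A, _ = P.shape
    V = np.zeros(S)
    for _ in range(max_iters):
        # Worst-case expected next value under delta-contamination:
        # (1 - delta) * nominal expectation + delta * worst single state.
        worst_case = (1.0 - delta) * (P @ V) + delta * V.min()  # shape (S, A)
        Q = R + gamma * worst_case
        V_new = Q.max(axis=1)
        if np.abs(V_new - V).max() < tol:
            V = V_new
            break
        V = V_new
    return V, Q.argmax(axis=1)

# Usage on a tiny random MDP (purely illustrative).
rng = np.random.default_rng(0)
S, A = 5, 2
P = rng.dirichlet(np.ones(S), size=(S, A))   # nominal transition kernel
R = rng.uniform(size=(S, A))                 # rewards in [0, 1]
V, policy = robust_discounted_vi(P, R, gamma=0.99, delta=0.1)
# As gamma -> 1, (1 - gamma) * V approximates the (robust) average reward,
# which is the standard connection the reduction exploits.
print((1 - 0.99) * V, policy)
```

The normalization `(1 - gamma) * V` in the usage example reflects the classical link between discounted and average-reward values as the discount approaches 1; the paper's contribution is making this correspondence precise and sample-efficient in the robust setting.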

Subject: ICML.2025 - Poster