eZ5QyZV7zi@OpenReview

Total: 1

#1 Policy-Regret Minimization in Markov Games with Function Approximation [PDF] [Copy] [Kimi] [REL]

Authors: Thanh Nguyen-Tang, Raman Arora

We study policy-regret minimization problem in dynamically evolving environments, modeled as Markov games between a learner and a strategic, adaptive opponent. We propose a general algorithmic framework that achieves the optimal $\mathcal{O}(\sqrt{T})$ policy regret for a wide class of large-scale problems characterized by an Eluder-type condition--extending beyond the tabular settings of previous work. Importantly, our framework uncovers a simpler yet powerful algorithmic approach for handling reactive adversaries, demonstrating that leveraging opponent learning in such settings is key to attaining the optimal $\mathcal{O}(\sqrt{T})$ policy regret.

Subject: ICML.2025 - Poster