Categorical Distributional Reinforcement Learning with Kullback-Leibler Divergence: Convergence and Asymptotics

f4qxkR6GQK@OpenReview

Authors: Tyler Kastner, Mark Rowland, Yunhao Tang, Murat Erdogdu, Amir-massoud Farahmand

We study distributional reinforcement learning with categorical parametrisations and a KL divergence loss. Previous analyses of categorical distributional RL have used a Cramér distance-based loss, which simplifies the analysis but creates a theory-practice gap. We introduce a preconditioned version of the algorithm and prove that it is guaranteed to converge. We further derive the asymptotic variance of the categorical estimates under different learning rate regimes and compare it to that of classical reinforcement learning. Finally, we empirically validate our theoretical results, investigate the relative strengths of KL losses, and derive a number of actionable insights for practitioners.

Subject: ICML.2025 - Poster
