Temporal Distance-aware Transition Augmentation for Offline Model-based Reinforcement Learning

The goal of offline reinforcement learning (RL) is to extract the best possible policy from a previously collected dataset while accounting for the *out-of-distribution* (OOD) sample issue. Offline model-based RL (MBRL) is an appealing solution that can alleviate this issue through *state-action transition augmentation* with a learned dynamics model. Unfortunately, offline MBRL methods have long been observed to fail in sparse-reward and long-horizon environments. In this work, we propose a novel MBRL method, dubbed Temporal Distance-Aware Transition Augmentation (TempDATA), that generates additional transitions in a geometrically structured representation space instead of the state space. To capture long-horizon behaviors efficiently, our main idea is to learn a state abstraction that captures *temporal distance* at both the *trajectory and transition levels* of the state space. Our experiments empirically confirm that TempDATA outperforms previous offline MBRL methods and matches or surpasses the performance of diffusion-based trajectory augmentation and goal-conditioned RL on D4RL AntMaze, FrankaKitchen, CALVIN, and pixel-based FrankaKitchen.
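As a rough illustration of what *temporal distance* at the trajectory and transition levels could look like as a supervision signal, the sketch below enumerates index pairs within one offline trajectory together with the number of steps separating them. This is not the TempDATA implementation: it assumes that temporal distance means the step gap between two states of the same trajectory, and the function names and the `max_gap` parameter are hypothetical.

```python
# Illustrative only: temporal-distance targets within a single trajectory.
# Assumption (not from the abstract): the temporal distance between two
# states of the same trajectory is the number of steps separating them.


def temporal_distance_pairs(trajectory_length, max_gap=None):
    """Return (i, j, distance) triples for all index pairs with i < j.

    trajectory_length: number of states in one offline trajectory.
    max_gap: optional cap on j - i (a hypothetical knob for this sketch).
    """
    pairs = []
    for i in range(trajectory_length):
        for j in range(i + 1, trajectory_length):
            gap = j - i
            if max_gap is not None and gap > max_gap:
                break  # later j only increase the gap further
            pairs.append((i, j, gap))
    return pairs


def transition_level_pairs(trajectory_length):
    """Adjacent-state pairs only, i.e. temporal distance exactly 1."""
    return temporal_distance_pairs(trajectory_length, max_gap=1)
```

Under these assumptions, a representation could be trained so that distances between embedded states track the returned gaps: trajectory-level pairs cover long-range structure, and transition-level pairs cover single-step structure.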

Subject: ICML.2025 - Poster

drBVowFvqf@OpenReview
