JGkZgEEjiM@OpenReview

#1 Off-policy Reinforcement Learning with Model-based Exploration Augmentation

Authors: Likun Wang, Xiangteng Zhang, Yinuo Wang, Guojian Zhan, Wenxuan Wang, Haoyu Gao, Jingliang Duan, Shengbo Eben Li

Exploration is crucial in Reinforcement Learning (RL), as it enables the agent to understand the environment and make better decisions. Existing exploration methods fall into two paradigms: active exploration, which injects stochasticity into the policy but struggles in high-dimensional environments, and passive exploration, which manages the replay buffer to prioritize under-explored regions but lacks sample diversity. To address the limitations of passive exploration, we propose Modelic Generative Exploration (MoGE), which augments exploration by generating under-explored critical states and synthesizing dynamics-consistent experiences. MoGE consists of two components: (1) a diffusion generator that produces critical states under the guidance of entropy and TD error, and (2) a one-step imagination world model that constructs critical transitions for agent learning. Our method is simple to implement and integrates seamlessly with mainstream off-policy RL algorithms without structural modifications. Experiments on OpenAI Gym and the DeepMind Control Suite demonstrate that MoGE, used as an exploration augmentation, significantly enhances efficiency and performance on complex tasks.
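
The abstract only sketches the pipeline, so the following is a minimal, hypothetical Python sketch of how such an exploration-augmentation loop could plug into an off-policy agent: a guided generator proposes under-explored "critical" states, a one-step world model imagines transitions from them, and the imagined transitions are pushed into the replay buffer. All names (ReplayBuffer, generate_critical_states, augment_with_synthetic_transitions, score_fn, world_model) are illustrative placeholders inferred from the abstract, not the authors' actual implementation; in particular, the diffusion generator is replaced here by simple score-guided rejection sampling.

# Hypothetical sketch of a MoGE-style exploration augmentation loop.
# Not the authors' code: the diffusion generator is approximated by
# score-guided sampling, and the world model is an arbitrary callable.
import numpy as np

class ReplayBuffer:
    """Minimal FIFO replay buffer holding (s, a, r, s') tuples."""
    def __init__(self, capacity=100_000):
        self.capacity = capacity
        self.storage = []

    def add(self, transition):
        if len(self.storage) >= self.capacity:
            self.storage.pop(0)
        self.storage.append(transition)

    def sample(self, batch_size):
        idx = np.random.randint(len(self.storage), size=batch_size)
        return [self.storage[i] for i in idx]

def generate_critical_states(score_fn, state_dim, num_samples=64, num_candidates=1024):
    """Stand-in for the guided diffusion generator: draw candidate states and
    keep those with the highest guidance score (e.g. policy entropy + TD error)."""
    candidates = np.random.randn(num_candidates, state_dim)
    scores = np.array([score_fn(s) for s in candidates])
    top = np.argsort(scores)[-num_samples:]
    return candidates[top]

def augment_with_synthetic_transitions(buffer, policy, world_model, score_fn, state_dim):
    """Generate under-explored states, roll each one step through the learned
    world model, and push the imagined transitions into the replay buffer."""
    states = generate_critical_states(score_fn, state_dim)
    for s in states:
        a = policy(s)
        r, s_next = world_model(s, a)   # one-step imagined rollout
        buffer.add((s, a, r, s_next))

# Example wiring with toy stand-ins (dummy policy, linear dynamics, simple score):
state_dim = 4
policy = lambda s: np.tanh(s[:2])                              # dummy 2-D action
world_model = lambda s, a: (float(-np.sum(s**2)), s + 0.1 * np.pad(a, (0, 2)))
score_fn = lambda s: float(np.abs(s).sum())                    # stand-in for entropy + TD-error guidance
buffer = ReplayBuffer()
augment_with_synthetic_transitions(buffer, policy, world_model, score_fn, state_dim)
print(len(buffer.storage))                                     # 64 imagined transitions added

In an actual off-policy setup, batches sampled from this buffer would simply be mixed with real environment transitions during critic and actor updates, which is consistent with the abstract's claim that no structural modification of the base algorithm is needed.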

Subject: NeurIPS.2025 - Poster