82r0lqYIWg@OpenReview

Total: 1

#1 Pretraining a Shared Q-Network for Data-Efficient Offline Reinforcement Learning

Authors: Jongchan Park, Mingyu Park, Donghwan Lee

Offline reinforcement learning (RL) aims to learn a policy from a fixed dataset without additional environment interaction. However, effective offline policy learning often requires a large and diverse dataset to mitigate epistemic uncertainty, and collecting such data demands substantial online interaction, which is costly or infeasible in many real-world domains. Improving policy learning from limited offline data, i.e., achieving high data efficiency, is therefore critical for practical offline RL. In this paper, we propose a simple yet effective plug-and-play pretraining framework that initializes the feature representation of a $Q$-network to enhance data efficiency in offline RL. Our approach employs a shared $Q$-network architecture trained in two stages: (i) pretraining a backbone feature extractor with a transition prediction head, and (ii) training a $Q$-network, which combines the backbone feature extractor with a $Q$-value head, using *any* offline RL objective. Extensive experiments on the D4RL, Robomimic, V-D4RL, and ExoRL benchmarks show that our method substantially improves both performance and data efficiency across diverse datasets and domains. Remarkably, with only **10\%** of the dataset, our approach outperforms standard offline RL baselines trained on the full data.
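
The sketch below illustrates the two-stage recipe described in the abstract. It is a minimal PyTorch illustration under stated assumptions, not the authors' implementation: the network sizes, the MSE next-state-prediction loss in stage 1, and the plain one-step TD objective in stage 2 are placeholders, and any offline RL loss (e.g. TD3+BC, CQL, IQL) could be plugged in as the stage-2 objective. The names `Backbone`, `pretrain_step`, and `q_step` are hypothetical.

```python
# Minimal sketch of the two-stage shared Q-network scheme (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class Backbone(nn.Module):
    """Shared feature extractor over (state, action) pairs."""
    def __init__(self, state_dim, action_dim, hidden_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

# Stage 1: pretrain the backbone with a transition-prediction head
# (here: predict the next state from (s, a) with an MSE loss -- an assumption).
def pretrain_step(backbone, transition_head, batch, optim):
    feat = backbone(batch["obs"], batch["act"])
    loss = F.mse_loss(transition_head(feat), batch["next_obs"])
    optim.zero_grad(); loss.backward(); optim.step()
    return loss.item()

# Stage 2: keep the pretrained backbone, attach a Q-value head, and train the
# combined Q-network. A one-step TD error stands in for "any offline RL objective".
def q_step(backbone, q_head, batch, optim, gamma=0.99):
    q = q_head(backbone(batch["obs"], batch["act"])).squeeze(-1)
    with torch.no_grad():
        # In an actor-critic method, next_act would come from the learned policy
        # and the target would use a target network; both are omitted for brevity.
        q_next = q_head(backbone(batch["next_obs"], batch["next_act"])).squeeze(-1)
        target = batch["rew"] + gamma * (1.0 - batch["done"]) * q_next
    loss = F.mse_loss(q, target)
    optim.zero_grad(); loss.backward(); optim.step()
    return loss.item()

if __name__ == "__main__":
    S, A, B = 17, 6, 32  # toy state/action dimensions and batch size
    backbone = Backbone(S, A)
    transition_head = nn.Linear(256, S)
    q_head = nn.Linear(256, 1)

    # Synthetic offline batch; a real run would sample from the fixed dataset.
    batch = {
        "obs": torch.randn(B, S), "act": torch.randn(B, A),
        "next_obs": torch.randn(B, S), "next_act": torch.randn(B, A),
        "rew": torch.randn(B), "done": torch.zeros(B),
    }

    # Stage 1: backbone + transition head on the offline transitions.
    pre_opt = torch.optim.Adam(
        list(backbone.parameters()) + list(transition_head.parameters()), lr=3e-4)
    print("pretrain loss:", pretrain_step(backbone, transition_head, batch, pre_opt))

    # Stage 2: the pretrained backbone is reused and a Q-value head is trained on top.
    q_opt = torch.optim.Adam(
        list(backbone.parameters()) + list(q_head.parameters()), lr=3e-4)
    print("q loss:", q_step(backbone, q_head, batch, q_opt))
```

Because the backbone is shared between the two stages, stage 2 is a drop-in replacement for the critic update of an existing offline RL algorithm, which is what makes the framework plug-and-play.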

Subject: NeurIPS.2025 - Poster