
FANS: A Flatness-Aware Network Structure for Generalization in Offline Reinforcement Learning

Authors: Da Wang, Yi Ma, Ting Guo, Hongyao Tang, Wei Wei, Jiye Liang

Offline reinforcement learning (RL) aims to learn optimal policies from static datasets while enhancing generalization to out-of-distribution (OOD) data. To mitigate overfitting to suboptimal behaviors in offline datasets, existing methods often relax constraints on the policy and the data or extract informative patterns through data-driven techniques. However, structurally guiding the optimization process toward flatter regions of the solution space, which offer better generalization, remains largely unexplored. Motivated by this observation, we present FANS, a generalization-oriented structured network framework that promotes flatter and more robust policy learning by guiding the optimization trajectory through modular architectural design. FANS comprises four key components: (1) Residual Blocks, which facilitate compact and expressive representations; (2) Gaussian Activation, which promotes smoother gradients; (3) Layer Normalization, which mitigates overfitting; and (4) Ensemble Modeling, which reduces estimation variance. By integrating FANS into a standard actor-critic framework, we show that this remarkably simple architecture outperforms many existing advanced methods across a variety of tasks. Moreover, we validate the effectiveness of FANS in mitigating overestimation and promoting generalization, demonstrating the promising potential of architectural design in advancing offline RL.
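
To make the four components concrete, below is a minimal PyTorch sketch of how a FANS-style critic ensemble could be assembled from them. The abstract does not specify architectural details, so the layer ordering, the Gaussian activation's form (taken here as exp(-x^2)), the hidden width, the ensemble size, and the pessimistic min-aggregation are all illustrative assumptions, not the paper's exact design.

```python
# Minimal sketch of a FANS-style critic ensemble, assembled only from the four
# components named in the abstract. All specifics (activation form, layer order,
# widths, ensemble size, aggregation rule) are assumptions for illustration.
import torch
import torch.nn as nn


class GaussianActivation(nn.Module):
    """Assumed form: f(x) = exp(-x^2), a smooth bell-shaped nonlinearity."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.exp(-x.pow(2))


class ResidualBlock(nn.Module):
    """Linear -> LayerNorm -> Gaussian activation -> Linear, with a skip connection."""
    def __init__(self, dim: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(dim, dim),
            nn.LayerNorm(dim),
            GaussianActivation(),
            nn.Linear(dim, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.body(x)


class FANSCritic(nn.Module):
    """A single Q-network built from residual blocks; several are ensembled below."""
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 256, n_blocks: int = 2):
        super().__init__()
        self.inp = nn.Linear(obs_dim + act_dim, hidden)
        self.blocks = nn.Sequential(*[ResidualBlock(hidden) for _ in range(n_blocks)])
        self.out = nn.Linear(hidden, 1)

    def forward(self, obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
        h = self.inp(torch.cat([obs, act], dim=-1))
        return self.out(self.blocks(h))


class CriticEnsemble(nn.Module):
    """Ensemble modeling: aggregate over independent critics to reduce variance."""
    def __init__(self, obs_dim: int, act_dim: int, n_critics: int = 4):
        super().__init__()
        self.critics = nn.ModuleList(
            FANSCritic(obs_dim, act_dim) for _ in range(n_critics)
        )

    def forward(self, obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
        qs = torch.stack([q(obs, act) for q in self.critics], dim=0)
        return qs.min(dim=0).values  # pessimistic aggregate, common in offline RL


if __name__ == "__main__":
    critic = CriticEnsemble(obs_dim=17, act_dim=6)
    q = critic(torch.randn(32, 17), torch.randn(32, 6))
    print(q.shape)  # torch.Size([32, 1])
```

The min over the ensemble is one common offline-RL choice for curbing overestimation; the paper's actual aggregation and how the actor network uses these components may differ.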

Subject: NeurIPS.2025 - Poster