My doctoral research develops a unified framework for offline imitation learning (IL) that addresses three central challenges: achieving sample efficiency in strictly batch settings, ensuring robustness and generalization under dynamics shifts, and learning from demonstrations of varying quality. At the core of this work is a new paradigm for strictly offline IL based on enforcing the Markov Balance Equation (MBE), a fundamental structural property of trajectory data. Using advanced conditional density estimation, I developed two algorithms, CKIL and MBIL, which achieve state-of-the-art performance on high-dimensional continuous-control tasks. Building on this foundation, I introduced the first Distributionally Robust Offline IL framework under a stationarity constraint, providing robustness to transition-model mismatch without requiring any additional environment interaction. I am now extending this direction through Robust Behavior Foundation Models (RBFMs), which aim to generalize across dynamics shifts for a wide range of tasks. Finally, I propose a variational approach to learning from crowdsourced demonstrations that infers and accounts for demonstrator expertise. Together, these contributions yield principled, practical IL algorithms with strong performance and robustness, broadening the applicability of IL to real-world domains such as robotics, healthcare, and autonomous systems.
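For context, balance conditions of this general flavor are exemplified by the standard Bellman flow constraint on occupancy measures; this is an illustrative sketch only, and the exact MBE enforced in this work is defined in the thesis itself:

```latex
% Illustrative sketch only: the precise Markov Balance Equation used in
% this research may differ. A canonical balance condition in MDPs is the
% Bellman flow constraint on the discounted occupancy measure d^\pi
% induced by a policy \pi:
\[
  d^{\pi}(s') \;=\; (1-\gamma)\,\mu_0(s')
  \;+\; \gamma \sum_{s,a} P(s' \mid s, a)\, \pi(a \mid s)\, d^{\pi}(s),
\]
% where \mu_0 is the initial-state distribution, P the transition kernel,
% and \gamma \in [0,1) the discount factor. Checking a consistency
% condition of this kind on batch trajectory data requires estimates of
% conditional densities such as P and \pi, which is consistent with the
% role of conditional density estimation described above.
```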