YX5DHa9OfX@OpenReview

Imitation Beyond Expectation Using Pluralistic Stochastic Dominance

Authors: Ali Farajzadeh, Danyal Saeed, Syed M Abbas, Rushit N. Shah, Aadirupa Saha, Brian D Ziebart

Imitation learning seeks policies reflecting the values of demonstrated behaviors. Prevalent approaches learn to match or exceed the demonstrator's performance in expectation without knowing the demonstrator's reward function. Unfortunately, this does not induce pluralistic imitators that learn to support qualitatively distinct demonstrations. We reformulate imitation learning with a different foundational aim: stochastic dominance over the demonstrations' reward distribution across a range of reward functions. Our approach matches imitator policy samples (or support) with demonstrations using optimal transport theory to define an imitation learning objective over trajectory pairs. We demonstrate the benefits of pluralistic stochastic dominance (PSD) for imitation in both theory and practice.
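To make the contrast between matching in expectation and stochastic dominance concrete, here is a minimal sketch. It is not the paper's algorithm. It assumes rewards are linear in hypothetical trajectory features, uses first-order stochastic dominance on empirical return samples, and checks it over a small hand-picked set of reward weights; the function names and the toy data are illustrative only.

```python
import numpy as np

def first_order_dominates(imitator_returns, demo_returns):
    """True if the imitator's empirical CDF is <= the demonstrations' CDF everywhere.

    This is the standard first-order stochastic dominance condition,
    evaluated at every observed return value.
    """
    a = np.sort(np.asarray(imitator_returns, dtype=float))
    b = np.sort(np.asarray(demo_returns, dtype=float))
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return bool(np.all(cdf_a <= cdf_b))

def dominates_across_rewards(imitator_features, demo_features, reward_weights):
    """Check dominance for every reward function in a candidate set.

    Assumption (not from the source): each reward is w . phi(trajectory),
    with phi given by the rows of the feature matrices.
    """
    return all(
        first_order_dominates(imitator_features @ w, demo_features @ w)
        for w in reward_weights
    )

# Toy data: two qualitatively distinct demonstrations (one per feature).
demo_features = np.array([[1.0, 0.0], [0.0, 1.0]])
# An "averaging" imitator matches the demonstrations' mean features ...
averaging_imitator = np.array([[0.5, 0.5], [0.5, 0.5]])
# ... while a pluralistic imitator reproduces both modes.
pluralistic_imitator = np.array([[1.0, 0.0], [0.0, 1.0]])
reward_weights = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]

same_mean = np.allclose(averaging_imitator.mean(axis=0), demo_features.mean(axis=0))
averaging_ok = dominates_across_rewards(averaging_imitator, demo_features, reward_weights)
pluralistic_ok = dominates_across_rewards(pluralistic_imitator, demo_features, reward_weights)
```

In this toy setting the averaging imitator matches the demonstrations in expectation under both reward functions, yet it fails the dominance check because it never attains the high return that one of the demonstrations achieves. The pluralistic imitator passes under both.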

Subject: NeurIPS.2025 - Spotlight