Test-Time Adaptation for Online Vision-Language Navigation with Feedback-based Reinforcement Learning

Authors: Sung June Kim, Gyeongrok Oh, Heeju Ko, Daehyun Ji, Dongwook Lee, Byung-Jun Lee, Sujin Jang, Sangpil Kim

Navigating an unfamiliar environment during deployment poses a critical challenge for a vision-language navigation (VLN) agent. Yet test-time adaptation (TTA) remains relatively underexplored in robotic navigation, which leads us to a fundamental question: what are the key properties of TTA for online VLN? In our view, effective adaptation requires three qualities: 1) flexibility in handling different navigation outcomes, 2) interactivity with the external environment, and 3) a balance between plasticity and stability. To address these requirements, we introduce FeedTTA, a novel TTA framework for online VLN that uses feedback-based reinforcement learning. Specifically, FeedTTA learns by maximizing binary episodic feedback, a practical setup in which the agent receives a binary scalar after each episode indicating whether the navigation succeeded or failed. Additionally, we propose a gradient regularization technique that leverages the binary structure of FeedTTA to balance plasticity and stability during adaptation. Our extensive experiments on challenging VLN benchmarks demonstrate the superior adaptability of FeedTTA, which even outperforms state-of-the-art offline training methods on the REVERIE benchmark with a single stream of learning.
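To make the binary episodic feedback setup concrete, the following is a minimal sketch of a generic REINFORCE-style update driven by a single 0/1 signal per episode. It is not the FeedTTA implementation: the toy two-action policy, the success rule in `run_episode`, the fixed `baseline`, and all function names are assumptions made for illustration, and the paper's gradient regularization technique is not reproduced here.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def run_episode(logits, rng, horizon=3, goal_action=1):
    """Sample one toy episode and return (actions, feedback).

    Hypothetical success rule: the episode succeeds (feedback = 1) only
    if every sampled action equals goal_action; otherwise feedback = 0.
    """
    probs = softmax(logits)
    actions = [rng.choices(range(len(probs)), weights=probs)[0]
               for _ in range(horizon)]
    feedback = 1 if all(a == goal_action for a in actions) else 0
    return actions, feedback


def reinforce_update(logits, actions, feedback, lr=0.1, baseline=0.5):
    """One policy-gradient step using only the binary episodic feedback.

    The fixed baseline (an assumption) makes success push the policy
    toward the sampled actions and failure push it away from them.
    """
    probs = softmax(logits)
    advantage = feedback - baseline
    new_logits = list(logits)
    for a in actions:
        for k in range(len(logits)):
            # Gradient of log softmax(logits)[a] with respect to logit k.
            grad_logp = (1.0 if k == a else 0.0) - probs[k]
            new_logits[k] += lr * advantage * grad_logp
    return new_logits


def adapt(num_episodes=200, seed=0):
    """Online loop: act, receive 0/1 feedback after the episode, update."""
    rng = random.Random(seed)
    logits = [0.0, 0.0]
    for _ in range(num_episodes):
        actions, feedback = run_episode(logits, rng)
        logits = reinforce_update(logits, actions, feedback)
    return logits
```

In this sketch the policy is updated after every episode from a single stream of experience, which mirrors the online setting described in the abstract; a real VLN agent would replace the two-logit policy with its navigation model and obtain the feedback from an external source.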

Subject: ICML.2025 - Poster

K4GaB4fdIq@OpenReview
