Embodied Long Horizon Manipulation with Closed-loop Code Generation and Incremental Few-shot Adaptation

Authors: Yuan Meng, Xiangtong Yao, Haihui Ye, Yirui Zhou, Shengqiang Zhang, Zhenguo Sun, Xukun Li, Zhenshan Bing, Alois Knoll

Embodied long-horizon manipulation requires robotic systems to process multimodal inputs, such as vision and natural language, and translate them into executable actions. However, existing learning-based approaches often depend on large, task-specific datasets and struggle to generalize to unseen scenarios. Recent methods have explored using large language models (LLMs) as high-level planners that decompose tasks into subtasks in natural language and guide pretrained low-level controllers. Yet these approaches assume perfect execution by the low-level policies, which is unrealistic in real-world environments with noise or suboptimal behaviors. To overcome this, we fully discard the pretrained low-level policy and instead use the LLM to directly generate executable code plans within a closed-loop framework. Our planner employs chain-of-thought (CoT)-guided few-shot learning with incrementally structured examples to produce robust and generalizable task plans. Complementing this, a reporter evaluates outcomes from RGB-D observations and delivers structured feedback, enabling recovery from misalignment and replanning under partial observability. This design eliminates per-step inference, reduces computational overhead, and limits the error accumulation observed in previous methods. Our framework achieves state-of-the-art performance on 30+ diverse seen and unseen long-horizon tasks across LoHoRavens, CALVIN, Franka Kitchen, and cluttered real-world settings.
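The closed-loop structure described in the abstract (plan once, execute, report, replan on failure) can be illustrated with a toy sketch. This is not the authors' implementation: every name here (`ToyWorld`, `stub_planner`, `stub_reporter`, `run_closed_loop`, `Feedback`) is hypothetical, the LLM planner is replaced by a deterministic stub, and the RGB-D reporter is replaced by a simple comparison of block positions against the goal.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

Move = Tuple[str, str]  # (block name, target zone); hypothetical plan step


@dataclass
class Feedback:
    """Structured report returned after each execution round."""
    success: bool
    misplaced: List[str]


class ToyWorld:
    """Stand-in for the robot and scene; some moves silently fail once."""

    def __init__(self, placement: Dict[str, str], slips: Iterable[str]):
        self.placement = dict(placement)
        self.slips = set(slips)  # blocks whose first move attempt fails

    def execute(self, plan: List[Move]) -> None:
        # Run the whole plan without consulting the planner per step.
        for block, zone in plan:
            if block in self.slips:
                self.slips.remove(block)  # simulated execution failure
                continue
            self.placement[block] = zone


def stub_planner(goal: Dict[str, str], history: List[Feedback]) -> List[Move]:
    """Stub for the LLM planner: full plan first, then repair misplaced blocks."""
    if not history:
        return list(goal.items())
    return [(block, goal[block]) for block in history[-1].misplaced]


def stub_reporter(goal: Dict[str, str], world: ToyWorld) -> Feedback:
    """Stub for the reporter: compares the observed state with the goal."""
    misplaced = [b for b, zone in goal.items() if world.placement.get(b) != zone]
    return Feedback(success=not misplaced, misplaced=misplaced)


def run_closed_loop(goal: Dict[str, str], world: ToyWorld, max_rounds: int = 3):
    """Plan, execute, report, and replan until success or the round budget ends."""
    history: List[Feedback] = []
    for round_idx in range(1, max_rounds + 1):
        plan = stub_planner(goal, history)
        world.execute(plan)
        feedback = stub_reporter(goal, world)
        history.append(feedback)
        if feedback.success:
            return True, round_idx
    return False, max_rounds


goal = {"red": "zone_a", "blue": "zone_b"}
world = ToyWorld({"red": "table", "blue": "table"}, slips=["red"])
ok, rounds = run_closed_loop(goal, world)
```

In this toy run the first attempt to move the red block fails, the reporter flags it as misplaced, and the second round repairs only that block, so `run_closed_loop` returns `(True, 2)`. The planner is called once per round rather than once per step, which is the property the abstract refers to as eliminating per-step inference.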

Subjects: Robotics, Artificial Intelligence

Published: 2025-03-27 20:32:58 UTC

2503.21969
