6TDSDdgP7Z@OpenReview

#1 SyncMind: Measuring Agent Out-of-Sync Recovery in Collaborative Software Engineering

Authors: Xuehang Guo, Xingyao Wang, Yangyi Chen, Sha Li, Chi Han, Manling Li, Heng Ji

Software engineering (SE) is increasingly collaborative, with developers working together on shared, complex codebases. Effective collaboration in shared environments requires participants, whether humans or AI agents, to stay on the same page as their environment evolves. When a collaborator's understanding diverges from the current state, which we term the *out-of-sync* challenge, the collaborator's actions may fail, leading to integration issues. In this work, we introduce **SyncMind**, a framework that systematically defines the *out-of-sync* problem faced by large language model (LLM) agents in collaborative software engineering (CSE). Based on **SyncMind**, we create **SyncBench**, a benchmark featuring 24,332 instances of agent *out-of-sync* scenarios in real-world CSE, derived from 21 popular *GitHub* repositories with executable verification tests. Experiments on **SyncBench** uncover critical insights into existing LLM agents' capabilities and limitations. Besides substantial performance gaps among agents (from *Llama-3.1* agents at $\leq 3.33\%$ to *Claude-3.5-Sonnet* at $\geq 28.18\%$), their consistently low willingness to collaborate ($\leq 4.86\%$) suggests fundamental limitations of existing LLMs in CSE. However, when collaboration does occur, it correlates positively with *out-of-sync* recovery success. Minimal performance differences in agents' resource-aware *out-of-sync* recoveries further reveal a significant lack of resource awareness and adaptability, shedding light on the future development of resource-efficient collaborative systems. Our code and data are openly available on our project website: https://xhguo7.github.io/SyncMind/.

Subject: ICML.2025 - Poster