Cowpox: Towards the Immunity of VLM-based Multi-Agent Systems


Authors: Yutong Wu, Jie Zhang, Yiming Li, Chao Zhang, Qing Guo, Han Qiu, Nils Lukas, Tianwei Zhang

Vision Language Model (VLM) agents are stateful, autonomous entities capable of perceiving and interacting with their environments through vision and language. Multi-agent systems comprise specialized agents that collaborate to solve a complex task. A core security property is **robustness**: the system maintains its integrity under adversarial attack. Multi-agent systems lack robustness, because a successful exploit against one agent can spread and **infect** other agents, undermining the integrity of the entire system. We propose Cowpox, a defense that provably enhances the robustness of a multi-agent system through a distributed mechanism that improves the **recovery rate** of agents by limiting the expected number of infections to other agents. The core idea is to generate and distribute a special *cure sample* that immunizes an agent against the attack before exposure. We demonstrate the effectiveness of Cowpox empirically and provide theoretical robustness guarantees.
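
To give a feel for why the recovery rate matters, here is a minimal sketch of a generic mean-field SIS (susceptible-infected-susceptible) recurrence. It is an illustrative assumption, not the model or the analysis from the paper: the function `simulate_infection` and the parameters `beta` (per-step infection rate) and `gamma` (per-step recovery rate) are hypothetical names introduced only for this example.

```python
def simulate_infection(beta: float, gamma: float, i0: float, steps: int) -> list[float]:
    """Toy mean-field SIS recurrence (an assumption, not the paper's model).

    beta  : hypothetical per-step rate at which infected agents infect others
    gamma : hypothetical per-step rate at which infected agents recover
    i0    : initial fraction of infected agents
    Returns the infected fraction after each step, starting with i0.
    """
    fractions = [i0]
    i = i0
    for _ in range(steps):
        new_infections = beta * i * (1.0 - i)  # infected agents meet susceptible ones
        recoveries = gamma * i                 # infected agents that recover this step
        i = min(1.0, max(0.0, i + new_infections - recoveries))
        fractions.append(i)
    return fractions


if __name__ == "__main__":
    # Low recovery rate: the infection settles at an endemic level.
    print(simulate_infection(beta=0.5, gamma=0.1, i0=0.01, steps=200)[-1])
    # Recovery rate above the infection rate: the infection dies out.
    print(simulate_infection(beta=0.5, gamma=0.7, i0=0.01, steps=200)[-1])
```

In this toy model, an outbreak persists when `gamma < beta` and fades when `gamma > beta`, which is the intuition behind raising the recovery rate; how Cowpox actually achieves this, and the guarantees it provides, are given in the paper itself.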

Subject: ICML.2025 - Poster

5KszXnnkG5@OpenReview
