Evaluating Generalization Mechanisms in Autonomous Cyber Attack Agents

Authors: Ondřej Lukáš, Jihoon Shin, Emilia Rivas, Diego Forni, Maria Rigaki, Carlos Catania, Aritran Piplai, Christopher Kiekintveld, Sebastian Garcia

Autonomous offensive agents often fail to transfer beyond the networks on which they are trained. We isolate a minimal but fundamental shift -- unseen host/subnet IP reassignment in an otherwise fixed enterprise scenario -- and evaluate attacker generalization in the NetSecGame environment. Agents are trained on five IP-range variants and tested on a sixth unseen variant; only the meta-learning agent may adapt at test time. We compare three agent families (traditional RL, adaptation agents, and LLM-based agents) and use action-distribution-based behavioral/XAI analyses to localize failure modes. Some adaptation methods show partial transfer but significant degradation under unseen reassignment, indicating that even address-space changes can break long-horizon attack policies. Under our evaluation protocol and agent-specific assumptions, prompt-driven pretrained LLM agents achieve the highest success on the held-out reassignment, but at the cost of increased inference-time compute, reduced transparency, and practical failure modes such as repetition/invalid-action loops.
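The held-out protocol described above (five IP-range variants for training, a sixth unseen variant for testing, same topology throughout) can be illustrated with a minimal sketch. Everything named here is hypothetical and chosen for illustration: the host names, the subnet labels, the address ranges, and the helper functions `assign_addresses` and `split_variants`. None of it is taken from NetSecGame or from the authors' setup.

```python
import ipaddress

# Hypothetical fixed topology: host name -> (subnet label, host offset in subnet).
# Only the addresses change across variants; hosts and subnet roles do not.
TOPOLOGY = {
    "client-1": ("lan", 10),
    "client-2": ("lan", 11),
    "web-server": ("dmz", 5),
    "db-server": ("servers", 20),
}

# Six hypothetical address plans, each mapping a subnet label to a /24 range.
VARIANTS = [
    {
        "lan": f"192.168.{i}.0/24",
        "dmz": f"10.{i}.0.0/24",
        "servers": f"172.16.{i}.0/24",
    }
    for i in range(1, 7)
]


def assign_addresses(topology, plan):
    """Return a host -> IP mapping for one address plan."""
    hosts = {}
    for name, (subnet, offset) in topology.items():
        network = ipaddress.ip_network(plan[subnet])
        hosts[name] = str(network.network_address + offset)
    return hosts


def split_variants(variants, held_out_index):
    """Separate the training plans from the single held-out plan."""
    train = [v for i, v in enumerate(variants) if i != held_out_index]
    return train, variants[held_out_index]


# Train on five plans, hold out the sixth for evaluation.
train_plans, test_plan = split_variants(VARIANTS, held_out_index=5)
train_envs = [assign_addresses(TOPOLOGY, plan) for plan in train_plans]
test_env = assign_addresses(TOPOLOGY, test_plan)
```

In this sketch every environment has the same hosts in the same subnet roles, so an agent that keys its policy on literal addresses sees entirely new observations in `test_env`, while one that keys on roles or structure sees nothing new.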

Subjects: Cryptography and Security, Machine Learning

Published: 2026-03-06 22:24:37 UTC

2603.10041
