$Agent^2$: An Agent-Generates-Agent Framework for Reinforcement Learning Automation


Authors: Yuan Wei, Xiaohan Shan, Ran Miao, Jianmin Li

Reinforcement learning (RL) agent development traditionally requires substantial expertise and iterative effort, often leading to high failure rates and limited accessibility. This paper introduces Agent$^2$, an LLM-driven agent-generates-agent framework for fully automated RL agent design. Agent$^2$ autonomously translates natural language task descriptions and environment code into executable RL solutions without human intervention. The framework adopts a dual-agent architecture: a Generator Agent that analyzes tasks and designs agents, and a Target Agent that is automatically generated and executed. To better support automation, RL development is decomposed into two stages, MDP modeling and algorithmic optimization, facilitating targeted and effective agent generation. Built on the Model Context Protocol, Agent$^2$ provides a unified framework for standardized agent creation across diverse environments and algorithms, incorporating adaptive training management and intelligent feedback analysis for continuous refinement. Extensive experiments on benchmarks including MuJoCo, MetaDrive, MPE, and SMAC show that Agent$^2$ outperforms manually designed baselines across all tasks, achieving up to 55% performance improvement with consistent average gains. By enabling a closed-loop, end-to-end automation pipeline, this work advances a new paradigm in which agents can design and optimize other agents, underscoring the potential of agent-generates-agent systems for automated AI development.
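The closed loop described in the abstract (a Generator Agent proposes a Target Agent, the Target Agent is trained, and feedback drives refinement) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: `TargetAgentSpec`, `generate_and_refine`, and the stub functions are hypothetical names, the LLM calls and RL training are replaced by toy stand-ins, and none of it reflects the paper's actual implementation or its use of the Model Context Protocol.

```python
from dataclasses import dataclass, replace
from typing import Callable, Tuple


@dataclass(frozen=True)
class TargetAgentSpec:
    """Hypothetical description of a generated agent (not the paper's schema)."""
    mdp_design: str        # stage 1: how observations, actions, rewards are modeled
    algorithm: str         # stage 2: which RL algorithm is used
    learning_rate: float   # stage 2: one example hyperparameter


def generate_and_refine(
    propose: Callable[[str], TargetAgentSpec],
    refine: Callable[[TargetAgentSpec, float], TargetAgentSpec],
    train_and_evaluate: Callable[[TargetAgentSpec], float],
    task_description: str,
    rounds: int = 3,
) -> Tuple[TargetAgentSpec, float]:
    """Propose a spec, then keep any refinement that scores higher."""
    best_spec = propose(task_description)
    best_score = train_and_evaluate(best_spec)
    for _ in range(rounds):
        candidate = refine(best_spec, best_score)
        score = train_and_evaluate(candidate)
        if score > best_score:
            best_spec, best_score = candidate, score
    return best_spec, best_score


# Toy stand-ins: in a real system these would call an LLM and run RL training.
def stub_propose(task_description: str) -> TargetAgentSpec:
    return TargetAgentSpec(mdp_design="raw observations", algorithm="PPO",
                           learning_rate=0.1)


def stub_refine(spec: TargetAgentSpec, last_score: float) -> TargetAgentSpec:
    # Pretend the feedback analysis suggests halving the learning rate.
    return replace(spec, learning_rate=spec.learning_rate / 2)


def stub_evaluate(spec: TargetAgentSpec) -> float:
    # Pretend the score peaks when the learning rate is 0.025.
    return -abs(spec.learning_rate - 0.025)


if __name__ == "__main__":
    spec, score = generate_and_refine(stub_propose, stub_refine, stub_evaluate,
                                      "balance a pole", rounds=3)
    print(spec, score)
```

Passing the proposal, refinement, and evaluation steps in as callables keeps the loop itself independent of any particular LLM or training backend, which is one simple way to mirror the Generator/Target separation the abstract describes.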

Subjects: Artificial Intelligence, Machine Learning

Published: 2025-09-16 02:14:39 UTC

arXiv: 2509.13368
