TestForge: Feedback-Driven, Agentic Test Suite Generation

Automated test generation holds great promise for alleviating the burdens of manual test creation. However, existing search-based techniques compromise on test readability, while LLM-based approaches are prohibitively expensive in practice. We present TestForge, an agentic unit testing framework designed to cost-effectively generate high-quality test suites for real-world code. Our key insight is to reframe LLM-based test generation as an iterative process. TestForge thus begins with tests generated via zero-shot prompting, and then continuously refines those tests based on feedback from test executions and coverage reports. We evaluate TestForge on TestGenEval, a real-world unit test generation benchmark sourced from 11 large-scale open-source repositories; we show that TestForge achieves a pass@1 rate of 84.3%, 44.4% line coverage, and a 33.8% mutation score on average, outperforming prior classical approaches and a one-iteration LLM-based baseline. TestForge produces more natural and understandable tests than state-of-the-art search-based techniques, and offers substantial cost savings over LLM-based techniques (at $0.63 per file). Finally, we release a version of TestGenEval integrated with the OpenHands platform, a popular open-source framework featuring a diverse set of software engineering agents and agentic benchmarks, for future extension and development.
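The iterative process described in the abstract (zero-shot generation followed by refinement on execution and coverage feedback) can be sketched roughly as below. This is only an illustrative sketch, not TestForge's actual implementation: the `Feedback` type, the `refine_tests` function, the `generate` and `evaluate` callables, and the `max_iters` and `target_coverage` parameters are all hypothetical names introduced here, and the stopping rule is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Feedback:
    """Hypothetical container for what one test run reports back."""
    passed: bool      # did the generated suite run without failures?
    coverage: float   # line coverage of the file under test, in [0, 1]
    log: str          # execution output / coverage report shown to the model


def refine_tests(
    generate: Callable[[Optional[str], Optional[Feedback]], str],
    evaluate: Callable[[str], Feedback],
    max_iters: int = 5,
    target_coverage: float = 0.9,
) -> Optional[Tuple[str, Feedback]]:
    """Return the best passing suite found, or None if no suite ever passed.

    `generate(None, None)` stands in for the zero-shot prompt; later calls
    receive the previous suite and its feedback (both assumptions).
    """
    tests = generate(None, None)  # zero-shot starting point
    best: Optional[Tuple[str, Feedback]] = None
    for i in range(max_iters):
        fb = evaluate(tests)  # run the tests and collect coverage
        if fb.passed and (best is None or fb.coverage > best[1].coverage):
            best = (tests, fb)
        done = fb.passed and fb.coverage >= target_coverage
        if done or i == max_iters - 1:
            break
        tests = generate(tests, fb)  # refine using the feedback
    return best
```

In a real setting `generate` would wrap an LLM call and `evaluate` would wrap a test runner plus a coverage tool; bounding the number of iterations is one simple way to keep the per-file cost under control.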

Subject: Software Engineering

Published: 2025-03-18 20:21:44 UTC

arXiv: 2503.14713
