HARBOR: Automated Harness Optimization

Authors: Biswa Sengupta, Jinhua Wang

Long-horizon language-model agents are dominated, in lines of code and in operational complexity, not by their underlying model but by the harness that wraps it: context compaction, tool caching, semantic memory, trajectory reuse, speculative tool prediction, and the glue that binds the model to a sandboxed execution environment. We argue that harness design is a first-class machine-learning problem and that automated configuration search dominates manual stacking once the flag space exceeds a handful of bits. We defend this claim in two steps. First, we formalize automated harness optimization as constrained noisy Bayesian optimization over a mixed-variable, cost-heterogeneous configuration space with cold-start-corrected rewards and a posterior chance-constrained safety check, and give a reference solver, HARBOR (Harness Axis-aligned Regularized Bayesian Optimization Routine), built from a block-additive SAAS surrogate, multi-fidelity cost-aware acquisition, and TuRBO trust regions. Second, we instantiate the problem in a flag-gated harness over a production coding agent and report a controlled four-round manual-tuning case study against a fixed task suite and an end-to-end HARBOR run. The formulation itself is task-class agnostic: the configuration space, reward correction, acquisition, and safety check apply to any agent harness with a bounded flag space and a reproducible task suite.
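To make the "posterior chance-constrained safety check" concrete, here is a minimal sketch under stated assumptions. The abstract does not give the safety metric, the threshold, or the form of the posterior, so the Gaussian posterior, the names `is_probably_safe` and `filter_safe`, the flag-style configuration names, and the numeric values below are all hypothetical illustrations, not the authors' implementation. The idea shown is only the generic one: accept a candidate configuration when the surrogate's posterior assigns at least probability 1 - delta to the safety metric staying at or below a limit.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def is_probably_safe(mean: float, std: float, threshold: float,
                     delta: float = 0.05) -> bool:
    """Chance constraint: P(metric <= threshold) >= 1 - delta.

    Assumes (hypothetically) that the surrogate posterior over the safety
    metric at this configuration is Gaussian with the given mean and std.
    """
    if std <= 0.0:
        # Degenerate posterior: fall back to a deterministic comparison.
        return mean <= threshold
    return normal_cdf((threshold - mean) / std) >= 1.0 - delta


def filter_safe(candidates: dict, threshold: float,
                delta: float = 0.05) -> list:
    """Keep only the candidate configurations that pass the chance constraint.

    `candidates` maps a configuration name to a (posterior mean, posterior std)
    pair for the safety metric.
    """
    return [
        name
        for name, (mean, std) in candidates.items()
        if is_probably_safe(mean, std, threshold, delta)
    ]


# Hypothetical posterior summaries for three flag configurations.
candidates = {
    "baseline": (0.10, 0.05),
    "cache_on": (0.28, 0.05),
    "cache_and_memory": (0.20, 0.0),
}
safe = filter_safe(candidates, threshold=0.30)
```

In this sketch a configuration whose posterior mean is close to the limit is rejected even though the mean itself is below the limit, because the posterior uncertainty leaves too much probability mass above it. An acquisition step would then be restricted to the configurations that pass.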

Subjects: Machine Learning, Artificial Intelligence

Publish: 2026-04-22 13:45:12 UTC

2604.20938
