
Scaling Laws for Agent Harnesses via Effective Feedback Compute

Computation & Language | 2026-05 | v1

Abstract

Agent harnesses increasingly determine the performance of language-model systems by deciding how models call tools, receive feedback, verify intermediate states, store memory, and revise solutions. Yet current test-time scaling analyses often parameterize this process by raw expenditure -- tokens, tool calls, operations, wall time, or cost -- which does not distinguish useful feedback from redundant or unstable interaction. We introduce \emph{Effective Feedback Compute} (EFC), a trace-level scaling coordinate that credits feedback only when it is informative, valid, non-redundant, and retained for subsequent decisions, and we normalize it by task demand when comparing tasks with different feedback requirements. Across synthetic controllable tasks, executable code tasks, real benchmark traces, held-out splits, and a prospective validation batch, EFC-based coordinates consistently predict failure rates better than raw-compute baselines and a strong multivariate SAS baseline. In controlled scaling, raw tokens and tool calls explain limited variation ($R^2=0.33$ and $0.42$), SAS reaches $0.88$, while Oracle-EFC and Estimated-EFC reach $0.94$ and Oracle-EFC/$D_{\mathrm{task}}$ reaches $0.99$. Matched-budget interventions show that improving feedback quality raises success from $0.27$ to $0.90$ while raw cost and tool calls are fixed. On mixed real traces, NRS-EFC/$D_{\mathrm{task}}$ reaches $R^2=0.92$ while raw compute has near-zero or negative fit, and it remains the best predictor in a prospective holdout ($R^2=0.85$). These results suggest that harness scaling is governed less by how much computation is spent than by how efficiently raw budget is converted into durable, task-sufficient feedback.
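To make the crediting rule in the abstract concrete, the following is a minimal sketch of how an EFC-style count could be computed from a trace. It is not the paper's definition: the abstract gives no formula, so the `FeedbackEvent` fields, the one-unit-per-event credit, the signature-based redundancy check, and the division by a scalar `task_demand` (standing in for $D_{\mathrm{task}}$) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class FeedbackEvent:
    """One feedback step in an agent trace (hypothetical schema)."""
    signature: str     # assumed key identifying what the feedback says
    informative: bool  # did it carry new task-relevant information?
    valid: bool        # was the feedback itself correct?
    retained: bool     # was it kept and used in later decisions?


def effective_feedback_compute(trace: Iterable[FeedbackEvent]) -> float:
    """Count events that are informative, valid, non-redundant, and retained.

    Assumption: each credited event is worth one unit, and an event is
    redundant if an earlier *credited* event had the same signature.
    """
    credited = set()
    efc = 0.0
    for event in trace:
        redundant = event.signature in credited
        if event.informative and event.valid and event.retained and not redundant:
            efc += 1.0
            credited.add(event.signature)
    return efc


def normalized_efc(trace: Iterable[FeedbackEvent], task_demand: float) -> float:
    """Divide EFC by an assumed scalar task demand (stand-in for D_task)."""
    if task_demand <= 0:
        raise ValueError("task_demand must be positive")
    return effective_feedback_compute(trace) / task_demand


# Four tool calls of equal raw cost, but only two earn credit.
example_trace = [
    FeedbackEvent("test_a_failed", informative=True, valid=True, retained=True),
    FeedbackEvent("test_a_failed", informative=True, valid=True, retained=True),   # repeat
    FeedbackEvent("lint_warning", informative=True, valid=False, retained=True),   # invalid
    FeedbackEvent("test_b_passed", informative=True, valid=True, retained=True),
]
raw_tool_calls = len(example_trace)
efc = effective_feedback_compute(example_trace)
efc_per_demand = normalized_efc(example_trace, task_demand=4.0)
```

In this toy trace the raw count is four tool calls while the sketched EFC is two, which illustrates the abstract's point that two runs with the same raw budget can differ in how much usable feedback they obtain.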

Cite

@article{arxiv.2605.29682,
  title  = {Scaling Laws for Agent Harnesses via Effective Feedback Compute},
  author = {Xuanliang Zhang and Dingzirui Wang and Keyan Xu and Qingfu Zhu and Wanxiang Che},
  journal= {arXiv preprint arXiv:2605.29682},
  year   = {2026}
}