PACEvolve++: Improving Test-time Learning for Evolutionary Search Agents

Minghao Yan; Bo Peng; Benjamin Coleman; Ziqi Chen; Zhouhang Xie; Shuo Chen; Zhankui He; Noveen Sachdeva; Weili Wang; Ed H. Chi; Shivaram Venkataraman; Wang-Cheng Kang; Derek Zhiyuan Cheng; Beidou Wang

Machine Learning 2026-05-11 v1

Abstract

Large language models have become drivers of evolutionary search, but most systems rely on a fixed, prompt-elicited policy to sample next candidates. This limits adaptation in practical engineering and research tasks, where evaluations are expensive and progress depends on learning task-specific search dynamics. We introduce PACEvolve++, an advisor-model reinforcement learning framework for test-time policy adaptation in evolutionary search agents. PACEvolve++ decouples strategic search decisions from implementation: a trainable advisor generates, assesses, and selects hypotheses, while a stronger frontier model translates the selected hypotheses into executable candidates. To train the advisor under non-stationary feedback, we propose a phase-adaptive approach that changes its optimization strategy across phases of the evolutionary process. Early in evolution, it uses group-relative feedback to learn broad search preferences; later, as reward gaps compress, it emphasizes best-of-$k$ frontier contribution to support stable refinement. Across expert-parallel load balancing, sequential recommendation, and protein fitness extrapolation, PACEvolve++ outperforms the state-of-the-art evolutionary search framework with frontier models, achieving faster convergence and stabilizing test-time training during evolutionary search.
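
The phase-adaptive idea in the abstract can be illustrated with a small sketch. This is not the paper's formulation: the abstract gives no equations, so the standardized group-relative advantage, the frontier-gain credit rule, the spread-based phase switch, and every name and threshold below (`group_relative_advantages`, `frontier_contributions`, `phase_adaptive_signal`, `spread_threshold`) are assumptions chosen only to show how the two kinds of feedback differ.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Early-phase signal (assumed form): score each sample against its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def frontier_contributions(rewards, best_so_far):
    """Late-phase signal (assumed form): credit only the group's best sample,
    and only by the amount it improves on the best score found so far."""
    top = max(rewards)
    gain = max(0.0, top - best_so_far)
    top_idx = rewards.index(top)
    return [gain if i == top_idx else 0.0 for i in range(len(rewards))]


def phase_adaptive_signal(rewards, best_so_far, spread_threshold=0.05):
    """Hypothetical switch: use group-relative feedback while rewards are spread
    out, and frontier contribution once the reward gaps have compressed."""
    if pstdev(rewards) >= spread_threshold:
        return "group_relative", group_relative_advantages(rewards)
    return "frontier", frontier_contributions(rewards, best_so_far)


# Early in search the k samples differ widely; later they cluster near the frontier.
early_phase, early_signal = phase_adaptive_signal([0.1, 0.5, 0.9], best_so_far=0.4)
late_phase, late_signal = phase_adaptive_signal([0.90, 0.91, 0.92], best_so_far=0.915)
```

In this sketch, standardizing within a tightly clustered group would amplify small, noisy differences, whereas the frontier rule gives a nonzero signal only when one of the $k$ samples actually advances the best score. That is one plausible reading of why a late-phase switch could support stable refinement.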

Keywords

evolutionary optimization; evolutionary algorithm; reinforcement learning

Cite

@article{arxiv.2605.07039,
  title  = {PACEvolve++: Improving Test-time Learning for Evolutionary Search Agents},
  author = {Minghao Yan and Bo Peng and Benjamin Coleman and Ziqi Chen and Zhouhang Xie and Shuo Chen and Zhankui He and Noveen Sachdeva and Weili Wang and Ed H. Chi and Shivaram Venkataraman and Wang-Cheng Kang and Derek Zhiyuan Cheng and Beidou Wang},
  journal= {arXiv preprint arXiv:2605.07039},
  year   = {2026}
}
