Small errors in random zeroth-order optimization are imaginary
Abstract
Most zeroth-order optimization algorithms mimic a first-order algorithm but replace the gradient of the objective function with some gradient estimator that can be computed from a small number of function evaluations. This estimator is constructed randomly, and its expectation matches the gradient of a smooth approximation of the objective function whose quality improves as the underlying smoothing parameter δ is reduced. Gradient estimators requiring a smaller number of function evaluations are preferable from a computational point of view. While estimators based on a single function evaluation can be obtained by use of the divergence theorem from vector calculus, their variance explodes as δ tends to 0. Estimators based on multiple function evaluations, on the other hand, suffer from numerical cancellation when δ tends to 0. To combat both effects simultaneously, we extend the objective function to the complex domain and construct a gradient estimator that evaluates the objective at a complex point whose coordinates have small imaginary parts of the order δ. As this estimator requires only one function evaluation, it is immune to cancellation. In addition, its variance remains bounded as δ tends to 0. We prove that zeroth-order algorithms that use our estimator offer the same theoretical convergence guarantees as the state-of-the-art methods. Numerical experiments suggest, however, that they often converge faster in practice.
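To make the idea concrete, the following is a minimal Python sketch of a single-evaluation, complex-step gradient estimator. The abstract does not state the estimator's formula, so the specific form used here, (n/δ) · Im f(x + iδy) · y with y drawn uniformly from the unit sphere, is an assumption, as are the function names `complex_step_gradient_estimate` and `quadratic`. The sketch also assumes the objective is given by an expression that remains valid (analytic) for complex inputs.

```python
import numpy as np

def complex_step_gradient_estimate(f, x, delta, rng):
    """One-sample gradient estimate from a single complex function evaluation.

    Assumed form (not stated in the abstract):
        g = (n / delta) * Im(f(x + 1j * delta * y)) * y,
    where y is uniform on the unit sphere in R^n.
    """
    n = x.size
    # A normalized Gaussian vector is uniformly distributed on the unit sphere.
    y = rng.standard_normal(n)
    y /= np.linalg.norm(y)
    # Perturb x only in the imaginary direction; no difference of nearby
    # function values is formed, so there is no subtractive cancellation.
    value = f(x + 1j * delta * y)
    return (n / delta) * np.imag(value) * y

def quadratic(z):
    # Hypothetical test objective with gradient 2 * x on real inputs.
    # It uses z ** 2 (not abs(z) ** 2) so that it stays analytic.
    return np.sum(z ** 2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.array([1.0, -2.0, 0.5])
    # A very small delta is usable because nothing is subtracted.
    samples = [complex_step_gradient_estimate(quadratic, x, 1e-100, rng)
               for _ in range(20000)]
    print("averaged estimate:", np.mean(samples, axis=0))
    print("true gradient:    ", 2 * x)
```

For this quadratic, the imaginary part of f(x + iδy) equals 2δ(x · y) exactly, so each sample is 2n(x · y)y regardless of how small δ is, and averaging over y recovers the gradient 2x because E[yyᵀ] = I/n.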
Cite
@article{arxiv.2103.05478,
title = {Small errors in random zeroth-order optimization are imaginary},
author = {Wouter Jongeneel and Man-Chung Yue and Daniel Kuhn},
journal= {arXiv preprint arXiv:2103.05478},
year = {2021}
}
Comments
Final version (33 pages), to appear in the SIAM Journal on Optimization