Related papers: Minimax Efficient Finite-Difference Stochastic Gra…

Adaptive Finite-Difference Interval Estimation for Noisy Derivative-Free Optimization

A common approach for minimizing a smooth nonlinear function is to employ finite-difference approximations to the gradient. While this can be easily performed when no error is present within the function evaluations, when the function is…

Optimization and Control · Mathematics 2022-03-24 Hao-Jun Michael Shi, Yuchen Xie, Melody Qiming Xuan, Jorge Nocedal

On the Convergence and Complexity of the Stochastic Central Finite-Difference Based Gradient Estimation Methods

This paper presents an algorithmic framework for solving unconstrained stochastic optimization problems using only stochastic function evaluations. We employ central finite-difference based gradient estimation methods to approximate the…

Optimization and Control · Mathematics 2025-01-14 Raghu Bollapragada, Cem Karamanli

Information-Theoretic Lower Bounds for Zero-Order Stochastic Gradient Estimation

In this paper we analyze the necessary number of samples to estimate the gradient of any multidimensional smooth (possibly non-convex) function in a zero-order stochastic oracle model. In this model, an estimator has access to noisy values…

Machine Learning · Computer Science 2021-07-07 Abdulrahman Alabdulkareem, Jean Honorio

An Optimal Structured Zeroth-order Algorithm for Non-smooth Optimization

Finite-difference methods are a class of algorithms designed to solve black-box optimization problems by approximating a gradient of the target function on a set of directions. In black-box optimization, the non-smooth setting is…

Optimization and Control · Mathematics 2023-11-07 Marco Rando, Cesare Molinari, Lorenzo Rosasco, Silvia Villa

Distributionally Constrained Black-Box Stochastic Gradient Estimation and Optimization

We consider stochastic gradient estimation using only black-box function evaluations, where the function argument lies within a probability simplex. This problem is motivated by gradient-descent optimization procedures in multiple…

Optimization and Control · Mathematics 2021-05-20 Henry Lam, Junhui Zhang

Mixed Finite Differences Scheme for Gradient Approximation

In this paper we focus on the linear functionals defining an approximate version of the gradient of a function. These functionals are often used when dealing with optimization problems where the computation of the gradient of the objective…

Optimization and Control · Mathematics 2021-05-21 Marco Boresta, Tommaso Colombo, Alberto De Santis, Stefano Lucidi

Generalizing Stochastic Smoothing for Differentiation and Gradient Estimation

We deal with the problem of gradient estimation for stochastic differentiable relaxations of algorithms, operators, simulators, and other non-differentiable functions. Stochastic smoothing conventionally perturbs the input of a…

Machine Learning · Computer Science 2024-10-11 Felix Petersen, Christian Borgelt, Aashwin Mishra, Stefano Ermon

Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework

In this work, we focus on the study of stochastic zeroth-order (ZO) optimization which does not require first-order gradient information and uses only function evaluations. The problem of ZO optimization has emerged in many recent machine…

Machine Learning · Statistics 2020-12-22 Pranay Sharma, Kaidi Xu, Sijia Liu, Pin-Yu Chen, Xue Lin, Pramod K. Varshney

Super-efficiency of automatic differentiation for functions defined as a minimum

In min-min optimization or max-min optimization, one has to compute the gradient of a function defined as a minimum. In most cases, the minimum has no closed-form, and an approximation is obtained via an iterative algorithm. There are two…

Machine Learning · Statistics 2020-02-11 Pierre Ablin, Gabriel Peyré, Thomas Moreau

Backpropagation through the Void: Optimizing control variates for black-box gradient estimation

Gradient-based optimization is the foundation of deep learning and reinforcement learning. Even when the mechanism being optimized is unknown or not differentiable, optimization using high-variance or biased gradient estimates is still…

Machine Learning · Computer Science 2018-02-27 Will Grathwohl, Dami Choi, Yuhuai Wu, Geoffrey Roeder, David Duvenaud

Black-box Optimizer with Implicit Natural Gradient

Black-box optimization is primarily important for many compute-intensive applications, including reinforcement learning (RL), robot control, etc. This paper presents a novel theoretical framework for black-box optimization, in which our…

Machine Learning · Computer Science 2020-09-10 Yueming Lyu, Ivor W. Tsang

Zeroth-order gradient estimators for stochastic problems with decision-dependent distributions

Stochastic optimization problems with unknown decision-dependent distributions have attracted increasing attention in recent years due to their importance in applications. Since the gradient of the objective function is inaccessible as a…

Optimization and Control · Mathematics 2025-10-30 Yuya Hikima, Akiko Takeda

Sequential stochastic blackbox optimization with zeroth-order gradient estimators

This work considers stochastic optimization problems in which the objective function values can only be computed by a blackbox corrupted by some random noise following an unknown distribution. The proposed method is based on sequential…

Optimization and Control · Mathematics 2023-08-15 Charles Audet, Jean Bigeon, Romain Couderc, Michael Kokkolaras

SAGE: A Set-based Adaptive Gradient Estimator

A new paradigm to estimate the gradient of a black-box scalar function is introduced, considering it as a member of a set of admissible gradients that are computed using existing function samples. Results on gradient estimate accuracy,…

Optimization and Control · Mathematics 2025-08-28 Lorenzo Sabug, Fredy Ruiz, Lorenzo Fagiano

Provable convergence guarantees for black-box variational inference

Black-box variational inference is widely used in situations where there is no proof that its stochastic optimization succeeds. We suggest this is due to a theoretical gap in existing stochastic optimization proofs: namely the challenge of…

Machine Learning · Computer Science 2023-12-25 Justin Domke, Guillaume Garrigos, Robert Gower

Improved Gradient-Based Optimization Over Discrete Distributions

In many applications we seek to maximize an expectation with respect to a distribution over discrete variables. Estimating gradients of such objectives with respect to the distribution parameters is a challenging problem. We analyze…

Machine Learning · Statistics 2019-06-18 Evgeny Andriyash, Arash Vahdat, Bill Macready

A Stochastic Gradient Descent Method for Globally Minimizing Nearly Convex Functions

This paper proposes a stochastic gradient descent method with an adaptive Gaussian noise term for the global minimization of nearly convex functions, which are nonconvex and possess multiple strict local minimizers. The noise term,…

Optimization and Control · Mathematics 2025-08-05 Chenglong Bao, Liang Chen, Weizhi Shao

A Correlation-induced Finite Difference Estimator

Finite difference (FD) approximation is a classic approach to stochastic gradient estimation when only noisy function realizations are available. In this paper, we first provide a sample-driven method via the bootstrap technique to estimate…

Methodology · Statistics 2024-08-21 Guo Liang, Guangwu Liu, Kun Zhang

Provable Gradient Variance Guarantees for Black-Box Variational Inference

Recent variational inference methods use stochastic gradient estimators whose variance is not well understood. Theoretical guarantees for these estimators are important to understand when these methods will or will not work. This paper…

Machine Learning · Computer Science 2019-10-29 Justin Domke

Stein-Rule Shrinkage for Stochastic Gradient Estimation in High Dimensions

Stochastic gradient methods are central to large-scale learning, but they treat mini-batch gradients as unbiased estimators, which classical decision theory shows are inadmissible in high dimensions. We formulate gradient computation as a…

Machine Learning · Computer Science 2026-02-10 M. Arashi, M. Amintoosi