Related papers: Glocal Hypergradient Estimation with Koopman Opera…
In this paper, we propose a novel approach to solving optimization problems by reformulating the optimization problem into a dynamical system, followed by the adaptive spectral Koopman (ASK) method. The Koopman operator, employed in our…
Most models in machine learning contain at least one hyperparameter to control for model complexity. Choosing an appropriate set of hyperparameters is both crucial in terms of model accuracy and computationally challenging. In this work we…
Hyperparameter selection generally relies on running multiple full training trials, with selection based on validation set performance. We propose a gradient-based approach for locally adjusting hyperparameters during training of the model.…
Hyperparameter optimization in machine learning is often achieved using naive techniques that only lead to an approximate set of hyperparameters. Although techniques such as Bayesian optimization perform an intelligent search on a given…
This work demonstrates the utility of gradients for the global optimization of certain differentiable functions with many suboptimal local minima. To this end, a principle for generating search directions from non-local quadratic…
Bilevel optimization is a powerful tool for many machine learning problems, such as hyperparameter optimization and meta-learning. Estimating hypergradients (also known as implicit gradients) is crucial for developing gradient-based methods…
Gradient-based hyperparameter optimization (HPO) have emerged recently, leveraging bilevel programming techniques to optimize hyperparameter by estimating hypergradient w.r.t. validation loss. Nevertheless, previous theoretical works mainly…
In the recent years, various gradient descent algorithms including the methods of gradient descent, gradient descent with momentum, adaptive gradient (AdaGrad), root-mean-square propagation (RMSProp) and adaptive moment estimation (Adam)…
This paper studies a class of distributed optimization problems with coupled equality constraints in networked systems. Many existing distributed algorithms rely on solving local subproblems via the $\operatorname{argmin}$ operator in each…
Gradient-based solvers risk convergence to local optima, leading to incorrect researcher inference. Heuristic-based algorithms are able to ``break free" of these local optima to eventually converge to the true global optimum. However, given…
This paper presents an interpretable machine learning approach that characterizes load dynamics within an operator-theoretic framework for electricity load forecasting in power grids. We represent the dynamics of load data using the Koopman…
We introduce Kalman Gradient Descent, a stochastic optimization algorithm that uses Kalman filtering to adaptively reduce gradient variance in stochastic gradient descent by filtering the gradient estimates. We present both a theoretical…
Gradient-based methods are widely used to solve various optimization problems, however, they are either constrained by local optima dilemmas, simple convex constraints, and continuous differentiability requirements, or limited to…
Evaluating the adversarial robustness of machine learning models using gradient-based attacks is challenging. In this work, we show that hyperparameter optimization can improve fast minimum-norm attacks by automating the selection of the…
Koopman analysis of a general dynamics system provides a linear Koopman operator and an embedded eigenfunction space, enabling the application of standard techniques from linear analysis. However, in practice, deriving exact operators and…
A recent novel extension of multi-output Gaussian processes handles heterogeneous outputs assuming that each output has its own likelihood function. It uses a vector-valued Gaussian process prior to jointly model all likelihoods' parameters…
Estimating hyperparameters has been a long-standing problem in machine learning. We consider the case where the task at hand is modeled as the solution to an optimization problem. Here the exact gradient with respect to the hyperparameters…
Multilevel optimization has gained renewed interest in machine learning due to its promise in applications such as hyperparameter tuning and continual learning. However, existing methods struggle with the inherent difficulty of efficiently…
Koopman operator theory, a powerful framework for discovering the underlying dynamics of nonlinear dynamical systems, was recently shown to be intimately connected with neural network training. In this work, we take the first steps in…
Working with any gradient-based machine learning algorithm involves the tedious task of tuning the optimizer's hyperparameters, such as its step size. Recent work has shown how the step size can itself be optimized alongside the model…