Related papers: Loss Gradient Gaussian Width based Generalization …

200 papers

The success of deep learning is due, to a large extent, to the remarkable effectiveness of gradient-based optimization methods applied to large neural networks. The purpose of this work is to propose a modern view and a general mathematical…

Machine Learning · Computer Science 2021-05-28 Chaoyue Liu, Libin Zhu, Mikhail Belkin

We establish novel generalization bounds for learning algorithms that converge to global minima. We do so by deriving black-box stability results that only depend on the convergence of a learning algorithm and the geometry around the…

Machine Learning · Statistics 2017-10-25 Zachary Charles, Dimitris Papailiopoulos

An influential line of recent work has focused on the generalization properties of unregularized gradient-based learning procedures applied to separable linear classification with exponentially-tailed loss functions. The ability of such…

Machine Learning · Computer Science 2022-06-24 Matan Schliserman, Tomer Koren

This paper addresses the study of derivative-free smooth optimization problems, where the gradient information on the objective function is unavailable. Two novel general derivative-free methods are proposed and developed for minimizing…

Optimization and Control · Mathematics 2023-11-29 Pham Duy Khanh, Boris S. Mordukhovich, Dat Ba Tran

Optimization and generalization are two essential aspects of statistical machine learning. In this paper, we propose a framework to connect optimization with generalization by analyzing the generalization error based on the optimization…

Machine Learning · Statistics 2022-10-13 Fusheng Liu, Haizhao Yang, Soufiane Hayou, Qianxiao Li

We study the complexity of finding the global solution to stochastic nonconvex optimization when the objective function satisfies global Kurdyka-Lojasiewicz (KL) inequality and the queries from stochastic gradient oracles satisfy mild…

Optimization and Control · Mathematics 2022-10-05 Ilyas Fatkhullin, Jalal Etesami, Niao He, Negar Kiyavash

We study the statistical and computational complexities of the Polyak step size gradient descent algorithm under generalized smoothness and Lojasiewicz conditions of the population loss function, namely, the limit of the empirical loss…

Machine Learning · Computer Science 2021-10-18 Tongzheng Ren, Fuheng Cui, Alexia Atsidakou, Sujay Sanghavi, Nhat Ho

We provide sharp path-dependent generalization and excess risk guarantees for the full-batch Gradient Descent (GD) algorithm on smooth losses (possibly non-Lipschitz, possibly nonconvex). At the heart of our analysis is an upper bound on…

Machine Learning · Statistics 2023-02-13 Konstantinos E. Nikolakakis, Farzin Haddadpour, Amin Karbasi, Dionysios S. Kalogerias

Many statistical $M$-estimators are based on convex optimization problems formed by the combination of a data-dependent loss function with a norm-based regularizer. We analyze the convergence rates of projected gradient and composite…

Machine Learning · Statistics 2012-07-26 Alekh Agarwal, Sahand N. Negahban, Martin J. Wainwright

Distributionally robust optimization has emerged as an attractive way to train robust machine learning models, capturing data uncertainty and distribution shifts. Recent statistical analyses have proved that generalization guarantees of…

Optimization and Control · Mathematics 2025-01-28 Tam Le, Jérôme Malick

We develop a general framework for proving rigorous guarantees on the performance of the EM algorithm and a variant known as gradient EM. Our analysis is divided into two parts: a treatment of these algorithms at the population level (in…

Statistics Theory · Mathematics 2014-08-12 Sivaraman Balakrishnan, Martin J. Wainwright, Bin Yu

In statistical learning theory, generalization error is used to quantify the degree to which a supervised machine learning algorithm may overfit to training data. Recent work [Xu and Raginsky (2017)] has established a bound on the…

Machine Learning · Computer Science 2018-01-16 Ankit Pensia, Varun Jog, Po-Ling Loh

Policy gradient methods apply to complex, poorly understood control problems by performing stochastic gradient descent over a parameterized class of policies. Unfortunately, even for simple control problems solvable by standard dynamic…

Machine Learning · Computer Science 2022-06-22 Jalaj Bhandari, Daniel Russo

We study the problem of non-convex optimization using Stochastic Gradient Langevin Dynamics (SGLD). SGLD is a natural and popular variation of stochastic gradient descent where at each step, appropriately scaled Gaussian noise is added. To…

Machine Learning · Computer Science 2024-07-08 August Y. Chen, Ayush Sekhari, Karthik Sridharan

Proving algorithm-dependent generalization error bounds for gradient-type optimization methods has attracted significant attention recently in learning theory. However, most existing trajectory-based analyses require either restrictive…

Machine Learning · Computer Science 2022-10-12 Xuanyuan Luo, Luo Bei, Jian Li

We consider the problem of optimising the expected value of a loss functional over a nonlinear model class of functions, assuming that we have only access to realisations of the gradient of the loss. This is a classical task in statistics,…

Optimization and Control · Mathematics 2026-02-02 Robert Gruhlke, Anthony Nouy, Philipp Trunschke

We address the problem of distributed convex unconstrained optimization over networks characterized by asynchronous and possibly lossy communications. We analyze the case where the global cost function is the sum of locally coupled local…

Optimization and Control · Mathematics 2020-10-06 Marco Todescato, Nicoletta Bof, Guido Cavraro, Ruggero Carli, Luca Schenato

Optimization is widely used in statistics, and often efficiently delivers point estimates on useful spaces involving structural constraints or combinatorial structure. To quantify uncertainty, the Gibbs posterior exponentiates the negative loss…

Methodology · Statistics 2025-07-23 Cheng Zeng, Eleni Dilma, Jason Xu, Leo L Duan

A longstanding goal in deep learning research has been to precisely characterize training and generalization. However, the often complex loss landscapes of neural networks have made a theory of learning dynamics elusive. In this work, we…

The success of deep learning has led to a rising interest in the generalization property of the stochastic gradient descent (SGD) method, and stability is one popular approach to study it. Existing works based on stability have studied…

Machine Learning · Statistics 2019-03-08 Yi Zhou, Yingbin Liang, Huishuai Zhang