Related papers: Generalization Error Bounds for Optimization Algor…

200 papers

The success of deep learning has led to a rising interest in the generalization property of the stochastic gradient descent (SGD) method, and stability is one popular approach to study it. Existing works based on stability have studied…

Machine Learning · Statistics 2019-03-08 Yi Zhou, Yingbin Liang, Huishuai Zhang

Recently, a large amount of work has been devoted to the study of Markov chain stochastic gradient methods (MC-SGMs), focusing mainly on their convergence analysis for solving minimization problems. In this paper, we provide a…

Machine Learning · Statistics 2022-09-19 Puyu Wang, Yunwen Lei, Yiming Ying, Ding-Xuan Zhou

This work studies the generalization error of gradient methods. More specifically, we focus on how training steps $T$ and step-size $\eta$ might affect generalization in smooth stochastic convex optimization (SCO) problems. We first provide…

Machine Learning · Computer Science 2023-05-11 Peiyuan Zhang, Jiaye Teng, Jingzhao Zhang

In statistical learning theory, generalization error is used to quantify the degree to which a supervised machine learning algorithm may overfit to training data. Recent work [Xu and Raginsky (2017)] has established a bound on the…

Machine Learning · Computer Science 2018-01-16 Ankit Pensia, Varun Jog, Po-Ling Loh

Variance reduction (VR) methods employ stochastic gradients with decreasing variance, and they have been widely applied to solve large-scale optimization problems in machine learning because of their efficiency. Existing theoretical studies…

Machine Learning · Computer Science 2026-05-28 Yunwen Lei, Zimeng Wang, Xiaoming Yuan

In this paper we study the problem of convergence and generalization error bound of stochastic momentum for deep learning from the perspective of regularization. To do so, we first interpret momentum as solving an $\ell_2$-regularized…

Machine Learning · Computer Science 2019-06-04 Ziming Zhang, Wenju Xu, Alan Sullivan

We give a new separation result between the generalization performance of stochastic gradient descent (SGD) and of full-batch gradient descent (GD) in the fundamental stochastic convex optimization model. While for SGD it is well-known that…

Machine Learning · Computer Science 2021-07-01 Idan Amir, Tomer Koren, Roi Livni

Generalization error (also known as the out-of-sample error) measures how well the hypothesis learned from training data generalizes to previously unseen data. Proving tight generalization error bounds is a central question in statistical…

Machine Learning · Computer Science 2020-03-03 Jian Li, Xuanyuan Luo, Mingda Qiao

We establish a data-dependent notion of algorithmic stability for Stochastic Gradient Descent (SGD), and employ it to develop novel generalization bounds. This is in contrast to previous distribution-free algorithmic stability results for…

Machine Learning · Computer Science 2018-02-19 Ilja Kuzborskij, Christoph H. Lampert

Generalization error bounds for deep neural networks trained by stochastic gradient descent (SGD) are derived by combining a dynamical control of an appropriate parameter norm and the Rademacher complexity estimate based on parameter norms.…

Machine Learning · Computer Science 2023-05-30 Mingze Wang, Chao Ma

We analyze the sample complexity of full-batch Gradient Descent (GD) in the setup of non-smooth Stochastic Convex Optimization. We show that the generalization error of GD, with common choice of hyper-parameters, can be $\tilde \Theta(d/m +…

Machine Learning · Computer Science 2024-04-12 Roi Livni

We study to what extent may stochastic gradient descent (SGD) be understood as a "conventional" learning rule that achieves generalization performance by obtaining a good fit to training data. We consider the fundamental stochastic convex…

Machine Learning · Computer Science 2023-01-13 Tomer Koren, Roi Livni, Yishay Mansour, Uri Sherman

Many machine learning problems, such as Generative Adversarial Networks (GANs), AUC maximization, and robust estimation, can be formulated as minimax problems. A substantial number of studies are devoted to studying the…

Machine Learning · Computer Science 2021-07-14 Yunwen Lei, Zhenhuan Yang, Tianbao Yang, Yiming Ying

We provide sharp path-dependent generalization and excess risk guarantees for the full-batch Gradient Descent (GD) algorithm on smooth losses (possibly non-Lipschitz, possibly nonconvex). At the heart of our analysis is an upper bound on…

Machine Learning · Statistics 2023-02-13 Konstantinos E. Nikolakakis, Farzin Haddadpour, Amin Karbasi, Dionysios S. Kalogerias

Empirical Risk Minimization (ERM) algorithms are widely used in a variety of estimation and prediction tasks in signal-processing and machine learning applications. Despite their popularity, a theory that explains their statistical…

Machine Learning · Statistics 2020-07-07 Hossein Taheri, Ramtin Pedarsani, Christos Thrampoulidis

Exponential generalization bounds with near-tight rates have recently been established for uniformly stable learning algorithms. The notion of uniform stability, however, is stringent in the sense that it is invariant to the data-generating…

Machine Learning · Statistics 2022-06-09 Xiao-Tong Yuan, Ping Li

Recently, a considerable amount of work has been devoted to the study of algorithmic stability and generalization for stochastic gradient descent (SGD). However, the existing stability analysis requires imposing restrictive assumptions…

Machine Learning · Computer Science 2020-06-16 Yunwen Lei, Yiming Ying

Algorithmic stability is a classical approach to understanding and analysis of the generalization error of learning algorithms. A notable weakness of most stability-based generalization bounds is that they hold only in expectation.…

Machine Learning · Computer Science 2019-06-25 Vitaly Feldman, Jan Vondrak

The $\ell_0$-constrained empirical risk minimization ($\ell_0$-ERM) is a promising tool for high-dimensional statistical estimation. The existing analysis of $\ell_0$-ERM estimator is mostly on parameter estimation and support recovery…

Statistics Theory · Mathematics 2020-01-22 Xiao-Tong Yuan, Ping Li

Algorithm-dependent generalization error bounds are central to statistical learning theory. A learning algorithm may use a large hypothesis space, but the limited number of iterations controls its model capacity and generalization error.…

Machine Learning · Computer Science 2017-07-20 Wenlong Mou, Liwei Wang, Xiyu Zhai, Kai Zheng