Related papers: Optimizing generalization on the train set: a nove…

On improving generalization in a class of learning problems with the method of small parameters for weakly-controlled optimal gradient systems

In this paper, we provide a mathematical framework for improving generalization in a class of learning problems which is related to point estimations for modeling of high-dimensional nonlinear functions. In particular, we consider a…

Optimization and Control · Mathematics 2024-12-13 Getachew K. Befekadu

From Optimization Dynamics to Generalization Bounds via {\L}ojasiewicz Gradient Inequality

Optimization and generalization are two essential aspects of statistical machine learning. In this paper, we propose a framework to connect optimization with generalization by analyzing the generalization error based on the optimization…

Machine Learning · Statistics 2022-10-13 Fusheng Liu , Haizhao Yang , Soufiane Hayou , Qianxiao Li

Penalizing Gradient Norm for Efficiently Improving Generalization in Deep Learning

How to train deep neural networks (DNNs) to generalize well is a central concern in deep learning, especially for severely overparameterized networks nowadays. In this paper, we propose an effective method to improve the model…

Machine Learning · Computer Science 2022-06-28 Yang Zhao , Hao Zhang , Xiuyuan Hu

Muddling Labels for Regularization, a novel approach to generalization

Generalization is a central problem in Machine Learning. Indeed most prediction methods require careful calibration of hyperparameters usually carried out on a hold-out \textit{validation} dataset to achieve generalization. The main goal of…

Machine Learning · Statistics 2021-02-18 Karim Lounici , Katia Meziani , Benjamin Riu

Learning Acceleration Algorithms for Fast Parametric Convex Optimization with Certified Robustness

We develop a machine-learning framework to learn hyperparameter sequences for accelerated first-order methods (e.g., the step size and momentum sequences in accelerated gradient descent) to quickly solve parametric convex optimization…

Optimization and Control · Mathematics 2025-10-07 Rajiv Sambharya , Jinho Bok , Nikolai Matni , George Pappas

Gradient-based Regularization Parameter Selection for Problems with Non-smooth Penalty Functions

In high-dimensional and/or non-parametric regression problems, regularization (or penalization) is used to control model complexity and induce desired structure. Each penalty has a weight parameter that indicates how strongly the structure…

Machine Learning · Statistics 2017-03-30 Jean Feng , Noah Simon

Self-Regularized Learning Methods

We introduce a general framework for analyzing learning algorithms based on the notion of self-regularization, which captures implicit complexity control without requiring explicit regularization. This is motivated by previous observations…

Machine Learning · Statistics 2026-03-19 Max Schölpple , Liu Fanghui , Ingo Steinwart

Gradient Centralization: A New Optimization Technique for Deep Neural Networks

Optimization techniques are of great importance to effectively and efficiently train a deep neural network (DNN). It has been shown that using the first and second order statistics (e.g., mean and variance) to perform Z-score…

Computer Vision and Pattern Recognition · Computer Science 2020-04-09 Hongwei Yong , Jianqiang Huang , Xiansheng Hua , Lei Zhang

Gradient Agreement as an Optimization Objective for Meta-Learning

This paper presents a novel optimization method for maximizing generalization over tasks in meta-learning. The goal of meta-learning is to learn a model for an agent adapting rapidly when presented with previously unseen tasks. Tasks are…

Machine Learning · Computer Science 2018-10-19 Amir Erfan Eshratifar , David Eigen , Massoud Pedram

Deep Bilevel Learning

We present a novel regularization approach to train neural networks that enjoys better generalization and test error than standard stochastic gradient descent. Our approach is based on the principles of cross-validation, where a validation…

Computer Vision and Pattern Recognition · Computer Science 2018-09-06 Simon Jenni , Paolo Favaro

Optimization for Supervised Machine Learning: Randomized Algorithms for Data and Parameters

Many key problems in machine learning and data science are routinely modeled as optimization problems and solved via optimization algorithms. With the increase of the volume of data and the size and complexity of the statistical models used…

Optimization and Control · Mathematics 2020-08-28 Filip Hanzely

Scalable Gradient-Based Tuning of Continuous Regularization Hyperparameters

Hyperparameter selection generally relies on running multiple full training trials, with selection based on validation set performance. We propose a gradient-based approach for locally adjusting hyperparameters during training of the model.…

Machine Learning · Computer Science 2016-06-20 Jelena Luketina , Mathias Berglund , Klaus Greff , Tapani Raiko

An Improved Analysis of Training Over-parameterized Deep Neural Networks

A recent line of research has shown that gradient-based algorithms with random initialization can converge to the global minima of the training loss for over-parameterized (i.e., sufficiently wide) deep neural networks. However, the…

Machine Learning · Computer Science 2019-06-12 Difan Zou , Quanquan Gu

Optimisation & Generalisation in Networks of Neurons

The goal of this thesis is to develop the optimisation and generalisation theoretic foundations of learning in artificial neural networks. On optimisation, a new theoretical framework is proposed for deriving architecture-dependent…

Neural and Evolutionary Computing · Computer Science 2022-10-20 Jeremy Bernstein

A Generalizable Approach to Learning Optimizers

A core issue with learning to optimize neural networks has been the lack of generalization to real world problems. To address this, we describe a system designed from a generalization-first perspective, learning to update optimizer…

Machine Learning · Computer Science 2021-06-09 Diogo Almeida , Clemens Winter , Jie Tang , Wojciech Zaremba

Domain Generalization via Pareto Optimal Gradient Matching

In this study, we address the gradient-based domain generalization problem, where predictors aim for consistent gradient directions across different domains. Existing methods have two main challenges. First, minimization of gradient…

Machine Learning · Computer Science 2025-07-22 Khoi Do , Duong Nguyen , Nam-Khanh Le , Quoc-Viet Pham , Binh-Son Hua , Won-Joo Hwang

Learning Hard Optimization Problems: A Data Generation Perspective

Optimization problems are ubiquitous in our societies and are present in almost every segment of the economy. Most of these optimization problems are NP-hard and computationally demanding, often requiring approximate solutions for…

Optimization and Control · Mathematics 2021-06-23 James Kotary , Ferdinando Fioretto , Pascal Van Hentenryck

Towards Data-Algorithm Dependent Generalization: a Case Study on Overparameterized Linear Regression

One of the major open problems in machine learning is to characterize generalization in the overparameterized regime, where most traditional generalization bounds become inconsistent even for overparameterized linear regression. In many…

Machine Learning · Computer Science 2023-11-22 Jing Xu , Jiaye Teng , Yang Yuan , Andrew Chi-Chih Yao

A Novel DNN Training Framework via Data Sampling and Multi-Task Optimization

Conventional DNN training paradigms typically rely on one training set and one validation set, obtained by partitioning an annotated dataset used for training, namely gross training set, in a certain way. The training set is used for…

Neural and Evolutionary Computing · Computer Science 2020-07-03 Boyu Zhang , A. K. Qin , Hong Pan , Timos Sellis

Train simultaneously, generalize better: Stability of gradient-based minimax learners

The success of minimax learning problems of generative adversarial networks (GANs) has been observed to depend on the minimax optimization algorithm used for their training. This dependence is commonly attributed to the convergence speed…

Machine Learning · Computer Science 2020-10-26 Farzan Farnia , Asuman Ozdaglar