Related papers: Deep Bilevel Learning

Stability and Generalization of Bilevel Programming in Hyperparameter Optimization

The (gradient-based) bilevel programming framework is widely used in hyperparameter optimization and has achieved excellent performance empirically. Previous theoretical work mainly focuses on its optimization properties, while leaving the…

Machine Learning · Computer Science 2021-10-26 Fan Bao, Guoqiang Wu, Chongxuan Li, Jun Zhu, Bo Zhang

Cross-regularization: Adaptive Model Complexity through Validation Gradients

Model regularization requires extensive manual tuning to balance complexity against overfitting. Cross-regularization resolves this tradeoff by directly adapting regularization parameters through validation gradients during training. The…

Machine Learning · Computer Science 2025-06-25 Carlos Stein Brito

Regularizing Deep Neural Networks by Noise: Its Interpretation and Optimization

Overfitting is one of the most critical challenges in deep neural networks, and there are various types of regularization methods to improve generalization performance. Injecting noises to hidden units during training, e.g., dropout, is…

Machine Learning · Computer Science 2017-11-10 Hyeonwoo Noh, Tackgeun You, Jonghwan Mun, Bohyung Han

Regularization of Inverse Problems: Deep Equilibrium Models versus Bilevel Learning

Variational regularization methods are commonly used to approximate solutions of inverse problems. In recent years, model-based variational regularization methods have often been replaced with data-driven ones such as the fields-of-expert…

Optimization and Control · Mathematics 2024-01-17 Danilo Riccio, Matthias J. Ehrhardt, Martin Benning

Learning to Reweight Examples for Robust Deep Learning

Deep neural networks have been shown to be very powerful modeling tools for many supervised learning tasks involving complex input patterns. However, they can also easily overfit to training set biases and label noises. In addition to…

Machine Learning · Computer Science 2019-05-07 Mengye Ren, Wenyuan Zeng, Bin Yang, Raquel Urtasun

Multilevel-in-Layer Training for Deep Neural Network Regression

A common challenge in regression is that for many problems, the degrees of freedom required for a high-quality solution also allows for overfitting. Regularization is a class of strategies that seek to restrict the range of possible…

Machine Learning · Computer Science 2022-11-15 Colin Ponce, Ruipeng Li, Christina Mao, Panayot Vassilevski

Bilevel learning of regularization models and their discretization for image deblurring and super-resolution

Bilevel learning is a powerful optimization technique that has extensively been employed in recent years to bridge the world of model-driven variational approaches with data-driven methods. Upon suitable parametrization of the desired…

Numerical Analysis · Mathematics 2023-10-30 Tatiana A. Bubba, Luca Calatroni, Ambra Catozzi, Serena Crisci, Thomas Pock, Monica Pragliola, Siiri Rautio, Danilo Riccio, Andrea Sebastiani

Training Deep Neural Networks Without Batch Normalization

Training neural networks is an optimization problem, and finding a decent set of parameters through gradient descent can be a difficult task. A host of techniques has been developed to aid this process before and during the training phase.…

Machine Learning · Computer Science 2020-08-19 Divya Gaur, Joachim Folz, Andreas Dengel

Simple and Effective Regularization Methods for Training on Noisily Labeled Data with Generalization Guarantee

Over-parameterized deep neural networks trained by simple first-order methods are known to be able to fit any labeling of data. Such over-fitting ability hinders generalization when mislabeled training examples are present. On the other…

Machine Learning · Computer Science 2020-10-06 Wei Hu, Zhiyuan Li, Dingli Yu

Bilevel Learning via Inexact Stochastic Gradient Descent

Bilevel optimization is a central tool in machine learning for high-dimensional hyperparameter tuning. Its applications are vast; for instance, in imaging it can be used for learning data-adaptive regularizers and optimizing forward…

Optimization and Control · Mathematics 2025-11-11 Mohammad Sadegh Salehi, Subhadip Mukherjee, Lindon Roberts, Matthias J. Ehrhardt

LABO: Towards Learning Optimal Label Regularization via Bi-level Optimization

Regularization techniques are crucial to improving the generalization performance and training efficiency of deep neural networks. Many deep learning algorithms rely on weight decay, dropout, batch/layer normalization to converge faster and…

Machine Learning · Computer Science 2025-05-23 Peng Lu, Ahmad Rashid, Ivan Kobyzev, Mehdi Rezagholizadeh, Philippe Langlais

Variational Deep Learning via Implicit Regularization

Modern deep learning models generalize remarkably well in-distribution, despite being overparametrized and trained with little to no explicit regularization. Instead, current theory credits implicit regularization imposed by the choice of…

Machine Learning · Computer Science 2026-03-17 Jonathan Wenger, Beau Coker, Juraj Marusic, John P. Cunningham

Scalable Gradient-Based Tuning of Continuous Regularization Hyperparameters

Hyperparameter selection generally relies on running multiple full training trials, with selection based on validation set performance. We propose a gradient-based approach for locally adjusting hyperparameters during training of the model.…

Machine Learning · Computer Science 2016-06-20 Jelena Luketina, Mathias Berglund, Klaus Greff, Tapani Raiko

Robust Sampling in Deep Learning

Deep learning requires regularization mechanisms to reduce overfitting and improve generalization. We address this problem by a new regularization method based on distributional robust optimization. The key idea is to modify the…

Machine Learning · Computer Science 2020-06-08 Aurora Cobo Aguilera, Antonio Artés-Rodríguez, Fernando Pérez-Cruz, Pablo Martínez Olmos

Regularizing Meta-Learning via Gradient Dropout

With the growing attention on learning-to-learn new tasks using only a few examples, meta-learning has been widely used in numerous problems such as few-shot classification, reinforcement learning, and domain generalization. However,…

Computer Vision and Pattern Recognition · Computer Science 2020-04-14 Hung-Yu Tseng, Yi-Wen Chen, Yi-Hsuan Tsai, Sifei Liu, Yen-Yu Lin, Ming-Hsuan Yang

Model Debiasing by Learnable Data Augmentation

Deep Neural Networks are well known for efficiently fitting training data, yet experiencing poor generalization capabilities whenever some kind of bias dominates over the actual task labels, resulting in models learning "shortcuts". In…

Machine Learning · Computer Science 2024-08-12 Pietro Morerio, Ruggero Ragonesi, Vittorio Murino

DL-Reg: A Deep Learning Regularization Technique using Linear Regression

Regularization plays a vital role in the context of deep learning by preventing deep neural networks from the danger of overfitting. This paper proposes a novel deep learning regularization method named as DL-Reg, which carefully reduces…

Machine Learning · Computer Science 2020-11-05 Maryam Dialameh, Ali Hamzeh, Hossein Rahmani

Learning Sparsity-Promoting Regularizers using Bilevel Optimization

We present a method for supervised learning of sparsity-promoting regularizers for denoising signals and images. Sparsity-promoting regularization is a key ingredient in solving modern signal reconstruction problems; however, the operators…

Machine Learning · Computer Science 2023-09-07 Avrajit Ghosh, Michael T. McCann, Madeline Mitchell, Saiprasad Ravishankar

Learning Compact Neural Networks with Regularization

Proper regularization is critical for speeding up training, improving generalization performance, and learning compact models that are cost efficient. We propose and analyze regularized gradient descent algorithms for learning shallow…

Machine Learning · Computer Science 2018-06-08 Samet Oymak

Gradient Regularization Improves Accuracy of Discriminative Models

Regularizing the gradient norm of the output of a neural network with respect to its inputs is a powerful technique, rediscovered several times. This paper presents evidence that gradient regularization can consistently improve…

Machine Learning · Computer Science 2018-05-28 Dániel Varga, Adrián Csiszárik, Zsolt Zombori