Related papers: Regularization and Optimal Multiclass Learning

Optimal Learners for Multiclass Problems

The fundamental theorem of statistical learning states that for binary classification problems, any Empirical Risk Minimization (ERM) learning rule has close to optimal sample complexity. In this paper we seek for a generic optimal learner…

Machine Learning · Computer Science 2014-05-13 Amit Daniely, Shai Shalev-Shwartz

Self-Regularized Learning Methods

We introduce a general framework for analyzing learning algorithms based on the notion of self-regularization, which captures implicit complexity control without requiring explicit regularization. This is motivated by previous observations…

Machine Learning · Statistics 2026-03-19 Max Schölpple, Liu Fanghui, Ingo Steinwart

Meta-Learned Invariant Risk Minimization

Empirical Risk Minimization (ERM) based machine learning algorithms have suffered from weak generalization performance on data obtained from out-of-distribution (OOD). To address this problem, Invariant Risk Minimization (IRM) objective was…

Machine Learning · Computer Science 2021-03-25 Jun-Hyun Bae, Inchul Choi, Minho Lee

Robust Empirical Risk Minimization with Tolerance

Developing simple, sample-efficient learning algorithms for robust classification is a pressing issue in today's tech-dominated world, and current theoretical techniques requiring exponential sample complexity and complicated improper…

Machine Learning · Computer Science 2023-02-07 Robi Bhattacharjee, Max Hopkins, Akash Kumar, Hantao Yu, Kamalika Chaudhuri

On the Efficiency of ERM in Feature Learning

Given a collection of feature maps indexed by a set $\mathcal{T}$, we study the performance of empirical risk minimization (ERM) on regression problems with square loss over the union of the linear classes induced by these feature maps.…

Machine Learning · Statistics 2024-11-20 Ayoub El Hanchi, Chris J. Maddison, Murat A. Erdogdu

A Statistical Theory of Regularization-Based Continual Learning

We provide a statistical analysis of regularization-based continual learning on a sequence of linear regression tasks, with emphasis on how different regularization terms affect the model performance. We first derive the convergence rate…

Machine Learning · Computer Science 2024-06-11 Xuyang Zhao, Huiyuan Wang, Weiran Huang, Wei Lin

Do highly over-parameterized neural networks generalize since bad solutions are rare?

We study over-parameterized classifiers where Empirical Risk Minimization (ERM) for learning leads to zero training error. In these over-parameterized settings there are many global minima with zero training error, some of which generalize…

Machine Learning · Computer Science 2023-12-05 Julius Martinetz, Thomas Martinetz

Forget Less, Retain More: A Lightweight Regularizer for Rehearsal-Based Continual Learning

Deep neural networks suffer from catastrophic forgetting, where performance on previous tasks degrades after training on a new task. This issue arises due to the model's tendency to overwrite previously acquired knowledge with new…

Machine Learning · Computer Science 2025-12-02 Lama Alssum, Hasan Abed Al Kader Hammoud, Motasem Alfarra, Juan C Leon Alcazar, Bernard Ghanem

Fundamental Limits of Ridge-Regularized Empirical Risk Minimization in High Dimensions

Empirical Risk Minimization (ERM) algorithms are widely used in a variety of estimation and prediction tasks in signal-processing and machine learning applications. Despite their popularity, a theory that explains their statistical…

Machine Learning · Statistics 2020-07-07 Hossein Taheri, Ramtin Pedarsani, Christos Thrampoulidis

Meta-Regularization by Enforcing Mutual-Exclusiveness

Meta-learning models have two objectives. First, they need to be able to make predictions over a range of task distributions while utilizing only a small amount of training data. Second, they also need to adapt to new novel unseen tasks at…

Machine Learning · Computer Science 2021-01-26 Edwin Pan, Pankaj Rajak, Shubham Shrivastava

Invariant Risk Minimization

We introduce Invariant Risk Minimization (IRM), a learning paradigm to estimate invariant correlations across multiple training distributions. To achieve this goal, IRM learns a data representation such that the optimal classifier, on top…

Machine Learning · Statistics 2020-03-31 Martin Arjovsky, Léon Bottou, Ishaan Gulrajani, David Lopez-Paz

Scaling-up Empirical Risk Minimization: Optimization of Incomplete U-statistics

In a wide range of statistical learning problems such as ranking, clustering or metric learning among others, the risk is accurately estimated by $U$-statistics of degree $d\geq 1$, i.e. functionals of the training data with low variance…

Machine Learning · Statistics 2019-01-25 Stéphan Clémençon, Aurélien Bellet, Igor Colin

Regularization Matters in Policy Optimization

Deep Reinforcement Learning (Deep RL) has been receiving increasingly more attention thanks to its encouraging performance on a variety of control tasks. Yet, conventional regularization techniques in training neural networks (e.g., $L_2$…

Machine Learning · Computer Science 2021-11-30 Zhuang Liu, Xuanlin Li, Bingyi Kang, Trevor Darrell

Developing and Improving Risk Models using Machine-learning Based Algorithms

The objective of this study is to develop a good risk model for classifying business delinquency by simultaneously exploring several machine learning based methods including regularization, hyper-parameter optimization, and model ensembling…

Machine Learning · Computer Science 2020-10-13 Yan Wang, Xuelei Sherry Ni

Heterogeneous Risk Minimization

Machine learning algorithms with empirical risk minimization usually suffer from poor generalization performance due to the greedy exploitation of correlations among the training data, which are not stable under distributional shifts.…

Machine Learning · Computer Science 2021-06-18 Jiashuo Liu, Zheyuan Hu, Peng Cui, Bo Li, Zheyan Shen

On the regularized risk of distributionally robust learning over deep neural networks

In this paper we explore the relation between distributionally robust learning and different forms of regularization to enforce robustness of deep neural networks. In particular, starting from a concrete min-max distributionally robust…

Optimization and Control · Mathematics 2022-03-29 Camilo Garcia Trillos, Nicolas Garcia Trillos

Safe Grid Search with Optimal Complexity

Popular machine learning estimators involve regularization parameters that can be challenging to tune, and standard strategies rely on grid search for this task. In this paper, we revisit the techniques of approximating the regularization…

Machine Learning · Statistics 2019-05-28 Eugene Ndiaye, Tam Le, Olivier Fercoq, Joseph Salmon, Ichiro Takeuchi

An Analysis of Regularized Approaches for Constrained Machine Learning

Regularization-based approaches for injecting constraints in Machine Learning (ML) were introduced to improve a predictive model via expert knowledge. We tackle the issue of finding the right balance between the loss (the accuracy of the…

Machine Learning · Computer Science 2020-05-22 Michele Lombardi, Federico Baldo, Andrea Borghesi, Michela Milano

Learning Augmentation Distributions using Transformed Risk Minimization

We propose a new Transformed Risk Minimization (TRM) framework as an extension of classical risk minimization. In TRM, we optimize not only over predictive models, but also over data transformations; specifically over distributions…

Machine Learning · Computer Science 2023-10-09 Evangelos Chatzipantazis, Stefanos Pertigkiozoglou, Kostas Daniilidis, Edgar Dobriban

Regularization Shortcomings for Continual Learning

In most machine learning algorithms, training data is assumed to be independent and identically distributed (iid). When it is not the case, the algorithm's performances are challenged, leading to the famous phenomenon of catastrophic…

Machine Learning · Computer Science 2021-04-06 Timothée Lesort, Andrei Stoian, David Filliat