Related papers: Quantifying Overfitting along the Regularization P…

A Minimum Description Length Approach to Regularization in Neural Networks

State-of-the-art neural networks can be trained to become remarkable solutions to many problems. But while these architectures can express symbolic, perfect solutions, trained models often arrive at approximations instead. We show that the…

Machine Learning · Computer Science 2025-09-09 Matan Abudy, Orr Well, Emmanuel Chemla, Roni Katzir, Nur Lan

Robust Neural Network Classification via Double Regularization

The presence of mislabeled observations in data is a notoriously challenging problem in statistics and machine learning, associated with poor generalization properties for both traditional classifiers and, perhaps even more so, flexible…

Machine Learning · Statistics 2022-02-09 Olof Zetterqvist, Rebecka Jörnsten, Johan Jonasson

High-dimensional Penalty Selection via Minimum Description Length Principle

We tackle the problem of penalty selection of regularization on the basis of the minimum description length (MDL) principle. In particular, we consider that the design space of the penalty function is high-dimensional. In this situation,…

Machine Learning · Statistics 2018-04-27 Kohei Miyaguchi, Kenji Yamanishi

One-Bit Quantization and Sparsification for Multiclass Linear Classification with Strong Regularization

We study the use of linear regression for multiclass classification in the over-parametrized regime where some of the training data is mislabeled. In such scenarios it is necessary to add an explicit regularization term, $\lambda f(w)$, for…

Machine Learning · Computer Science 2024-10-14 Reza Ghane, Danil Akhtiamov, Babak Hassibi

A Doubly Regularized Linear Discriminant Analysis Classifier with Automatic Parameter Selection

Linear discriminant analysis (LDA) based classifiers tend to falter in many practical settings where the training data size is smaller than, or comparable to, the number of features. As a remedy, different regularized LDA (RLDA) methods…

Machine Learning · Computer Science 2021-03-30 Alam Zaib, Tarig Ballal, Shahid Khattak, Tareq Y. Al-Naffouri

Revisiting minimum description length complexity in overparameterized models

Complexity is a fundamental concept underlying statistical learning theory that aims to inform generalization performance. Parameter count, while successful in low-dimensional settings, is not well-justified for overparameterized settings…

Machine Learning · Computer Science 2023-10-16 Raaz Dwivedi, Chandan Singh, Bin Yu, Martin J. Wainwright

Learning Curves and Benign Overfitting of Spectral Algorithms in Large Dimensions

Existing large-dimensional theory for spectral algorithms resolves either the optimally tuned point or the interpolation limit, but leaves the under-regularized regime unexplored. We study the learning curve and benign overfitting of…

Machine Learning · Statistics 2026-04-28 Weihao Lu, Qian Lin, Yingcun Xia, Dongming Huang

A new analytical approach to consistency and overfitting in regularized empirical risk minimization

This work considers the problem of binary classification: given training data $x_1, \dots, x_n$ from a certain population, together with associated labels $y_1,\dots, y_n \in \left\{0,1 \right\}$, determine the best label for an element $x$…

Statistics Theory · Mathematics 2016-07-04 Nicolas Garcia Trillos, Ryan Murray

PDL: Regularizing Multiple Instance Learning with Progressive Dropout Layers

Multiple instance learning (MIL) was a weakly supervised learning approach that sought to assign binary class labels to collections of instances known as bags. However, due to their weak supervision nature, the MIL methods were susceptible…

Computer Vision and Pattern Recognition · Computer Science 2024-05-27 Wenhui Zhu, Peijie Qiu, Xiwen Chen, Oana M. Dumitrascu, Yalin Wang

The Role of Mutual Information in Variational Classifiers

Overfitting data is a well-known phenomenon related with the generation of a model that mimics too closely (or exactly) a particular instance of data, and may therefore fail to predict future observations reliably. In practice, this…

Machine Learning · Statistics 2023-04-14 Matias Vera, Leonardo Rey Vega, Pablo Piantanida

High Dimensional Binary Classification under Label Shift: Phase Transition and Regularization

Label Shift has been widely believed to be harmful to the generalization performance of machine learning models. Researchers have proposed many approaches to mitigate the impact of the label shift, e.g., balancing the training data.…

Machine Learning · Computer Science 2022-12-09 Jiahui Cheng, Minshuo Chen, Hao Liu, Tuo Zhao, Wenjing Liao

Regularization Matters: A Nonparametric Perspective on Overparametrized Neural Network

Overparametrized neural networks trained by gradient descent (GD) can provably overfit any training data. However, the generalization guarantee may not hold for noisy data. From a nonparametric perspective, this paper studies how well…

Machine Learning · Statistics 2021-09-28 Tianyang Hu, Wenjia Wang, Cong Lin, Guang Cheng

Minimum Description Length and Generalization Guarantees for Representation Learning

A major challenge in designing efficient statistical supervised learning algorithms is finding representations that perform well not only on available training samples but also on unseen data. While the study of representation learning has…

Machine Learning · Statistics 2024-02-06 Milad Sefidgaran, Abdellatif Zaidi, Piotr Krasnowski

Path Regularization: A Near-Complete and Optimal Nonasymptotic Generalization Theory for Multilayer Neural Networks and Double Descent Phenomenon

Path regularization has shown to be a very effective regularization to train neural networks, leading to a better generalization property than common regularizations i.e. weight decay, etc. We propose a first near-complete (as will be made…

Machine Learning · Computer Science 2026-04-09 Hao Yu

Regularized Linear Regression for Binary Classification

Regularized linear regression is a promising approach for binary classification problems in which the training set has noisy labels since the regularization term can help to avoid interpolating the mislabeled data points. In this paper we…

Machine Learning · Computer Science 2023-11-07 Danil Akhtiamov, Reza Ghane, Babak Hassibi

Minimum Description Length Principle in Supervised Learning with Application to Lasso

The minimum description length (MDL) principle in supervised learning is studied. One of the most important theories for the MDL principle is Barron and Cover's theory (BC theory), which gives a mathematical justification of the MDL…

Information Theory · Computer Science 2016-07-12 Masanori Kawakita, Jun'ichi Takeuchi

Improved MDL Estimators Using Fiber Bundle of Local Exponential Families for Non-exponential Families

Minimum Description Length (MDL) estimators, using two-part codes for universal coding, are analyzed. For general parametric families under certain regularity conditions, we introduce a two-part code whose regret is close to the minimax…

Information Theory · Computer Science 2023-11-08 Kohei Miyamoto, Andrew R. Barron, Jun'ichi Takeuchi

Benign Overfitting under Learning Rate Conditions for $\alpha$ Sub-exponential Input

This paper investigates the phenomenon of benign overfitting in binary classification problems with heavy-tailed input distributions, extending the analysis of maximum margin classifiers to $\alpha$ sub-exponential distributions ($\alpha…

Machine Learning · Computer Science 2024-10-17 Kota Okudo, Kei Kobayashi

Minimum Description Length codes are critical

In the Minimum Description Length (MDL) principle, learning from the data is equivalent to an optimal coding problem. We show that the codes that achieve optimal compression in MDL are critical in a very precise sense. First, when they are…

Methodology · Statistics 2018-10-03 Ryan John Cubero, Matteo Marsili, Yasser Roudi

A Penalty Approach for Normalizing Feature Distributions to Build Confounder-Free Models

Translating machine learning algorithms into clinical applications requires addressing challenges related to interpretability, such as accounting for the effect of confounding variables (or metadata). Confounding variables affect the…

Machine Learning · Computer Science 2022-07-12 Anthony Vento, Qingyu Zhao, Robert Paul, Kilian M. Pohl, Ehsan Adeli