Related papers: Regularized Linear Regression for Binary Classific…

One-Bit Quantization and Sparsification for Multiclass Linear Classification with Strong Regularization

We study the use of linear regression for multiclass classification in the over-parametrized regime where some of the training data is mislabeled. In such scenarios it is necessary to add an explicit regularization term, $\lambda f(w)$, for…

Machine Learning · Computer Science 2024-10-14 Reza Ghane, Danil Akhtiamov, Babak Hassibi

Benign Overfitting and the Geometry of the Ridge Regression Solution in Binary Classification

In this work, we investigate the behavior of ridge regression in an overparameterized binary classification task. We assume examples are drawn from (anisotropic) class-conditional cluster distributions with opposing means and we allow for…

Machine Learning · Statistics 2025-03-12 Alexander Tsigler, Luiz F. O. Chamon, Spencer Frei, Peter L. Bartlett

Robust Neural Network Classification via Double Regularization

The presence of mislabeled observations in data is a notoriously challenging problem in statistics and machine learning, associated with poor generalization properties for both traditional classifiers and, perhaps even more so, flexible…

Machine Learning · Statistics 2022-02-09 Olof Zetterqvist, Rebecka Jörnsten, Johan Jonasson

A new analytical approach to consistency and overfitting in regularized empirical risk minimization

This work considers the problem of binary classification: given training data $x_1, \dots, x_n$ from a certain population, together with associated labels $y_1,\dots, y_n \in \left\{0,1 \right\}$, determine the best label for an element $x$…

Statistics Theory · Mathematics 2016-07-04 Nicolas Garcia Trillos, Ryan Murray

Regularization Matters: A Nonparametric Perspective on Overparametrized Neural Network

Overparametrized neural networks trained by gradient descent (GD) can provably overfit any training data. However, the generalization guarantee may not hold for noisy data. From a nonparametric perspective, this paper studies how well…

Machine Learning · Statistics 2021-09-28 Tianyang Hu, Wenjia Wang, Cong Lin, Guang Cheng

When Does More Regularization Imply Fewer Degrees of Freedom? Sufficient Conditions and Counter Examples from Lasso and Ridge Regression

Regularization aims to improve prediction performance of a given statistical modeling approach by moving to a second approach which achieves worse training error but is expected to have fewer degrees of freedom, i.e., better agreement…

Statistics Theory · Mathematics 2013-11-13 Shachar Kaufman, Saharon Rosset

Unleashing the Potential of Regularization Strategies in Learning with Noisy Labels

In recent years, research on learning with noisy labels has focused on devising novel algorithms that can achieve robustness to noisy training labels while generalizing to clean data. These algorithms often incorporate sophisticated…

Machine Learning · Computer Science 2023-07-12 Hui Kang, Sheng Liu, Huaxi Huang, Jun Yu, Bo Han, Dadong Wang, Tongliang Liu

Binary Classification of Gaussian Mixtures: Abundance of Support Vectors, Benign Overfitting and Regularization

Deep neural networks generalize well despite being exceedingly overparameterized and being trained without explicit regularization. This curious phenomenon has inspired extensive research activity in establishing its statistical principles:…

Machine Learning · Statistics 2021-09-16 Ke Wang, Christos Thrampoulidis

Robustifying Binary Classification to Adversarial Perturbation

Despite the enormous success of machine learning models in various applications, most of these models lack resilience to (even small) perturbations in their input data. Hence, new methods to robustify machine learning models seem very…

Machine Learning · Computer Science 2020-10-30 Fariborz Salehi, Babak Hassibi

Regularization in network optimization via trimmed stochastic gradient descent with noisy label

Regularization is essential for avoiding over-fitting to training data in network optimization, leading to better generalization of the trained networks. The label noise provides a strong implicit regularization by replacing the target…

Machine Learning · Computer Science 2022-05-04 Kensuke Nakamura, Bong-Soo Sohn, Kyoung-Jae Won, Byung-Woo Hong

Learning with Noisy Labels via Sparse Regularization

Learning with noisy labels is an important and challenging task for training accurate deep neural networks. Some commonly-used loss functions, such as Cross Entropy (CE), suffer from severe overfitting to noisy labels. Robust loss functions…

Machine Learning · Computer Science 2021-08-03 Xiong Zhou, Xianming Liu, Chenyang Wang, Deming Zhai, Junjun Jiang, Xiangyang Ji

The Choice of Normalization Influences Shrinkage in Regularized Regression

Regularized models are often sensitive to the scales of the features in the data and it has therefore become standard practice to normalize (center and scale) the features before fitting the model. But there are many different ways to…

Machine Learning · Statistics 2025-07-04 Johan Larsson, Jonas Wallin

On Regularized Sparse Logistic Regression

Sparse logistic regression is for classification and feature selection simultaneously. Although many studies have been done to solve $\ell_1$-regularized logistic regression, there is no equivalently abundant work on solving sparse logistic…

Machine Learning · Computer Science 2023-10-13 Mengyuan Zhang, Kai Liu

Retraining with Predicted Hard Labels Provably Increases Model Accuracy

The performance of a model trained with noisy labels is often improved by simply retraining the model with its own predicted hard labels (i.e., 1/0 labels). Yet, a detailed theoretical characterization of this phenomenon…

Machine Learning · Computer Science 2025-05-09 Rudrajit Das, Inderjit S. Dhillon, Alessandro Epasto, Adel Javanmard, Jieming Mao, Vahab Mirrokni, Sujay Sanghavi, Peilin Zhong

Statistical and Algorithmic Insights for Semi-supervised Learning with Self-training

Self-training is a classical approach in semi-supervised learning which is successfully applied to a variety of machine learning problems. The self-training algorithm generates pseudo-labels for the unlabeled examples and progressively refines…

Machine Learning · Computer Science 2020-06-22 Samet Oymak, Talha Cihad Gulcu

Learning Less-Overlapping Representations

In representation learning (RL), how to make the learned representations easy to interpret and less overfitted to training data are two important but challenging issues. To address these problems, we study a new type of regularization…

Machine Learning · Computer Science 2017-11-28 Pengtao Xie, Hongbao Zhang, Eric P. Xing

To tune or not to tune, a case study of ridge logistic regression in small or sparse datasets

For finite samples with binary outcomes, penalized logistic regression such as ridge logistic regression (RR) has the potential of achieving smaller mean squared errors (MSE) of coefficients and predictions than maximum likelihood…

Methodology · Statistics 2021-01-28 Hana Šinkovec, Georg Heinze, Rok Blagus, Angelika Geroldinger

Regularization via Structural Label Smoothing

Regularization is an effective way to promote the generalization performance of machine learning models. In this paper, we focus on label smoothing, a form of output distribution regularization that prevents overfitting of a neural network…

Machine Learning · Computer Science 2020-07-07 Weizhi Li, Gautam Dasarathy, Visar Berisha

Self-Adaptive Training: beyond Empirical Risk Minimization

We propose self-adaptive training, a new training algorithm that dynamically corrects problematic training labels by model predictions without incurring extra computational cost, to improve generalization of deep learning for potentially…

Machine Learning · Computer Science 2020-10-01 Lang Huang, Chao Zhang, Hongyang Zhang

Consistency Regularization Can Improve Robustness to Label Noise

Consistency regularization is a commonly-used technique for semi-supervised and self-supervised learning. It is an auxiliary objective function that encourages the prediction of the network to be similar in the vicinity of the observed…

Machine Learning · Computer Science 2021-10-05 Erik Englesson, Hossein Azizpour