Related papers: Label Smoothing Improves Neural Source Code Summar…

Adaptive Label Smoothing with Self-Knowledge in Natural Language Generation

Overconfidence has been shown to impair generalization and calibration of a neural network. Previous studies remedy this issue by adding a regularization term to a loss function, preventing a model from making a peaked distribution. Label…

Machine Learning · Computer Science 2022-10-26 Dongkyu Lee, Ka Chun Cheung, Nevin L. Zhang

Regularization via Structural Label Smoothing

Regularization is an effective way to promote the generalization performance of machine learning models. In this paper, we focus on label smoothing, a form of output distribution regularization that prevents overfitting of a neural network…

Machine Learning · Computer Science 2020-07-07 Weizhi Li, Gautam Dasarathy, Visar Berisha

An Investigation of how Label Smoothing Affects Generalization

It has been hypothesized that label smoothing can reduce overfitting and improve generalization, and current empirical evidence seems to corroborate these effects. However, there is a lack of mathematical understanding of when and why such…

Machine Learning · Computer Science 2020-10-27 Blair Chen, Liu Ziyin, Zihao Wang, Paul Pu Liang

When Does Label Smoothing Help?

The generalization and learning speed of a multi-class neural network can often be significantly improved by using soft targets that are a weighted average of the hard targets and the uniform distribution over labels. Smoothing the labels…

Machine Learning · Computer Science 2020-06-12 Rafael Müller, Simon Kornblith, Geoffrey Hinton

Label Smoothing++: Enhanced Label Regularization for Training Neural Networks

Training neural networks with one-hot target labels often results in overconfidence and overfitting. Label smoothing addresses this issue by perturbing the one-hot target labels by adding a uniform probability vector to create a regularized…

Computer Vision and Pattern Recognition · Computer Science 2025-09-09 Sachin Chhabra, Hemanth Venkateswara, Baoxin Li

Semantic Label Smoothing for Sequence to Sequence Problems

Label smoothing has been shown to be an effective regularization strategy in classification, that prevents overfitting and helps in label de-noising. However, extending such methods directly to seq2seq settings, such as Machine Translation,…

Computation and Language · Computer Science 2020-10-16 Michal Lukasik, Himanshu Jain, Aditya Krishna Menon, Seungyeon Kim, Srinadh Bhojanapalli, Felix Yu, Sanjiv Kumar

Generating confidence calibrated outputs is of utmost importance for the applications of deep neural networks in safety-critical decision-making systems. The output of a neural network is a probability distribution where the scores are…

Machine Learning · Computer Science 2021-09-17 Chihuang Liu, Joseph JaJa

To Smooth or Not? When Label Smoothing Meets Noisy Labels

Label smoothing (LS) is an arising learning paradigm that uses the positively weighted average of both the hard training labels and uniformly distributed soft labels. It was shown that LS serves as a regularizer for training data with hard…

Machine Learning · Computer Science 2022-06-28 Jiaheng Wei, Hangyu Liu, Tongliang Liu, Gang Niu, Masashi Sugiyama, Yang Liu

Generalized Entropy Regularization or: There's Nothing Special about Label Smoothing

Prior work has explored directly regularizing the output distributions of probabilistic models to alleviate peaky (i.e. over-confident) predictions, a common sign of overfitting. This class of techniques, of which label smoothing is one,…

Computation and Language · Computer Science 2020-05-13 Clara Meister, Elizabeth Salesky, Ryan Cotterell

Instance-based Label Smoothing For Better Calibrated Classification Networks

Label smoothing is widely used in deep neural networks for multi-class classification. While it enhances model generalization and reduces overconfidence by aiming to lower the probability for the predicted class, it distorts the predicted…

Machine Learning · Computer Science 2021-10-12 Mohamed Maher, Meelis Kull

Focus on the Target's Vocabulary: Masked Label Smoothing for Machine Translation

Label smoothing and vocabulary sharing are two widely used techniques in neural machine translation models. However, we argue that simply applying both techniques can be conflicting and even leads to sub-optimal performance. When allocating…

Computation and Language · Computer Science 2022-03-14 Liang Chen, Runxin Xu, Baobao Chang

Regularization via Adaptive Pairwise Label Smoothing

Label Smoothing (LS) is an effective regularizer to improve the generalization of state-of-the-art deep models. For each training sample the LS strategy smooths the one-hot encoded training signal by distributing its distribution mass over…

Machine Learning · Computer Science 2020-12-04 Hongyu Guo

Delving Deep into Label Smoothing

Label smoothing is an effective regularization tool for deep neural networks (DNNs), which generates soft labels by applying a weighted average between the uniform distribution and the hard label. It is often used to reduce the overfitting…

Computer Vision and Pattern Recognition · Computer Science 2021-07-23 Chang-Bin Zhang, Peng-Tao Jiang, Qibin Hou, Yunchao Wei, Qi Han, Zhen Li, Ming-Ming Cheng

Posterior Label Smoothing for Node Classification

Label smoothing is a widely studied regularization technique in machine learning. However, its potential for node classification in graph-structured data, spanning homophilic to heterophilic graphs, remains largely unexplored. We introduce…

Machine Learning · Computer Science 2026-02-02 Jaeseung Heo, Moonjeong Park, Dongwoo Kim

Towards Understanding Why Label Smoothing Degrades Selective Classification and How to Fix It

Label smoothing (LS) is a popular regularisation method for training neural networks as it is effective in improving test accuracy and is simple to implement. "Hard" one-hot labels are "smoothed" by uniformly distributing probability…

Machine Learning · Computer Science 2025-02-21 Guoxuan Xia, Olivier Laurent, Gianni Franchi, Christos-Savvas Bouganis

Does label smoothing mitigate label noise?

Label smoothing is commonly used in training deep learning models, wherein one-hot training labels are mixed with uniform label vectors. Empirically, smoothing has been shown to improve both predictive performance and model calibration. In…

Machine Learning · Computer Science 2020-03-06 Michal Lukasik, Srinadh Bhojanapalli, Aditya Krishna Menon, Sanjiv Kumar

The Implicit Length Bias of Label Smoothing on Beam Search Decoding

Label smoothing is ubiquitously applied in Neural Machine Translation (NMT) training. While label smoothing offers a desired regularization effect during model training, in this paper we demonstrate that it nevertheless introduces length…

Computation and Language · Computer Science 2022-05-03 Bowen Liang, Pidong Wang, Yuan Cao

LABO: Towards Learning Optimal Label Regularization via Bi-level Optimization

Regularization techniques are crucial to improving the generalization performance and training efficiency of deep neural networks. Many deep learning algorithms rely on weight decay, dropout, batch/layer normalization to converge faster and…

Machine Learning · Computer Science 2025-05-23 Peng Lu, Ahmad Rashid, Ivan Kobyzev, Mehdi Rezagholizadeh, Philippe Langlais

Locally Adaptive Label Smoothing for Predictive Churn

Training modern neural networks is an inherently noisy process that can lead to high prediction churn: disagreements between re-trainings of the same model due to factors such as randomization in the parameter initialization and…

Machine Learning · Computer Science 2021-06-15 Dara Bahri, Heinrich Jiang

Cross Entropy versus Label Smoothing: A Neural Collapse Perspective

Label smoothing loss is a widely adopted technique to mitigate overfitting in deep neural networks. This paper studies label smoothing from the perspective of Neural Collapse (NC), a powerful empirical and theoretical framework which…

Machine Learning · Computer Science 2025-09-30 Li Guo, George Andriopoulos, Zifan Zhao, Shuyang Ling, Zixuan Dong, Keith Ross