Related papers: Hypothesis Testing for Class-Conditional Noise Usi…

Hypothesis Testing for Class-Conditional Label Noise

In this paper we provide machine learning practitioners with tools to answer the question: is there class-conditional noise in my labels? In particular, we present hypothesis tests to check whether a given dataset of instance-label pairs…

Machine Learning · Computer Science 2021-06-02 Rafael Poyiadzi, Weisong Yang, Niall Twomey, Raul Santos-Rodriguez

Convex and Non-convex Approaches for Statistical Inference with Class-Conditional Noisy Labels

We study the problem of estimation and testing in logistic regression with class-conditional noise in the observed labels, which has an important implication in the Positive-Unlabeled (PU) learning setting. With the key observation that the…

Methodology · Statistics 2020-08-14 Hyebin Song, Ran Dai, Garvesh Raskutti, Rina Foygel Barber

Robust Binary Hypothesis Testing Under Contaminated Likelihoods

In hypothesis testing, the phenomenon of label noise, in which hypothesis labels are switched at random, contaminates the likelihood functions. In this paper, we develop a new method to determine the decision rule when we do not have…

Information Theory · Computer Science 2014-10-28 Dennis Wei, Kush R. Varshney

Correcting Noisy Multilabel Predictions: Modeling Label Noise through Latent Space Shifts

Noise in data appears to be inevitable in most real-world machine learning applications and would cause severe overfitting problems. Not only can data features contain noise, but labels are also prone to be noisy due to human input. In this…

Machine Learning · Computer Science 2025-05-09 Weipeng Huang, Qin Li, Yang Xiao, Cheng Qiao, Tie Cai, Junwei Liang, Neil J. Hurley, Guangyuan Piao

Binary Classification with Instance and Label Dependent Label Noise

Learning with label dependent label noise has been extensively explored in both theory and practice; however, dealing with instance (i.e., feature) and label dependent label noise continues to be a challenging task. The difficulty arises…

Machine Learning · Statistics 2023-06-07 Hyungki Im, Paul Grigas

Logistic-Normal Likelihoods for Heteroscedastic Label Noise

A natural way of estimating heteroscedastic label noise in regression is to model the observed (potentially noisy) target as a sample from a normal distribution, whose parameters can be learned by minimizing the negative log-likelihood.…

Machine Learning · Computer Science 2023-08-15 Erik Englesson, Amir Mehrpanah, Hossein Azizpour

Towards Robustness to Label Noise in Text Classification via Noise Modeling

Large datasets in NLP suffer from noisy labels, due to erroneous automatic and human annotation procedures. We study the problem of text classification with label noise, and aim to capture this noise through an auxiliary noise model over…

Computation and Language · Computer Science 2022-06-22 Siddhant Garg, Goutham Ramakrishnan, Varun Thumbe

Harmless label noise and informative soft-labels in supervised classification

Manual labelling of training examples is common practice in supervised learning. When the labelling task is of non-trivial difficulty, the supplied labels may not be equal to the ground-truth labels, and label noise is introduced into the…

Machine Learning · Statistics 2021-04-08 Daniel Ahfock, Geoffrey J. McLachlan

Classification with unknown class-conditional label noise on non-compact feature spaces

We investigate the problem of classification in the presence of unknown class-conditional label noise in which the labels observed by the learner have been corrupted with some unknown class dependent probability. In order to obtain finite…

Machine Learning · Statistics 2019-06-11 Henry W J Reeve, Ata Kaban

Regretful Decisions under Label Noise

Machine learning models are routinely used to support decisions that affect individuals -- be it to screen a patient for a serious illness or to gauge their response to treatment. In these tasks, we are limited to learning models from…

Machine Learning · Computer Science 2025-06-10 Sujay Nagaraj, Yang Liu, Flavio P. Calmon, Berk Ustun

Likelihood-based semi-supervised model selection with applications to speech processing

In conventional supervised pattern recognition tasks, model selection is typically accomplished by minimizing the classification error rate on a set of so-called development data, subject to ground-truth labeling by human experts or some…

Machine Learning · Statistics 2011-08-25 Christopher M. White, Sanjeev P. Khudanpur, Patrick J. Wolfe

Learning with Feature-Dependent Label Noise: A Progressive Approach

Label noise is frequently observed in real-world large-scale datasets. The noise is introduced due to a variety of reasons; it is heterogeneous and feature-dependent. Most existing approaches to handling noisy labels fall into two…

Machine Learning · Computer Science 2021-03-30 Yikai Zhang, Songzhu Zheng, Pengxiang Wu, Mayank Goswami, Chao Chen

Towards Integration of Statistical Hypothesis Tests into Deep Neural Networks

We report our ongoing work about a new deep architecture working in tandem with a statistical test procedure for jointly training texts and their label descriptions for multi-label and multi-class classification tasks. A statistical…

Computation and Language · Computer Science 2019-06-18 Ahmad Aghaebrahimian, Mark Cieliebak

An Instance-Dependent Simulation Framework for Learning with Label Noise

We propose a simulation framework for generating instance-dependent noisy labels via a pseudo-labeling paradigm. We show that the distribution of the synthetic noisy labels generated with our framework is closer to human labels compared to…

Machine Learning · Computer Science 2021-10-19 Keren Gu, Xander Masotto, Vandana Bachani, Balaji Lakshminarayanan, Jack Nikodem, Dong Yin

Error-Bounded Correction of Noisy Labels

To collect large scale annotated data, it is inevitable to introduce label noise, i.e., incorrect class labels. To be robust against label noise, many successful methods rely on the noisy classifiers (i.e., models trained on the noisy…

Computer Vision and Pattern Recognition · Computer Science 2020-11-23 Songzhu Zheng, Pengxiang Wu, Aman Goswami, Mayank Goswami, Dimitris Metaxas, Chao Chen

Classification with Asymmetric Label Noise: Consistency and Maximal Denoising

In many real-world classification problems, the labels of training examples are randomly corrupted. Most previous theoretical work on classification with label noise assumes that the two classes are separable, that the label noise is…

Machine Learning · Statistics 2016-08-08 Gilles Blanchard, Marek Flaska, Gregory Handy, Sara Pozzi, Clayton Scott

Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model

The drastic increase of data quantity often brings the severe decrease of data quality, such as incorrect label annotations, which poses a great challenge for robustly training Deep Neural Networks (DNNs). Existing learning methods…

Machine Learning · Computer Science 2022-03-18 Qizhou Wang, Bo Han, Tongliang Liu, Gang Niu, Jian Yang, Chen Gong

Label Ranking through Nonparametric Regression

Label Ranking (LR) corresponds to the problem of learning a hypothesis that maps features to rankings over a finite set of labels. We adopt a nonparametric regression approach to LR and obtain theoretical performance guarantees for this…

Machine Learning · Computer Science 2022-02-11 Dimitris Fotakis, Alkis Kalavasis, Eleni Psaroudaki

A Statistical Test for Probabilistic Fairness

Algorithms are now routinely used to make consequential decisions that affect human lives. Examples include college admissions, medical interventions or law enforcement. While algorithms empower us to harness all information hidden in vast…

Machine Learning · Computer Science 2020-12-10 Bahar Taskesen, Jose Blanchet, Daniel Kuhn, Viet Anh Nguyen

Fair Classification with Group-Dependent Label Noise

This work examines how to train fair classifiers in settings where training labels are corrupted with random noise, and where the error rates of corruption depend both on the label class and on the membership function for a protected…

Machine Learning · Computer Science 2021-02-18 Jialu Wang, Yang Liu, Caleb Levy