Related papers: High Dimensional Binary Classification under Label…

200 papers

We propose Regularized Learning under Label Shifts (RLLS), a principled and practical domain-adaptation algorithm to correct for shifts in the label distribution between a source and a target domain. We first estimate importance weights…

Machine Learning · Computer Science 2020-08-10 Kamyar Azizzadenesheli, Anqi Liu, Fanny Yang, Animashree Anandkumar

Transformers serve as the foundational architecture for many successful large-scale models, demonstrating the ability to overfit the training data while maintaining strong generalization on unseen data, a phenomenon known as benign…

Machine Learning · Computer Science 2025-02-19 Yingying Zhang, Zhenyu Wu, Jian Li, Yong Liu

Recent years have witnessed great success in supervised deep learning, where predictive models are trained from a large amount of fully labeled data. However, in practice, labeling such big data can be very costly and may not even be…

Machine Learning · Computer Science 2022-10-18 Yuting Tang, Nan Lu, Tianyi Zhang, Masashi Sugiyama

Class distribution plays an important role in learning deep classifiers. When the proportion of each class in the test set differs from the training set, the performance of classification nets usually degrades. Such a label distribution…

Image and Video Processing · Electrical Eng. & Systems 2022-07-12 Wenao Ma, Cheng Chen, Shuang Zheng, Jing Qin, Huimao Zhang, Qi Dou

The label shift problem refers to the supervised learning setting where the train and test label distributions do not match. Existing work addressing label shift usually assumes access to an unlabelled test sample. This sample may be…

Machine Learning · Computer Science 2021-08-18 Jingzhao Zhang, Aditya Menon, Andreas Veit, Srinadh Bhojanapalli, Sanjiv Kumar, Suvrit Sra

We study the use of linear regression for multiclass classification in the over-parametrized regime where some of the training data is mislabeled. In such scenarios it is necessary to add an explicit regularization term, $\lambda f(w)$, for…

Machine Learning · Computer Science 2024-10-14 Reza Ghane, Danil Akhtiamov, Babak Hassibi

Classifier predictions often rely on the assumption that new observations come from the same distribution as training data. When the underlying distribution changes, so does the optimal classification rule, and performance may degrade. We…

Methodology · Statistics 2021-09-01 Ciaran Evans, Max G'Sell

Regularization is an effective way to promote the generalization performance of machine learning models. In this paper, we focus on label smoothing, a form of output distribution regularization that prevents overfitting of a neural network…

Machine Learning · Computer Science 2020-07-07 Weizhi Li, Gautam Dasarathy, Visar Berisha

Studies on benign overfitting provide insights for the success of overparameterized deep learning models. In this work, we examine whether overfitting is truly benign in real-world classification tasks. We start with the observation that a…

Machine Learning · Computer Science 2023-04-04 Kaiyue Wen, Jiaye Teng, Jingzhao Zhang

Prior knowledge and symbolic rules in machine learning are often expressed in the form of label constraints, especially in structured prediction problems. In this work, we compare two common strategies for encoding label constraints in a…

Machine Learning · Computer Science 2023-07-11 Kaifu Wang, Hangfeng He, Tin D. Nguyen, Piyush Kumar, Dan Roth

Label shift refers to the phenomenon where the prior class probability p(y) changes between the training and test distributions, while the conditional probability p(x|y) stays fixed. Label shift arises in settings like medical diagnosis,…

Machine Learning · Computer Science 2020-06-30 Amr Alexandari, Anshul Kundaje, Avanti Shrikumar

Handling imbalance in class distribution when building a classifier over tabular data has been a problem of long-standing interest. One popular approach is augmenting the training dataset with synthetically generated data. While classical…

Machine Learning · Computer Science 2025-02-20 Annie D'souza, Swetha M, Sunita Sarawagi

Regularized linear regression is a promising approach for binary classification problems in which the training set has noisy labels, since the regularization term can help to avoid interpolating the mislabeled data points. In this paper we…

Machine Learning · Computer Science 2023-11-07 Danil Akhtiamov, Reza Ghane, Babak Hassibi

In many real-world pattern recognition scenarios, such as in medical applications, the corresponding classification tasks can be of an imbalanced nature. In the current study, we focus on binary, imbalanced classification tasks, i.e. binary…

Machine Learning · Computer Science 2020-12-01 Peter Bellmann, Heinke Hihn, Daniel A. Braun, Friedhelm Schwenker

Modern machine learning models with a large number of parameters often generalize well despite perfectly interpolating noisy training data, a phenomenon known as benign overfitting. A foundational explanation for this in linear…

Machine Learning · Statistics 2025-11-18 Yuta Kondo

While a broad range of techniques has been proposed to tackle distribution shift, the simple baseline of training on an undersampled balanced dataset often achieves close to state-of-the-art accuracy across several popular…

Machine Learning · Computer Science 2023-06-21 Niladri S. Chatterji, Saminul Haque, Tatsunori Hashimoto

Recent work has highlighted the label alignment property (LAP) in supervised learning, where the vector of all labels in the dataset is mostly in the span of the top few singular vectors of the data matrix. Drawing inspiration from this…

Machine Learning · Computer Science 2024-09-12 Ehsan Imani, Guojun Zhang, Runjia Li, Jun Luo, Pascal Poupart, Philip H. S. Torr, Yangchen Pan

Real-world data often exhibits long-tailed distributions with heavy class imbalance, posing great challenges for deep recognition models. We identify a persisting dilemma on the value of labels in the context of imbalanced learning: on the…

Machine Learning · Computer Science 2020-09-29 Yuzhe Yang, Zhi Xu

Deep neural networks are well known for efficiently fitting training data, yet they generalize poorly whenever some kind of bias dominates over the actual task labels, resulting in models learning "shortcuts". In…

Machine Learning · Computer Science 2024-08-12 Pietro Morerio, Ruggero Ragonesi, Vittorio Murino

Quantification is the supervised learning task that consists of training predictors of the class prevalence values of sets of unlabelled data, and is of special interest when the labelled data on which the predictor has been trained and the…

Machine Learning · Computer Science 2023-10-10 Pablo González, Alejandro Moreo, Fabrizio Sebastiani