Related papers: Label-Imbalanced and Group-Sensitive Classificatio…

On the Implicit Geometry of Cross-Entropy Parameterizations for Label-Imbalanced Data

Various logit-adjusted parameterizations of the cross-entropy (CE) loss have been proposed as alternatives to weighted CE for training large models on label-imbalanced data far beyond the zero train error regime. The driving force behind…

Machine Learning · Computer Science 2023-03-15 Tina Behnia, Ganesh Ramachandra Kini, Vala Vakilian, Christos Thrampoulidis

Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss

Deep learning algorithms can fare poorly when the training dataset suffers from heavy class-imbalance but the testing criterion requires good generalization on less frequent classes. We design two novel methods to improve performance in…

Machine Learning · Computer Science 2019-10-29 Kaidi Cao, Colin Wei, Adrien Gaidon, Nikos Arechiga, Tengyu Ma

On how to avoid exacerbating spurious correlations when models are overparameterized

Overparameterized models fail to generalize well in the presence of data imbalance even when combined with traditional techniques for mitigating imbalances. This paper focuses on imbalanced classification datasets, in which a small subset…

Machine Learning · Computer Science 2022-06-28 Tina Behnia, Ke Wang, Christos Thrampoulidis

Enhancement Encoding: A Novel Imbalanced Classification Approach via Encoding the Training Labels

Class imbalance, which is also called long-tailed distribution, is a common problem in classification tasks based on machine learning. If it happens, the minority data will be overwhelmed by the majority, which presents quite a challenge…

Machine Learning · Computer Science 2023-03-29 Jia-Chen Zhao

Improved Multi-label Classification under Temporal Concept Drift: Rethinking Group-Robust Algorithms in a Label-Wise Setting

In document classification for, e.g., legal and biomedical text, we often deal with hundreds of classes, including very infrequent ones, as well as temporal concept drift caused by the influence of real world events, e.g., policy changes,…

Computation and Language · Computer Science 2022-03-16 Ilias Chalkidis, Anders Søgaard

Synthetic Tabular Data Generation for Imbalanced Classification: The Surprising Effectiveness of an Overlap Class

Handling imbalance in class distribution when building a classifier over tabular data has been a problem of long-standing interest. One popular approach is augmenting the training dataset with synthetically generated data. While classical…

Machine Learning · Computer Science 2025-02-20 Annie D'souza, Swetha M, Sunita Sarawagi

Minority Class Oversampling for Tabular Data with Deep Generative Models

In practice, machine learning experts are often confronted with imbalanced data. Without accounting for the imbalance, common classifiers perform poorly and standard evaluation metrics mislead the practitioners on the model's performance. A…

Machine Learning · Computer Science 2020-07-21 Ramiro Camino, Christian Hammerschmidt, Radu State

Posterior Re-calibration for Imbalanced Datasets

Neural Networks can perform poorly when the training label distribution is heavily imbalanced, as well as when the testing data differs from the training distribution. In order to deal with shift in the testing label distribution, which…

Machine Learning · Computer Science 2020-10-23 Junjiao Tian, Yen-Cheng Liu, Nathan Glaser, Yen-Chang Hsu, Zsolt Kira

Multi-Label Adaptive Batch Selection by Highlighting Hard and Imbalanced Samples

Deep neural network models have demonstrated their effectiveness in classifying multi-label data from various domains. Typically, they employ a training mode that combines mini-batches with optimizers, where each sample is randomly selected…

Machine Learning · Computer Science 2024-03-28 Ao Zhou, Bin Liu, Jin Wang, Grigorios Tsoumakas

Towards Group Robustness in the presence of Partial Group Labels

Learning invariant representations is an important requirement when training machine learning models that are driven by spurious correlations in the datasets. These spurious correlations, between input samples and the target labels, wrongly…

Machine Learning · Computer Science 2022-01-12 Vishnu Suresh Lokhande, Kihyuk Sohn, Jinsung Yoon, Madeleine Udell, Chen-Yu Lee, Tomas Pfister

Training Over-parameterized Models with Non-decomposable Objectives

Many modern machine learning applications come with complex and nuanced design goals such as minimizing the worst-case error, satisfying a given precision or recall target, or enforcing group-fairness constraints. Popular techniques for…

Machine Learning · Computer Science 2021-07-13 Harikrishna Narasimhan, Aditya Krishna Menon

Classification Imbalance as Transfer Learning

Classification imbalance arises when one class is much rarer than the other. We frame this setting as transfer learning under label (prior) shift between an imbalanced source distribution induced by the observed data and a balanced target…

Machine Learning · Statistics 2026-01-16 Eric Xia, Jason M. Klusowski

Importance Tempering: Group Robustness for Overparameterized Models

Although overparameterized models have shown their success on many machine learning tasks, the accuracy could drop on the testing distribution that is different from the training one. This accuracy drop still limits applying machine…

Machine Learning · Computer Science 2022-09-29 Yiping Lu, Wenlong Ji, Zachary Izzo, Lexing Ying

Iterative Nearest Neighborhood Oversampling in Semisupervised Learning from Imbalanced Data

Transductive graph-based semi-supervised learning methods usually build an undirected graph utilizing both labeled and unlabeled samples as vertices. Those methods propagate label information of labeled samples to neighbors through their…

Machine Learning · Computer Science 2013-12-25 Fengqi Li, Chuang Yu, Nanhai Yang, Feng Xia, Guangming Li, Fatemeh Kaveh-Yazdy

Imbalanced Classification via Explicit Gradient Learning From Augmented Data

Learning from imbalanced data is one of the most significant challenges in real-world classification tasks. In such cases, neural network performance is substantially impaired due to a preference towards the majority class. Existing…

Machine Learning · Computer Science 2022-11-13 Bronislav Yasinnik, Moshe Salhov, Ofir Lindenbaum, Amir Averbuch

Learning Majority-to-Minority Transformations with MMD and Triplet Loss for Imbalanced Classification

Class imbalance in supervised classification often degrades model performance by biasing predictions toward the majority class, particularly in critical applications such as medical diagnosis and fraud detection. Traditional oversampling…

Machine Learning · Statistics 2025-09-16 Suman Cha, Hyunjoong Kim

Class Adaptive Network Calibration

Recent studies have revealed that, beyond conventional accuracy, calibration should also be considered for training modern deep neural networks. To address miscalibration during learning, some methods have explored different penalty…

Computer Vision and Pattern Recognition · Computer Science 2023-04-13 Bingyuan Liu, Jérôme Rony, Adrian Galdran, Jose Dolz, Ismail Ben Ayed

Rethinking the Value of Labels for Improving Class-Imbalanced Learning

Real-world data often exhibits long-tailed distributions with heavy class imbalance, posing great challenges for deep recognition models. We identify a persisting dilemma on the value of labels in the context of imbalanced learning: on the…

Machine Learning · Computer Science 2020-09-29 Yuzhe Yang, Zhi Xu

Towards Imbalanced Large Scale Multi-label Classification with Partially Annotated Labels

Multi-label classification is a widely encountered problem in daily life, where an instance can be associated with multiple classes. In theory, this is a supervised learning method that requires a large amount of labeling. However,…

Computer Vision and Pattern Recognition · Computer Science 2023-08-02 XIn Zhang, Yuqi Song, Fei Zuo, Xiaofeng Wang

The Devil is in the Margin: Margin-based Label Smoothing for Network Calibration

In spite of the dominant performances of deep neural networks, recent works have shown that they are poorly calibrated, resulting in over-confident predictions. Miscalibration can be exacerbated by overfitting due to the minimization of the…

Computer Vision and Pattern Recognition · Computer Science 2023-07-06 Bingyuan Liu, Ismail Ben Ayed, Adrian Galdran, Jose Dolz