Related papers: REDUCR: Robust Data Downsampling Using Class Prior…

Ranking & Reweighting Improves Group Distributional Robustness

Recent work has shown that standard training via empirical risk minimization (ERM) can produce models that achieve high accuracy on average but low accuracy on underrepresented groups due to the prevalence of spurious features. A…

Machine Learning · Computer Science 2023-05-11 Yachuan Liu, Bohan Zhang, Qiaozhu Mei, Paramveer Dhillon

Neural Network Based Undersampling Techniques

The class imbalance problem is commonly faced while developing machine learning models for real-life issues. Due to this problem, the fitted model tends to be biased towards the majority class data, which leads to lower precision, recall, AUC,…

Machine Learning · Computer Science 2019-08-20 Md. Adnan Arefeen, Sumaiya Tabassum Nimi, M Sohel Rahman

DRoP: Distributionally Robust Data Pruning

In the era of exceptionally data-hungry models, careful selection of the training data is essential to mitigate the extensive costs of deep learning. Data pruning offers a solution by removing redundant or uninformative samples from the…

Machine Learning · Computer Science 2025-02-11 Artem Vysogorets, Kartik Ahuja, Julia Kempe

Pruning then Reweighting: Towards Data-Efficient Training of Diffusion Models

Despite the remarkable generation capabilities of Diffusion Models (DMs), conducting training and inference remains computationally expensive. Previous works have been devoted to accelerating diffusion sampling, but achieving data-efficient…

Computer Vision and Pattern Recognition · Computer Science 2024-10-03 Yize Li, Yihua Zhang, Sijia Liu, Xue Lin

Stochastic Re-weighted Gradient Descent via Distributionally Robust Optimization

We present Re-weighted Gradient Descent (RGD), a novel optimization technique that improves the performance of deep neural networks through dynamic sample re-weighting. Leveraging insights from distributionally robust optimization (DRO)…

Machine Learning · Computer Science 2024-10-15 Ramnath Kumar, Kushal Majmundar, Dheeraj Nagaraj, Arun Sai Suggala

Queue-based Resampling for Online Class Imbalance Learning

Online class imbalance learning constitutes a new problem and an emerging research topic that focusses on the challenges of online learning under class imbalance and concept drift. Class imbalance deals with data streams that have very…

Machine Learning · Computer Science 2020-09-02 Kleanthis Malialis, Christos G. Panayiotou, Marios M. Polycarpou

Learning to Reweight Examples for Robust Deep Learning

Deep neural networks have been shown to be very powerful modeling tools for many supervised learning tasks involving complex input patterns. However, they can also easily overfit to training set biases and label noises. In addition to…

Machine Learning · Computer Science 2019-05-07 Mengye Ren, Wenyuan Zeng, Bin Yang, Raquel Urtasun

Meta-Learning for Resampling Recommendation Systems

One possible approach to tackle the class imbalance in classification tasks is to resample a training dataset, i.e., to drop some of its elements or to synthesize new ones. There exist several widely-used resampling methods. Recent research…

Machine Learning · Computer Science 2018-09-18 Smolyakov Dmitry, Alexander Korotin, Pavel Erofeev, Artem Papanov, Evgeny Burnaev

ReSpec: Relevance and Specificity Grounded Online Filtering for Learning on Video-Text Data Streams

The rapid growth of video-text data presents challenges in storage and computation during training. Online learning, which processes streaming data in real-time, offers a promising solution to these issues while also allowing swift…

Computer Vision and Pattern Recognition · Computer Science 2025-04-22 Chris Dongjoo Kim, Jihwan Moon, Sangwoo Moon, Heeseung Yun, Sihaeng Lee, Aniruddha Kembhavi, Soonyoung Lee, Gunhee Kim, Sangho Lee, Christopher Clark

Exploring Data Redundancy in Real-world Image Classification through Data Selection

Deep learning models often require large amounts of data for training, leading to increased costs. It is particularly challenging in medical imaging, i.e., gathering distributed data for centralized training, and meanwhile, obtaining…

Computer Vision and Pattern Recognition · Computer Science 2023-06-27 Zhenyu Tang, Shaoting Zhang, Xiaosong Wang

Does the Order of Training Samples Matter? Improving Neural Data-to-Text Generation with Curriculum Learning

Recent advancements in data-to-text generation largely take on the form of neural end-to-end systems. Efforts have been dedicated to improving text generation systems by changing the order of training samples in a process known as…

Computation and Language · Computer Science 2021-02-09 Ernie Chang, Hui-Syuan Yeh, Vera Demberg

Introducing DeepBalance: Random Deep Belief Network Ensembles to Address Class Imbalance

Class imbalance problems manifest in domains such as financial fraud detection or network intrusion analysis, where the prevalence of one class is much higher than another. Typically, practitioners are more interested in predicting the…

Machine Learning · Statistics 2017-11-16 Peter Xenopoulos

Clustering and Learning from Imbalanced Data

A learning classifier must outperform a trivial solution; in the case of imbalanced data, this condition usually does not hold true. To overcome this problem, we propose a novel data level resampling method - Clustering Based Oversampling for…

Machine Learning · Computer Science 2018-11-13 Naman D. Singh, Abhinav Dhall

ClassPruning: Speed Up Image Restoration Networks by Dynamic N:M Pruning

Image restoration tasks have achieved tremendous performance improvements with the rapid advancement of deep neural networks. However, most prevalent deep learning models perform inference statically, ignoring that different images have…

Computer Vision and Pattern Recognition · Computer Science 2022-11-11 Yang Zhou, Yuda Song, Hui Qian, Xin Du

Improved Multi-label Classification under Temporal Concept Drift: Rethinking Group-Robust Algorithms in a Label-Wise Setting

In document classification for, e.g., legal and biomedical text, we often deal with hundreds of classes, including very infrequent ones, as well as temporal concept drift caused by the influence of real world events, e.g., policy changes,…

Computation and Language · Computer Science 2022-03-16 Ilias Chalkidis, Anders Søgaard

Group & Reweight: A Novel Cost-Sensitive Approach to Mitigating Class Imbalance in Network Traffic Classification

Internet services have led to the eruption of network traffic, and machine learning on these Internet data has become an indispensable tool, especially when the application is risk-sensitive. This paper focuses on network traffic…

Machine Learning · Statistics 2025-02-12 Wumei Du, Dong Liang, Yiqin Lv, Xingxing Liang, Guanlin Wu, Qi Wang, Zheng Xie

Constrained Instance and Class Reweighting for Robust Learning under Label Noise

Deep neural networks have shown impressive performance in supervised learning, enabled by their ability to fit well to the provided training data. However, their performance is largely dependent on the quality of the training data and often…

Machine Learning · Computer Science 2021-11-11 Abhishek Kumar, Ehsan Amid

CrossQ: Batch Normalization in Deep Reinforcement Learning for Greater Sample Efficiency and Simplicity

Sample efficiency is a crucial problem in deep reinforcement learning. Recent algorithms, such as REDQ and DroQ, found a way to improve the sample efficiency by increasing the update-to-data (UTD) ratio to 20 gradient update steps on the…

Machine Learning · Computer Science 2024-03-26 Aditya Bhatt, Daniel Palenicek, Boris Belousov, Max Argus, Artemij Amiranashvili, Thomas Brox, Jan Peters

Restoring balance: principled under/oversampling of data for optimal classification

Class imbalance in real-world data poses a common bottleneck for machine learning tasks, since achieving good generalization on under-represented examples is often challenging. Mitigation strategies, such as under or oversampling the data…

Disordered Systems and Neural Networks · Physics 2025-02-03 Emanuele Loffredo, Mauro Pastore, Simona Cocco, Rémi Monasson

Influence of Resampling on Accuracy of Imbalanced Classification

In many real-world binary classification tasks (e.g. detection of certain objects from images), an available dataset is imbalanced, i.e., it has far fewer representatives of one class (a minor class) than of another. Generally, accurate…

Machine Learning · Statistics 2017-07-14 Evgeny Burnaev, Pavel Erofeev, Artem Papanov