Related papers: Data Augmentation for Imbalanced Regression

Towards Understanding How Data Augmentation Works with Imbalanced Data

Data augmentation forms the cornerstone of many modern machine learning training pipelines; yet, the mechanisms by which it works are not clearly understood. Much of the research on data augmentation (DA) has focused on improving existing…

Machine Learning · Computer Science 2023-04-13 Damien A. Dablain, Nitesh V. Chawla

Regression Augmentation With Data-Driven Segmentation

Imbalanced regression arises when the target distribution is skewed, causing models to focus on dense regions and struggle with underrepresented (minority) samples. Despite its relevance across many applications, few methods have been…

Machine Learning · Computer Science 2025-08-05 Shayan Alahyari, Shiva Mehdipour Ghobadlou, Mike Domaratzki

Data Augmentation Imbalance For Imbalanced Attribute Classification

Pedestrian attribute recognition is an important multi-label classification problem. Although the convolutional neural networks are prominent in learning discriminative features from images, the data imbalance in multi-label setting for…

Computer Vision and Pattern Recognition · Computer Science 2020-05-22 Yang Hu, Xiaying Bai, Pan Zhou, Fanhua Shang, Shengmei Shen

An Analysis of Causal Effect Estimation using Outcome Invariant Data Augmentation

The technique of data augmentation (DA) is often used in machine learning for regularization purposes to better generalize under i.i.d. settings. In this work, we present a unifying framework with topics in causal inference to make a case…

Machine Learning · Computer Science 2026-02-02 Uzair Akbar, Niki Kilbertus, Hao Shen, Krikamol Muandet, Bo Dai

Resampling strategies for imbalanced regression: a survey and empirical analysis

Imbalanced problems can arise in different real-world situations, and to address this, certain strategies in the form of resampling or balancing algorithms are proposed. This issue has largely been studied in the context of classification,…

Machine Learning · Computer Science 2025-07-17 Juscimara G. Avelino, George D. C. Cavalcanti, Rafael M. O. Cruz

Non-Asymptotic Analysis of Data Augmentation for Precision Matrix Estimation

This paper addresses the problem of inverse covariance (also known as precision matrix) estimation in high-dimensional settings. Specifically, we focus on two classes of estimators: linear shrinkage estimators with a target proportional to…

Machine Learning · Statistics 2025-11-21 Lucas Morisset, Adrien Hardy, Alain Durmus

Data Augmentation Revisited: Rethinking the Distribution Gap between Clean and Augmented Data

Data augmentation has been widely applied as an effective methodology to improve generalization in particular when training deep neural networks. Recently, researchers proposed a few intensive data augmentation techniques, which indeed…

Machine Learning · Computer Science 2019-11-22 Zhuoxun He, Lingxi Xie, Xin Chen, Ya Zhang, Yanfeng Wang, Qi Tian

Is augmentation effective to improve prediction in imbalanced text datasets?

Imbalanced datasets present a significant challenge for machine learning models, often leading to biased predictions. To address this issue, data augmentation techniques are widely used in natural language processing (NLP) to generate new…

Computation and Language · Computer Science 2023-04-21 Gabriel O. Assunção, Rafael Izbicki, Marcos O. Prates

Negative Data Augmentation

Data augmentation is often used to enlarge datasets with synthetic samples generated in accordance with the underlying data distribution. To enable a wider range of augmentations, we explore negative data augmentation strategies (NDA) that…

Computer Vision and Pattern Recognition · Computer Science 2021-02-11 Abhishek Sinha, Kumar Ayush, Jiaming Song, Burak Uzkent, Hongxia Jin, Stefano Ermon

The good, the bad and the ugly sides of data augmentation: An implicit spectral regularization perspective

Data augmentation (DA) is a powerful workhorse for bolstering performance in modern machine learning. Specific augmentations like translations and scaling in computer vision are traditionally believed to improve generalization by generating…

Machine Learning · Computer Science 2024-02-29 Chi-Heng Lin, Chiraag Kaushik, Eva L. Dyer, Vidya Muthukumar

Anchor Data Augmentation

We propose a novel algorithm for data augmentation in nonlinear over-parametrized regression. Our data augmentation algorithm borrows from the literature on causality and extends the recently proposed Anchor regression (AR) method for data…

Machine Learning · Computer Science 2023-11-29 Nora Schneider, Shirin Goshtasbpour, Fernando Perez-Cruz

A monotone data augmentation algorithm for multivariate nonnormal data: with applications to controlled imputations for longitudinal trials

An efficient monotone data augmentation (MDA) algorithm is proposed for missing data imputation for incomplete multivariate nonnormal data that may contain variables of different types, and are modeled by a sequence of regression models…

Methodology · Statistics 2018-11-21 Yongqiang Tang

Data augmentation for deep learning based accelerated MRI reconstruction with limited data

Deep neural networks have emerged as very successful tools for image restoration and reconstruction tasks. These networks are often trained end-to-end to directly reconstruct an image from a noisy or corrupted measurement of that image. To…

Image and Video Processing · Electrical Eng. & Systems 2021-06-30 Zalan Fabian, Reinhard Heckel, Mahdi Soltanolkotabi

Time Series Data Augmentation as an Imbalanced Learning Problem

Recent state-of-the-art forecasting methods are trained on collections of time series. These methods, often referred to as global models, can capture common patterns in different time series to improve their generalization performance.…

Machine Learning · Computer Science 2024-04-30 Vitor Cerqueira, Nuno Moniz, Ricardo Inácio, Carlos Soares

A Bit of Information Theory, and the Data Augmentation Algorithm Converges

The data augmentation (DA) algorithm is a simple and powerful tool in statistical computing. In this note basic information theory is used to prove a nontrivial convergence theorem for the DA algorithm.

Information Theory · Computer Science 2009-09-12 Yaming Yu

Imbalanced Classification via Explicit Gradient Learning From Augmented Data

Learning from imbalanced data is one of the most significant challenges in real-world classification tasks. In such cases, neural networks performance is substantially impaired due to preference towards the majority class. Existing…

Machine Learning · Computer Science 2022-11-13 Bronislav Yasinnik, Moshe Salhov, Ofir Lindenbaum, Amir Averbuch

Test-Time Augmentation Meets Variational Bayes

Data augmentation is known to contribute significantly to the robustness of machine learning models. In most instances, data augmentation is utilized during the training phase. Test-Time Augmentation (TTA) is a technique that instead…

Machine Learning · Statistics 2024-09-20 Masanari Kimura, Howard Bondell

ReSmooth: Detecting and Utilizing OOD Samples when Training with Data Augmentation

Data augmentation (DA) is a widely used technique for enhancing the training of deep neural networks. Recent DA techniques which achieve state-of-the-art performance always meet the need for diversity in augmented training samples. However,…

Computer Vision and Pattern Recognition · Computer Science 2022-12-06 Chenyang Wang, Junjun Jiang, Xiong Zhou, Xianming Liu

Reweighting Augmented Samples by Minimizing the Maximal Expected Loss

Data augmentation is an effective technique to improve the generalization of deep neural networks. However, previous data augmentation methods usually treat the augmented samples equally without considering their individual impacts on the…

Machine Learning · Computer Science 2021-03-17 Mingyang Yi, Lu Hou, Lifeng Shang, Xin Jiang, Qun Liu, Zhi-Ming Ma

Universal Adaptive Data Augmentation

Existing automatic data augmentation (DA) methods either ignore updating DA's parameters according to the target model's state during training or adopt update strategies that are not effective enough. In this work, we design a novel data…

Computer Vision and Pattern Recognition · Computer Science 2023-05-11 Xiaogang Xu, Hengshuang Zhao