Related papers: Model Debiasing by Learnable Data Augmentation

Unsupervised Learning of Unbiased Visual Representations

Deep neural networks often struggle to learn robust representations in the presence of dataset biases, leading to suboptimal generalization on unbiased datasets. This limitation arises because the models heavily depend on peripheral and…

Machine Learning · Computer Science 2024-12-11 Carlo Alberto Barbano, Enzo Tartaglione, Marco Grangetto

Looking at Model Debiasing through the Lens of Anomaly Detection

It is widely recognized that deep neural networks are sensitive to bias in the data. This means that during training these models are likely to learn spurious correlations between data and labels, resulting in limited generalization…

Machine Learning · Computer Science 2024-12-06 Vito Paolo Pastore, Massimiliano Ciranni, Davide Marinelli, Francesca Odone, Vittorio Murino

Learning Not to Learn: Training Deep Neural Networks with Biased Data

We propose a novel regularization algorithm to train deep neural networks, in which data at training time is severely biased. Since a neural network efficiently learns data distribution, a network is likely to learn the bias information to…

Computer Vision and Pattern Recognition · Computer Science 2019-04-16 Byungju Kim, Hyunwoo Kim, Kyungsu Kim, Sungjin Kim, Junmo Kim

BiaSwap: Removing dataset bias with bias-tailored swapping augmentation

Deep neural networks often make decisions based on the spurious correlations inherent in the dataset, failing to generalize in an unbiased data distribution. Although previous approaches pre-define the type of dataset bias to prevent the…

Computer Vision and Pattern Recognition · Computer Science 2021-08-24 Eungyeup Kim, Jihyeon Lee, Jaegul Choo

Self-Adaptive Training: Bridging Supervised and Self-Supervised Learning

We propose self-adaptive training -- a unified training algorithm that dynamically calibrates and enhances training processes by model predictions without incurring an extra computational cost -- to advance both supervised and…

Machine Learning · Computer Science 2022-10-17 Lang Huang, Chao Zhang, Hongyang Zhang

Boosting Model Resilience via Implicit Adversarial Data Augmentation

Data augmentation plays a pivotal role in enhancing and diversifying training data. Nonetheless, consistently improving model performance in varied learning scenarios, especially those with inherent data biases, remains challenging. To…

Machine Learning · Computer Science 2024-06-04 Xiaoling Zhou, Wei Ye, Zhemg Lee, Rui Xie, Shikun Zhang

Debiased Self-Training for Semi-Supervised Learning

Deep neural networks achieve remarkable performances on a wide range of tasks with the aid of large-scale labeled datasets. Yet these datasets are time-consuming and labor-exhaustive to obtain on realistic tasks. To mitigate the requirement…

Machine Learning · Computer Science 2022-11-10 Baixu Chen, Junguang Jiang, Ximei Wang, Pengfei Wan, Jianmin Wang, Mingsheng Long

Improving Bias Mitigation through Bias Experts in Natural Language Understanding

Biases in the dataset often enable the model to achieve high performance on in-distribution data, while poorly performing on out-of-distribution data. To mitigate the detrimental effect of the bias on the networks, previous works have…

Computation and Language · Computer Science 2023-12-07 Eojin Jeon, Mingyu Lee, Juhyeong Park, Yeachan Kim, Wing-Lam Mok, SangKeun Lee

Improving Generalization of Deep Fault Detection Models in the Presence of Mislabeled Data

Mislabeled samples are ubiquitous in real-world datasets as rule-based or expert labeling is usually based on incorrect assumptions or subject to biased opinions. Neural networks can "memorize" these mislabeled samples and, as a result,…

Machine Learning · Computer Science 2021-11-24 Katharina Rombach, Gabriel Michau, Olga Fink

Unsupervised Domain Alignment to Mitigate Low Level Dataset Biases

Dataset bias is a well-known problem in the field of computer vision. The presence of implicit bias in any image collection hinders a model trained and validated on a particular dataset to yield similar accuracies when tested on other…

Computer Vision and Pattern Recognition · Computer Science 2019-07-15 Kirthi Shankar Sivamani

Diffusing DeBias: Synthetic Bias Amplification for Model Debiasing

Deep learning model effectiveness in classification tasks is often challenged by the quality and quantity of training data whenever they are affected by strong spurious correlations between specific attributes and target labels. This…

Machine Learning · Computer Science 2025-10-27 Massimiliano Ciranni, Vito Paolo Pastore, Roberto Di Via, Enzo Tartaglione, Francesca Odone, Vittorio Murino

Learning from Failure: Training Debiased Classifier from Biased Classifier

Neural networks often learn to make predictions that overly rely on spurious correlation existing in the dataset, which causes the model to be biased. While previous work tackles this issue by using explicit labeling on the spuriously…

Machine Learning · Computer Science 2020-11-24 Junhyun Nam, Hyuntak Cha, Sungsoo Ahn, Jaeho Lee, Jinwoo Shin

Toward More Generalized Malicious URL Detection Models

This paper reveals a data bias issue that can severely affect the performance while conducting a machine learning model for malicious URL detection. We describe how such bias can be identified using interpretable machine learning…

Machine Learning · Computer Science 2024-02-12 YunDa Tsai, Cayon Liow, Yin Sheng Siang, Shou-De Lin

Confidence-based Reliable Learning under Dual Noises

Deep neural networks (DNNs) have achieved remarkable success in a variety of computer vision tasks, where massive labeled images are routinely required for model optimization. Yet, the data collected from the open world are unavoidably…

Computer Vision and Pattern Recognition · Computer Science 2023-02-13 Peng Cui, Yang Yue, Zhijie Deng, Jun Zhu

Leveraging Learning Bias for Noisy Anomaly Detection

This paper addresses the challenge of fully unsupervised image anomaly detection (FUIAD), where training data may contain unlabeled anomalies. Conventional methods assume anomaly-free training data, but real-world contamination leads models…

Computer Vision and Pattern Recognition · Computer Science 2025-10-20 Yuxin Zhang, Yunkang Cao, Yuqi Cheng, Yihan Sun, Weiming Shen

Improved Mixed-Example Data Augmentation

In order to reduce overfitting, neural networks are typically trained with data augmentation, the practice of artificially generating additional training data via label-preserving transformations of existing training examples. While these…

Computer Vision and Pattern Recognition · Computer Science 2019-01-23 Cecilia Summers, Michael J. Dinneen

Learning Debiased Representation via Disentangled Feature Augmentation

Image classification models tend to make decisions based on peripheral attributes of data items that have strong correlation with a target variable (i.e., dataset bias). These biased models suffer from the poor generalization capability…

Machine Learning · Computer Science 2021-10-26 Jungsoo Lee, Eungyeup Kim, Juyoung Lee, Jihyeon Lee, Jaegul Choo

A Semi-Supervised Two-Stage Approach to Learning from Noisy Labels

The recent success of deep neural networks is powered in part by large-scale well-labeled training data. However, it is a daunting task to laboriously annotate an ImageNet-like dataset. On the contrary, it is fairly convenient, fast, and…

Computer Vision and Pattern Recognition · Computer Science 2018-03-23 Yifan Ding, Liqiang Wang, Deliang Fan, Boqing Gong

Learning De-biased Representations with Biased Representations

Many machine learning algorithms are trained and evaluated by splitting data from a single source into training and test sets. While such focus on in-distribution learning scenarios has led to interesting advancement, it has not been able…

Computer Vision and Pattern Recognition · Computer Science 2020-07-02 Hyojin Bahng, Sanghyuk Chun, Sangdoo Yun, Jaegul Choo, Seong Joon Oh

Training Neural Networks on Data Sources with Unknown Reliability

When data is generated by multiple sources, conventional training methods update models assuming equal reliability for each source and do not consider their individual data quality. However, in many applications, sources have varied levels…

Machine Learning · Computer Science 2025-02-17 Alexander Capstick, Francesca Palermo, Tianyu Cui, Payam Barnaghi