Related papers: Selecting Data Augmentation for Simulating Interve…

Data Augmentation Revisited: Rethinking the Distribution Gap between Clean and Augmented Data

Data augmentation has been widely applied as an effective methodology to improve generalization in particular when training deep neural networks. Recently, researchers proposed a few intensive data augmentation techniques, which indeed…

Machine Learning · Computer Science 2019-11-22 Zhuoxun He, Lingxi Xie, Xin Chen, Ya Zhang, Yanfeng Wang, Qi Tian

Untapped Potential of Data Augmentation: A Domain Generalization Viewpoint

Data augmentation is a popular pre-processing trick to improve generalization accuracy. It is believed that by processing augmented inputs in tandem with the original ones, the model learns a more robust set of features which are shared…

Machine Learning · Computer Science 2020-07-10 Vihari Piratla, Shiv Shankar

Causality-inspired Single-source Domain Generalization for Medical Image Segmentation

Deep learning models usually suffer from domain shift issues, where models trained on one source domain do not generalize well to other unseen domains. In this work, we investigate the single-source domain generalization problem: training a…

Computer Vision and Pattern Recognition · Computer Science 2023-04-24 Cheng Ouyang, Chen Chen, Surui Li, Zeju Li, Chen Qin, Wenjia Bai, Daniel Rueckert

Domain Generalization by Rejecting Extreme Augmentations

Data augmentation is one of the most effective techniques for regularizing deep learning models and improving their recognition performance in a variety of tasks and domains. However, this holds for standard in-domain settings, in which the…

Machine Learning · Computer Science 2025-10-09 Masih Aminbeidokhti, Fidel A. Guerrero Peña, Heitor Rapela Medeiros, Thomas Dubail, Eric Granger, Marco Pedersoli

Learning to Compose Domain-Specific Transformations for Data Augmentation

Data augmentation is a ubiquitous technique for increasing the size of labeled training sets by leveraging task-specific data transformations that preserve class labels. While it is often easy for domain experts to specify individual…

Machine Learning · Statistics 2018-12-10 Alexander J. Ratner, Henry R. Ehrenberg, Zeshan Hussain, Jared Dunnmon, Christopher Ré

Rethinking Domain Generalization Baselines

Despite being very powerful in standard learning settings, deep learning models can be extremely brittle when deployed in scenarios different from those on which they were trained. Domain generalization methods investigate this problem and…

Computer Vision and Pattern Recognition · Computer Science 2021-01-28 Francesco Cappio Borlino, Antonio D'Innocente, Tatiana Tommasi

Foresee What You Will Learn: Data Augmentation for Domain Generalization in Non-stationary Environment

Existing domain generalization aims to learn a generalizable model to perform well even on unseen domains. For many real-world machine learning applications, the data distribution often shifts gradually along domain indices. For example, a…

Computer Vision and Pattern Recognition · Computer Science 2023-03-09 Qiuhao Zeng, Wei Wang, Fan Zhou, Charles Ling, Boyu Wang

Generalizing to Unseen Domains via Adversarial Data Augmentation

We are concerned with learning models that generalize well to different unseen domains. We consider a worst-case formulation over data distributions that are near the source domain in the feature space. Only using training data from…

Computer Vision and Pattern Recognition · Computer Science 2018-11-07 Riccardo Volpi, Hongseok Namkoong, Ozan Sener, John Duchi, Vittorio Murino, Silvio Savarese

Domain Generalization -- A Causal Perspective

Machine learning models rely on various assumptions to attain high accuracy. One of the preliminary assumptions of these models is the independent and identical distribution, which suggests that the train and test data are sampled from the…

Machine Learning · Computer Science 2022-11-08 Paras Sheth, Raha Moraffah, K. Selçuk Candan, Adrienne Raglin, Huan Liu

Data Augmentation via Causal-Residual Bootstrapping

Data augmentation integrates domain knowledge into a dataset by making domain-informed modifications to existing data points. For example, image data can be augmented by duplicating images in different tints or orientations, thereby…

Machine Learning · Computer Science 2026-03-17 Mateusz Gajewski, Sophia Xiao, Bijan Mazaheri

An Analysis of Causal Effect Estimation using Outcome Invariant Data Augmentation

The technique of data augmentation (DA) is often used in machine learning for regularization purposes to better generalize under i.i.d. settings. In this work, we present a unifying framework with topics in causal inference to make a case…

Machine Learning · Computer Science 2026-02-02 Uzair Akbar, Niki Kilbertus, Hao Shen, Krikamol Muandet, Bo Dai

Learn to Expect the Unexpected: Probably Approximately Correct Domain Generalization

Domain generalization is the problem of machine learning when the training data and the test data come from different data domains. We present a simple theoretical model of learning to generalize across domains in which there is a…

Machine Learning · Computer Science 2020-02-14 Vikas K. Garg, Adam Kalai, Katrina Ligett, Zhiwei Steven Wu

Anti-causal domain generalization: Leveraging unlabeled data

The problem of domain generalization concerns learning predictive models that are robust to distribution shifts when deployed in new, previously unseen environments. Existing methods typically require labeled data from multiple training…

Machine Learning · Statistics 2026-02-20 Sorawit Saengkyongam, Juan L. Gamella, Andrew C. Miller, Jonas Peters, Nicolai Meinshausen, Christina Heinze-Deml

Data Augmentations for Improved (Large) Language Model Generalization

The reliance of text classifiers on spurious correlations can lead to poor generalization at deployment, raising concerns about their use in safety-critical domains such as healthcare. In this work, we propose to use counterfactual data…

Machine Learning · Computer Science 2024-01-10 Amir Feder, Yoav Wald, Claudia Shi, Suchi Saria, David Blei

Gaussian and Non-Gaussian Universality of Data Augmentation

We provide universality results that quantify how data augmentation affects the variance and limiting distribution of estimates through simple surrogates, and analyze several specific models in detail. The results confirm some observations…

Machine Learning · Computer Science 2025-12-03 Kevin Han Huang, Peter Orbanz, Morgane Austern

Dataset Augmentation in Feature Space

Dataset augmentation, the practice of applying a wide array of domain-specific transformations to synthetically expand a training set, is a standard tool in supervised learning. While effective in tasks such as visual recognition, the set…

Machine Learning · Statistics 2017-02-21 Terrance DeVries, Graham W. Taylor

An Empirical Framework for Domain Generalization in Clinical Settings

Clinical machine learning models experience significantly degraded performance in datasets not seen during training, e.g., new hospitals or populations. Recent developments in domain generalization offer a promising solution to this problem…

Machine Learning · Computer Science 2021-04-16 Haoran Zhang, Natalie Dullerud, Laleh Seyyed-Kalantari, Quaid Morris, Shalmali Joshi, Marzyeh Ghassemi

Generalization Gap in Data Augmentation: Insights from Illumination

In the field of computer vision, data augmentation is widely used to enrich the feature complexity of training datasets with deep learning techniques. However, regarding the generalization capabilities of models, the difference in…

Computer Vision and Pattern Recognition · Computer Science 2024-08-22 Jianqiang Xiao, Weiwen Guo, Junfeng Liu, Mengze Li

Domain Generalization without Excess Empirical Risk

Given data from diverse sets of distinct distributions, domain generalization aims to learn models that generalize to unseen distributions. A common approach is designing a data-driven surrogate penalty to capture generalization and…

Machine Learning · Computer Science 2023-08-31 Ozan Sener, Vladlen Koltun

Domain Generalization via Gradient Surgery

In real-life applications, machine learning models often face scenarios where there is a change in data distribution between training and test domains. When the aim is to make predictions on distributions different from those seen at…

Machine Learning · Computer Science 2021-11-04 Lucas Mansilla, Rodrigo Echeveste, Diego H. Milone, Enzo Ferrante