Related papers: Neural Data Augmentation via Example Extrapolation

Improved Mixed-Example Data Augmentation

In order to reduce overfitting, neural networks are typically trained with data augmentation, the practice of artificially generating additional training data via label-preserving transformations of existing training examples. While these…

Computer Vision and Pattern Recognition · Computer Science 2019-01-23 Cecilia Summers, Michael J. Dinneen

Guiding Generative Language Models for Data Augmentation in Few-Shot Text Classification

Data augmentation techniques are widely used for enhancing the performance of machine learning models by tackling class imbalance issues and data sparsity. State-of-the-art generative language models have been shown to provide significant…

Computation and Language · Computer Science 2023-01-10 Aleksandra Edwards, Asahi Ushio, Jose Camacho-Collados, Hélène de Ribaupierre, Alun Preece

Few-shot learning through contextual data augmentation

Machine translation (MT) models used in industries with constantly changing topics, such as translation or news agencies, need to adapt to new data to maintain their performance over time. Our aim is to teach a pre-trained MT model to…

Computation and Language · Computer Science 2021-04-01 Farid Arthaud, Rachel Bawden, Alexandra Birch

Data Augmentation for Meta-Learning

Conventional image classifiers are trained by randomly sampling mini-batches of images. To achieve state-of-the-art performance, practitioners use sophisticated data augmentation schemes to expand the amount of training data available for…

Machine Learning · Computer Science 2021-06-23 Renkun Ni, Micah Goldblum, Amr Sharaf, Kezhi Kong, Tom Goldstein

Good-Enough Example Extrapolation

This paper asks whether extrapolating the hidden space distribution of text examples from one class onto another is a valid inductive bias for data augmentation. To operationalize this question, I propose a simple data augmentation protocol…

Computation and Language · Computer Science 2021-09-16 Jason Wei

Data Augmentation by Pairing Samples for Images Classification

Data augmentation is a widely used technique in many machine learning tasks, such as image classification, to virtually enlarge the training dataset size and avoid overfitting. Traditional data augmentation techniques for image…

Machine Learning · Computer Science 2018-04-12 Hiroshi Inoue

Extended Few-Shot Learning: Exploiting Existing Resources for Novel Tasks

In many practical few-shot learning problems, even though labeled examples are scarce, there are abundant auxiliary datasets that potentially contain useful information. We propose the problem of extended few-shot learning to study these…

Machine Learning · Computer Science 2021-07-06 Reza Esfandiarpoor, Amy Pu, Mohsen Hajabdollahi, Stephen H. Bach

Efficient Augmentation via Data Subsampling

Data augmentation is commonly used to encode invariances in learning methods. However, this process is often performed in an inefficient manner, as artificial examples are created by applying a number of transformations to all points in the…

Machine Learning · Computer Science 2019-03-04 Michael Kuchnik, Virginia Smith

Neural Data-to-Text Generation with LM-based Text Augmentation

For many new application domains for data-to-text generation, the main obstacle in training neural models consists of a lack of training data. While usually large numbers of instances are available on the data side, often only very few text…

Computation and Language · Computer Science 2021-02-09 Ernie Chang, Xiaoyu Shen, Dawei Zhu, Vera Demberg, Hui Su

MixPro: Simple yet Effective Data Augmentation for Prompt-based Learning

Prompt-based learning has shown considerable promise in reformulating various downstream tasks as cloze problems by combining original input with a predetermined template. This approach demonstrates its effectiveness, especially in few-shot…

Computation and Language · Computer Science 2023-11-14 Bohan Li, Longxu Dou, Yutai Hou, Yunlong Feng, Honglin Mu, Qingfu Zhu, Qinghua Sun, Wanxiang Che

Self-Augmentation: Generalizing Deep Networks to Unseen Classes for Few-Shot Learning

Few-shot learning aims to classify unseen classes with a few training examples. While recent works have shown that standard mini-batch training with a carefully designed training strategy can improve generalization ability for unseen…

Machine Learning · Computer Science 2021-03-02 Jin-Woo Seo, Hong-Gyu Jung, Seong-Whan Lee

Effective Data Augmentation With Diffusion Models

Data augmentation is one of the most prevalent tools in deep learning, underpinning many recent advances, including those from classification, generative models, and representation learning. The standard approach to data augmentation…

Computer Vision and Pattern Recognition · Computer Science 2025-06-12 Brandon Trabucco, Kyle Doherty, Max Gurinas, Ruslan Salakhutdinov

MixBoost: Synthetic Oversampling with Boosted Mixup for Handling Extreme Imbalance

Training a classification model on a dataset where the instances of one class outnumber those of the other class is a challenging problem. Such imbalanced datasets are standard in real-world situations such as fraud detection, medical…

Machine Learning · Computer Science 2020-09-04 Anubha Kabra, Ayush Chopra, Nikaash Puri, Pinkesh Badjatiya, Sukriti Verma, Piyush Gupta, Balaji K

Data Augmentation for Neural Online Chat Response Selection

Data augmentation seeks to manipulate the available training data to improve the generalization ability of models. We investigate two data augmentation proxies, permutation and flipping, for the neural dialog response selection task on…

Computation and Language · Computer Science 2018-09-05 Wenchao Du, Alan W Black

Few-shot learning of neural networks from scratch by pseudo example optimization

In this paper, we propose a simple but effective method for training neural networks with a limited amount of training data. Our approach inherits the idea of knowledge distillation that transfers knowledge from a deep or wide reference…

Machine Learning · Statistics 2018-07-06 Akisato Kimura, Zoubin Ghahramani, Koh Takeuchi, Tomoharu Iwata, Naonori Ueda

Data Extrapolation for Text-to-image Generation on Small Datasets

Text-to-image generation requires a large amount of training data to synthesize high-quality images. For augmenting training data, previous methods rely on data interpolations like cropping, flipping, and mixing up, which fail to introduce…

Computer Vision and Pattern Recognition · Computer Science 2024-10-03 Senmao Ye, Fei Liu

MaxDropoutV2: An Improved Method to Drop out Neurons in Convolutional Neural Networks

In the last decade, exponential data growth has fueled the capacity of machine learning-based algorithms and enabled their use in daily-life activities. Additionally, this improvement is partially explained by the advent of deep…

Machine Learning · Computer Science 2022-03-08 Claudio Filipi Goncalves do Santos, Mateus Roder, Leandro A. Passos, João P. Papa

Prompt Selection and Augmentation for Few Examples Code Generation in Large Language Model and its Application in Robotics Control

Few-shot prompting and step-by-step reasoning have enhanced the capabilities of Large Language Models (LLMs) in tackling complex tasks including code generation. In this paper, we introduce a prompt selection and augmentation algorithm…

Robotics · Computer Science 2024-03-21 On Tai Wu, Frodo Kin Sun Chan, Zunhao Zhang, Yan Nei Law, Benny Drescher, Edmond Shiao Bun Lai

Reweighting Augmented Samples by Minimizing the Maximal Expected Loss

Data augmentation is an effective technique to improve the generalization of deep neural networks. However, previous data augmentation methods usually treat the augmented samples equally without considering their individual impacts on the…

Machine Learning · Computer Science 2021-03-17 Mingyang Yi, Lu Hou, Lifeng Shang, Xin Jiang, Qun Liu, Zhi-Ming Ma

Engression: Extrapolation through the Lens of Distributional Regression

Distributional regression aims to estimate the full conditional distribution of a target variable, given covariates. Popular methods include linear and tree-ensemble based quantile regression. We propose a neural network-based…

Methodology · Statistics 2024-07-08 Xinwei Shen, Nicolai Meinshausen