Related papers: Data Boost: Text Data Augmentation Through Reinfor…

Data Augmentation in Natural Language Processing: A Novel Text Generation Approach for Long and Short Text Classifiers

In many cases of machine learning, research suggests that the development of training data might have a higher relevance than the choice and modelling of classifiers themselves. Thus, data augmentation methods have been developed to improve…

Computation and Language · Computer Science · 2022-07-25 · Markus Bayer, Marc-André Kaufhold, Björn Buchhold, Marcel Keller, Jörg Dallmeyer, Christian Reuter

BDA: Bangla Text Data Augmentation Framework

Data augmentation involves generating synthetic samples that resemble those in a given dataset. In resource-limited fields where high-quality data is scarce, augmentation plays a crucial role in increasing the volume of training data. This…

Computation and Language · Computer Science · 2024-12-30 · Md. Tariquzzaman, Audwit Nafi Anam, Naimul Haque, Mohsinul Kabir, Hasan Mahmud, Md Kamrul Hasan

A Survey on Data Augmentation for Text Classification

Data augmentation, the artificial creation of training data for machine learning by transformations, is a widely studied research field across machine learning disciplines. While it is useful for increasing a model's generalization…

Computation and Language · Computer Science · 2022-09-09 · Markus Bayer, Marc-André Kaufhold, Christian Reuter

Data Augmentation for Text Generation Without Any Augmented Data

Data augmentation is an effective way to improve the performance of many neural text generation models. However, current data augmentation methods need to define or choose proper data mapping functions that map the original samples into the…

Computation and Language · Computer Science · 2021-05-31 · Wei Bi, Huayang Li, Jiacheng Huang

Boosting for Functional Data

We deal with the task of supervised learning if the data is of functional type. The crucial point is the choice of the appropriate fitting method (learner). Boosting is a stepwise technique that combines learners in such a way that the…

Statistics Theory · Mathematics · 2007-06-13 · Nicole Kraemer

Selective Text Augmentation with Word Roles for Low-Resource Text Classification

Data augmentation techniques are widely used in text classification tasks to improve the performance of classifiers, especially in low-resource scenarios. Most previous methods conduct text augmentation without considering the different…

Computation and Language · Computer Science · 2022-09-07 · Biyang Guo, Songqiao Han, Hailiang Huang

Guiding Generative Language Models for Data Augmentation in Few-Shot Text Classification

Data augmentation techniques are widely used for enhancing the performance of machine learning models by tackling class imbalance issues and data sparsity. State-of-the-art generative language models have been shown to provide significant…

Computation and Language · Computer Science · 2023-01-10 · Aleksandra Edwards, Asahi Ushio, Jose Camacho-Collados, Hélène de Ribaupierre, Alun Preece

On Automatic Data Augmentation for 3D Point Cloud Classification

Data augmentation is an important technique to reduce overfitting and improve learning performance, but existing works on data augmentation for 3D point cloud data are based on heuristics. In this work, we instead propose to automatically…

Computer Vision and Pattern Recognition · Computer Science · 2021-12-21 · Wanyue Zhang, Xun Xu, Fayao Liu, Le Zhang, Chuan-Sheng Foo

Neural Data-to-Text Generation with LM-based Text Augmentation

For many new application domains for data-to-text generation, the main obstacle in training neural models consists of a lack of training data. While usually large numbers of instances are available on the data side, often only very few text…

Computation and Language · Computer Science · 2021-02-09 · Ernie Chang, Xiaoyu Shen, Dawei Zhu, Vera Demberg, Hui Su

Data Augmentation for Traffic Classification

Data Augmentation (DA) -- enriching training data by adding synthetic samples -- is a technique widely adopted in Computer Vision (CV) and Natural Language Processing (NLP) tasks to improve model performance. Yet, DA has struggled to gain…

Machine Learning · Computer Science · 2024-01-24 · Chao Wang, Alessandro Finamore, Pietro Michiardi, Massimo Gallo, Dario Rossi

Controlled Text Generation for Data Augmentation in Intelligent Artificial Agents

Data availability is a bottleneck during early stages of development of new capabilities for intelligent artificial agents. We investigate the use of text generation techniques to augment the training data of a popular commercial artificial…

Computation and Language · Computer Science · 2019-10-09 · Nikolaos Malandrakis, Minmin Shen, Anuj Goyal, Shuyang Gao, Abhishek Sethi, Angeliki Metallinou

Not Enough Data? Deep Learning to the Rescue!

Based on recent advances in natural language modeling and those in text generation capabilities, we propose a novel data augmentation method for text classification tasks. We use a powerful pre-trained neural network model to artificially…

Computation and Language · Computer Science · 2019-11-28 · Ateret Anaby-Tavor, Boaz Carmeli, Esther Goldbraich, Amir Kantor, George Kour, Segev Shlomov, Naama Tepper, Naama Zwerdling

Leveraging Data Augmentation for Process Information Extraction

Business Process Modeling projects often require formal process models as a central component. High costs associated with the creation of such formal process models motivated many different fields of research aimed at automated generation…

Computation and Language · Computer Science · 2024-04-12 · Julian Neuberger, Leonie Doll, Benedict Engelmann, Lars Ackermann, Stefan Jablonski

Text AutoAugment: Learning Compositional Augmentation Policy for Text Classification

Data augmentation aims to enrich training samples for alleviating the overfitting issue in low-resource or class-imbalanced situations. Traditional methods first devise task-specific operations such as Synonym Substitute, then preset the…

Computation and Language · Computer Science · 2021-09-03 · Shuhuai Ren, Jinchao Zhang, Lei Li, Xu Sun, Jie Zhou

Improving Automated Feedback Systems for Tutor Training in Low-Resource Scenarios through Data Augmentation

Tutoring is an effective instructional method for enhancing student learning, yet its success relies on the skill and experience of the tutors. This reliance presents challenges for the widespread implementation of tutoring, particularly in…

Human-Computer Interaction · Computer Science · 2025-10-21 · Chentianye Xu, Jionghao Lin, Tongshuang Wu, Vincent Aleven, Kenneth R. Koedinger

Text Data Augmentation Made Simple By Leveraging NLP Cloud APIs

In practice, it is common to find oneself with far too little text data to train a deep neural network. This "Big Data Wall" represents a challenge for minority language communities on the Internet, organizations, laboratories and companies…

Computation and Language · Computer Science · 2018-12-13 · Claude Coulombe

Boosting Source Code Learning with Text-Oriented Data Augmentation: An Empirical Study

Recent studies have demonstrated remarkable advancements in source code learning, which applies deep neural networks (DNNs) to tackle various software engineering tasks. Similar to other DNN-based domains, source code learning also requires…

Software Engineering · Computer Science · 2025-02-07 · Zeming Dong, Qiang Hu, Yuejun Guo, Zhenya Zhang, Maxime Cordy, Mike Papadakis, Yves Le Traon, Jianjun Zhao

EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks

We present EDA: easy data augmentation techniques for boosting performance on text classification tasks. EDA consists of four simple but powerful operations: synonym replacement, random insertion, random swap, and random deletion. On five…

Computation and Language · Computer Science · 2019-08-27 · Jason Wei, Kai Zou

Rethink the Effectiveness of Text Data Augmentation: An Empirical Analysis

In recent years, language models (LMs) have made remarkable progress in advancing the field of natural language processing (NLP). However, the impact of data augmentation (DA) techniques on the fine-tuning (FT) performance of these LMs has…

Computation and Language · Computer Science · 2023-06-14 · Zhengxiang Shi, Aldo Lipani

Distributional Data Augmentation Methods for Low Resource Language

Text augmentation is a technique for constructing synthetic data from an under-resourced corpus to improve predictive performance. Synthetic data generation is common in numerous domains. However, recently text augmentation has emerged in…

Computation and Language · Computer Science · 2023-09-12 · Mosleh Mahamud, Zed Lee, Isak Samsten