Related papers: Text AutoAugment: Learning Compositional Augmentat…

Data Augmentation Policy Search for Long-Term Forecasting

Data augmentation serves as a popular regularization technique to combat overfitting challenges in neural networks. While automatic augmentation has demonstrated success in image classification tasks, its application to time-series…

Machine Learning · Computer Science 2025-06-19 Liran Nochumsohn, Omri Azencot

Selective Text Augmentation with Word Roles for Low-Resource Text Classification

Data augmentation techniques are widely used in text classification tasks to improve the performance of classifiers, especially in low-resource scenarios. Most previous methods conduct text augmentation without considering the different…

Computation and Language · Computer Science 2022-09-07 Biyang Guo, Songqiao Han, Hailiang Huang

STA: Self-controlled Text Augmentation for Improving Text Classifications

Despite recent advancements in Machine Learning, many tasks still involve working in low-data regimes which can make solving natural language problems difficult. Recently, a number of text augmentation techniques have emerged in the field…

Computation and Language · Computer Science 2023-02-27 Congcong Wang, Gonzalo Fiz Pontiveros, Steven Derby, Tri Kurniawan Wijaya

On Evaluation Protocols for Data Augmentation in a Limited Data Scenario

Textual data augmentation (DA) is a prolific field of study where novel techniques to create artificial data are regularly proposed, and that has demonstrated great efficiency on small data settings, at least for text classification tasks.…

Computation and Language · Computer Science 2024-09-18 Frédéric Piedboeuf, Philippe Langlais

Improved Text Classification via Test-Time Augmentation

Test-time augmentation -- the aggregation of predictions across transformed examples of test inputs -- is an established technique to improve the performance of image classification models. Importantly, TTA can be used to improve model…

Machine Learning · Computer Science 2022-06-29 Helen Lu, Divya Shanmugam, Harini Suresh, John Guttag

Distributional Data Augmentation Methods for Low Resource Language

Text augmentation is a technique for constructing synthetic data from an under-resourced corpus to improve predictive performance. Synthetic data generation is common in numerous domains. However, recently text augmentation has emerged in…

Computation and Language · Computer Science 2023-09-12 Mosleh Mahamud, Zed Lee, Isak Samsten

A Survey on Data Augmentation for Text Classification

Data augmentation, the artificial creation of training data for machine learning by transformations, is a widely studied research field across machine learning disciplines. While it is useful for increasing a model's generalization…

Computation and Language · Computer Science 2022-09-09 Markus Bayer, Marc-André Kaufhold, Christian Reuter

AEDA: An Easier Data Augmentation Technique for Text Classification

This paper proposes AEDA (An Easier Data Augmentation) technique to help improve the performance on text classification tasks. AEDA includes only random insertion of punctuation marks into the original text. This is an easier technique to…

Computation and Language · Computer Science 2021-08-31 Akbar Karimi, Leonardo Rossi, Andrea Prati

Learning Optimal Data Augmentation Policies via Bayesian Optimization for Image Classification Tasks

In recent years, deep learning has achieved remarkable achievements in many fields, including computer vision, natural language processing, speech recognition and others. Adequate training data is the key to ensure the effectiveness of the…

Machine Learning · Computer Science 2019-05-24 Chunxu Zhang, Jiaxu Cui, Bo Yang

Cross Encoding as Augmentation: Towards Effective Educational Text Classification

Text classification in education, usually called auto-tagging, is the automated process of assigning relevant tags to educational content, such as questions and textbooks. However, auto-tagging suffers from a data scarcity problem, which…

Computation and Language · Computer Science 2023-06-01 Hyun Seung Lee, Seungtaek Choi, Yunsung Lee, Hyeongdon Moon, Shinhyeok Oh, Myeongho Jeong, Hyojun Go, Christian Wallraven

Data Augmentation for Traffic Classification

Data Augmentation (DA) -- enriching training data by adding synthetic samples -- is a technique widely adopted in Computer Vision (CV) and Natural Language Processing (NLP) tasks to improve models performance. Yet, DA has struggled to gain…

Machine Learning · Computer Science 2024-01-24 Chao Wang, Alessandro Finamore, Pietro Michiardi, Massimo Gallo, Dario Rossi

AutoAugment: Learning Augmentation Policies from Data

Data augmentation is an effective technique for improving the accuracy of modern image classifiers. However, current data augmentation implementations are manually designed. In this paper, we describe a simple procedure called AutoAugment…

Computer Vision and Pattern Recognition · Computer Science 2019-04-15 Ekin D. Cubuk, Barret Zoph, Dandelion Mane, Vijay Vasudevan, Quoc V. Le

What Have Been Learned & What Should Be Learned? An Empirical Study of How to Selectively Augment Text for Classification

Text augmentation techniques are widely used in text classification problems to improve the performance of classifiers, especially in low-resource scenarios. Whilst lots of creative text augmentation methods have been designed, they augment…

Computation and Language · Computer Science 2021-09-02 Biyang Guo, Sonqiao Han, Hailiang Huang

Learning data augmentation policies using augmented random search

Previous attempts for data augmentation are designed manually, and the augmentation policies are dataset-specific. Recently, an automatic data augmentation approach, named AutoAugment, is proposed using reinforcement learning. AutoAugment…

Machine Learning · Computer Science 2018-11-13 Mingyang Geng, Kele Xu, Bo Ding, Huaimin Wang, Lei Zhang

AutoAugment Is What You Need: Enhancing Rule-based Augmentation Methods in Low-resource Regimes

Text data augmentation is a complex problem due to the discrete nature of sentences. Although rule-based augmentation methods are widely adopted in real-world applications because of their simplicity, they suffer from potential semantic…

Computation and Language · Computer Science 2024-02-09 Juhwan Choi, Kyohoon Jin, Junho Lee, Sangmin Song, Youngbin Kim

Deep AutoAugment

While recent automated data augmentation methods lead to state-of-the-art results, their design spaces and the derived data augmentation strategies still incorporate strong human priors. In this work, instead of fixing a set of hand-picked…

Computer Vision and Pattern Recognition · Computer Science 2022-03-16 Yu Zheng, Zhi Zhang, Shen Yan, Mi Zhang

SDA: Improving Text Generation with Self Data Augmentation

Data augmentation has been widely used to improve deep neural networks in many research fields, such as computer vision. However, less work has been done in the context of text, partially due to its discrete nature and the complexity of…

Computation and Language · Computer Science 2021-01-12 Ping Yu, Ruiyi Zhang, Yang Zhao, Yizhe Zhang, Chunyuan Li, Changyou Chen

A Survey of Automated Data Augmentation Algorithms for Deep Learning-based Image Classification Tasks

In recent years, one of the most popular techniques in the computer vision community has been the deep learning technique. As a data-driven technique, deep model requires enormous amounts of accurately labelled training data, which is often…

Computer Vision and Pattern Recognition · Computer Science 2022-10-10 Zihan Yang, Richard O. Sinnott, James Bailey, Qiuhong Ke

DAGAM: Data Augmentation with Generation And Modification

Text classification is a representative downstream task of natural language processing, and has exhibited excellent performance since the advent of pre-trained language models based on Transformer architecture. However, in pre-trained…

Computation and Language · Computer Science 2022-04-07 Byeong-Cheol Jo, Tak-Sung Heo, Yeongjoon Park, Yongmin Yoo, Won Ik Cho, Kyungsun Kim

Data Augmentation in Natural Language Processing: A Novel Text Generation Approach for Long and Short Text Classifiers

In many cases of machine learning, research suggests that the development of training data might have a higher relevance than the choice and modelling of classifiers themselves. Thus, data augmentation methods have been developed to improve…

Computation and Language · Computer Science 2022-07-25 Markus Bayer, Marc-André Kaufhold, Björn Buchhold, Marcel Keller, Jörg Dallmeyer, Christian Reuter