English
Related papers

Related papers: STA: Self-controlled Text Augmentation for Improvi…

200 papers

Data augmentation techniques are widely used in text classification tasks to improve the performance of classifiers, especially in low-resource scenarios. Most previous methods conduct text augmentation without considering the different…

Computation and Language · Computer Science 2022-09-07 Biyang Guo , Songqiao Han , Hailiang Huang

Text augmentation techniques are widely used in text classification problems to improve the performance of classifiers, especially in low-resource scenarios. Whilst lots of creative text augmentation methods have been designed, they augment…

Computation and Language · Computer Science 2021-09-02 Biyang Guo , Sonqiao Han , Hailiang Huang

Data augmentation has been widely used to improve deep neural networks in many research fields, such as computer vision. However, less work has been done in the context of text, partially due to its discrete nature and the complexity of…

Computation and Language · Computer Science 2021-01-12 Ping Yu , Ruiyi Zhang , Yang Zhao , Yizhe Zhang , Chunyuan Li , Changyou Chen

Data augmentation aims to enrich training samples for alleviating the overfitting issue in low-resource or class-imbalanced situations. Traditional methods first devise task-specific operations such as Synonym Substitute, then preset the…

Computation and Language · Computer Science 2021-09-03 Shuhuai Ren , Jinchao Zhang , Lei Li , Xu Sun , Jie Zhou

In many cases of machine learning, research suggests that the development of training data might have a higher relevance than the choice and modelling of classifiers themselves. Thus, data augmentation methods have been developed to improve…

Computation and Language · Computer Science 2022-07-25 Markus Bayer , Marc-André Kaufhold , Björn Buchhold , Marcel Keller , Jörg Dallmeyer , Christian Reuter

Despite their recent successes in tackling many NLP tasks, large-scale pre-trained language models do not perform as well in few-shot settings where only a handful of training examples are available. To address this shortcoming, we propose…

Computation and Language · Computer Science 2022-04-13 Tu Vu , Minh-Thang Luong , Quoc V. Le , Grady Simon , Mohit Iyyer

Test-time augmentation -- the aggregation of predictions across transformed examples of test inputs -- is an established technique to improve the performance of image classification models. Importantly, TTA can be used to improve model…

Machine Learning · Computer Science 2022-06-29 Helen Lu , Divya Shanmugam , Harini Suresh , John Guttag

Text augmentation is a technique for constructing synthetic data from an under-resourced corpus to improve predictive performance. Synthetic data generation is common in numerous domains. However, recently text augmentation has emerged in…

Computation and Language · Computer Science 2023-09-12 Mosleh Mahamud , Zed Lee , Isak Samsten

This study conducts a thorough evaluation of text augmentation techniques across a variety of datasets and natural language processing (NLP) tasks to address the lack of reliable, generalized evidence for these methods. It examines the…

Computation and Language · Computer Science 2024-02-15 Himmet Toprak Kesgin , Mehmet Fatih Amasyali

Based on recent advances in natural language modeling and those in text generation capabilities, we propose a novel data augmentation method for text classification tasks. We use a powerful pre-trained neural network model to artificially…

Computation and Language · Computer Science 2019-11-28 Ateret Anaby-Tavor , Boaz Carmeli , Esther Goldbraich , Amir Kantor , George Kour , Segev Shlomov , Naama Tepper , Naama Zwerdling

Data availability is a bottleneck during early stages of development of new capabilities for intelligent artificial agents. We investigate the use of text generation techniques to augment the training data of a popular commercial artificial…

Computation and Language · Computer Science 2019-10-09 Nikolaos Malandrakis , Minmin Shen , Anuj Goyal , Shuyang Gao , Abhishek Sethi , Angeliki Metallinou

Self-training (ST) has prospered again in language understanding by augmenting the fine-tuning of pre-trained language models when labeled data is insufficient. However, it remains challenging to incorporate ST into attribute-controllable…

Computation and Language · Computer Science 2023-06-07 Yuxi Feng , Xiaoyuan Yi , Xiting Wang , Laks V. S. Lakshmanan , Xing Xie

Data augmentation, the artificial creation of training data for machine learning by transformations, is a widely studied research field across machine learning disciplines. While it is useful for increasing a model's generalization…

Computation and Language · Computer Science 2022-09-09 Markus Bayer , Marc-André Kaufhold , Christian Reuter

Textual data augmentation (DA) is a prolific field of study where novel techniques to create artificial data are regularly proposed, and that has demonstrated great efficiency on small data settings, at least for text classification tasks.…

Computation and Language · Computer Science 2024-09-18 Frédéric Piedboeuf , Philippe Langlais

In recent years, language models (LMs) have made remarkable progress in advancing the field of natural language processing (NLP). However, the impact of data augmentation (DA) techniques on the fine-tuning (FT) performance of these LMs has…

Computation and Language · Computer Science 2023-06-14 Zhengxiang Shi , Aldo Lipani

As an effective strategy, data augmentation (DA) alleviates data scarcity scenarios where deep learning techniques may fail. It is widely applied in computer vision then introduced to natural language processing and achieves improvements in…

Computation and Language · Computer Science 2022-06-28 Bohan Li , Yutai Hou , Wanxiang Che

Text data augmentation is an effective strategy for overcoming the challenge of limited sample sizes in many natural language processing (NLP) tasks. This challenge is especially prominent in the few-shot learning scenario, where the data…

In practice, it is common to find oneself with far too little text data to train a deep neural network. This "Big Data Wall" represents a challenge for minority language communities on the Internet, organizations, laboratories and companies…

Computation and Language · Computer Science 2018-12-13 Claude Coulombe

With the capabilities of understanding and executing natural language instructions, Large language models (LLMs) can potentially act as a powerful tool for textual data augmentation. However, the quality of augmented data depends heavily on…

Computation and Language · Computer Science 2024-04-30 Yichuan Li , Kaize Ding , Jianling Wang , Kyumin Lee

Business Process Modeling projects often require formal process models as a central component. High costs associated with the creation of such formal process models motivated many different fields of research aimed at automated generation…

Computation and Language · Computer Science 2024-04-12 Julian Neuberger , Leonie Doll , Benedict Engelmann , Lars Ackermann , Stefan Jablonski
‹ Prev 1 2 3 10 Next ›