Related papers: Reverse Operation based Data Augmentation for Solv…

Data Augmentation with In-Context Learning and Comparative Evaluation in Math Word Problem Solving

Math Word Problem (MWP) solving presents a challenging task in Natural Language Processing (NLP). This study aims to provide MWP solvers with a more diverse training set, ultimately improving their ability to solve various math problems. We…

Computation and Language · Computer Science · 2024-05-02 · Gulsum Yigit, Mehmet Fatih Amasyali

Deterministic Reversible Data Augmentation for Neural Machine Translation

Data augmentation is an effective way to diversify corpora in machine translation, but previous methods may introduce semantic inconsistency between original and augmented data because of irreversible operations and random subword sampling…

Computation and Language · Computer Science · 2025-02-21 · Jiashu Yao, Heyan Huang, Zeming Liu, Yuhang Guo

Rethinking Data Augmentation for Low-Resource Neural Machine Translation: A Multi-Task Learning Approach

In the context of neural machine translation, data augmentation (DA) techniques may be used for generating additional training samples when the available parallel data are scarce. Many DA approaches aim at expanding the support of the…

Computation and Language · Computer Science · 2021-09-09 · Víctor M. Sánchez-Cartagena, Miquel Esplà-Gomis, Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez

Abstract Meaning Representation-Based Logic-Driven Data Augmentation for Logical Reasoning

Combining large language models with logical reasoning enhances their capacity to address problems in a robust and reliable manner. Nevertheless, the intricate nature of logical reasoning poses challenges when gathering reliable data from…

Computation and Language · Computer Science · 2025-04-18 · Qiming Bao, Alex Yuxuan Peng, Zhenyun Deng, Wanjun Zhong, Gael Gendron, Timothy Pistotti, Neset Tan, Nathan Young, Yang Chen, Yonghua Zhu, Paul Denny, Michael Witbrock, Jiamou Liu

Soft Contextual Data Augmentation for Neural Machine Translation

While data augmentation is an important trick to boost the accuracy of deep learning methods in computer vision tasks, its study in natural language tasks is still very limited. In this paper, we present a novel data augmentation method for…

Computation and Language · Computer Science · 2019-05-28 · Jinhua Zhu, Fei Gao, Lijun Wu, Yingce Xia, Tao Qin, Wengang Zhou, Xueqi Cheng, Tie-Yan Liu

CoDA: Contrast-enhanced and Diversity-promoting Data Augmentation for Natural Language Understanding

Data augmentation has been demonstrated as an effective strategy for improving model generalization and data efficiency. However, due to the discrete nature of natural language, designing label-preserving transformations for text data tends…

Computation and Language · Computer Science · 2020-10-20 · Yanru Qu, Dinghan Shen, Yelong Shen, Sandra Sajeev, Jiawei Han, Weizhu Chen

Practice Makes a Solver Perfect: Data Augmentation for Math Word Problem Solvers

Existing Math Word Problem (MWP) solvers have achieved high accuracy on benchmark datasets. However, prior works have shown that such solvers do not generalize well and rely on superficial cues to achieve high performance. In this paper, we…

Computation and Language · Computer Science · 2022-05-03 · Vivek Kumar, Rishabh Maheshwary, Vikram Pudi

Generalization in Reinforcement Learning by Soft Data Augmentation

Extensive efforts have been made to improve the generalization ability of Reinforcement Learning (RL) methods via domain randomization and data augmentation. However, as more factors of variation are introduced during training, optimization…

Machine Learning · Computer Science · 2021-04-12 · Nicklas Hansen, Xiaolong Wang

Rethinking Rotation in Self-Supervised Contrastive Learning: Adaptive Positive or Negative Data Augmentation

Rotation is frequently listed as a candidate for data augmentation in contrastive learning but seldom provides satisfactory improvements. We argue that this is because the rotated image is always treated as either positive or negative. The…

Computer Vision and Pattern Recognition · Computer Science · 2022-11-28 · Atsuyuki Miyai, Qing Yu, Daiki Ikami, Go Irie, Kiyoharu Aizawa

Retrieval-Augmented Data Augmentation for Low-Resource Domain Tasks

Despite large successes of recent language models on diverse tasks, they suffer from severe performance degeneration in low-resource settings with limited training data available. Many existing works tackle this problem by generating…

Computation and Language · Computer Science · 2024-02-22 · Minju Seo, Jinheon Baek, James Thorne, Sung Ju Hwang

Syntax-aware Data Augmentation for Neural Machine Translation

Data augmentation is an effective performance enhancement in neural machine translation (NMT) by generating additional bilingual data. In this paper, we propose a novel data augmentation enhancement strategy for neural machine translation.…

Computation and Language · Computer Science · 2020-04-30 · Sufeng Duan, Hai Zhao, Dongdong Zhang, Rui Wang

Not Enough Data? Deep Learning to the Rescue!

Based on recent advances in natural language modeling and those in text generation capabilities, we propose a novel data augmentation method for text classification tasks. We use a powerful pre-trained neural network model to artificially…

Computation and Language · Computer Science · 2019-11-28 · Ateret Anaby-Tavor, Boaz Carmeli, Esther Goldbraich, Amir Kantor, George Kour, Segev Shlomov, Naama Tepper, Naama Zwerdling

MathQA: Towards Interpretable Math Word Problem Solving with Operation-Based Formalisms

We introduce a large-scale dataset of math word problems and an interpretable neural math problem solver that learns to map problems to operation programs. Due to annotation challenges, current datasets in this domain have been either…

Computation and Language · Computer Science · 2019-06-03 · Aida Amini, Saadia Gabriel, Peter Lin, Rik Koncel-Kedziorski, Yejin Choi, Hannaneh Hajishirzi

Data Augmentation Approaches in Natural Language Processing: A Survey

As an effective strategy, data augmentation (DA) alleviates data scarcity scenarios where deep learning techniques may fail. It is widely applied in computer vision then introduced to natural language processing and achieves improvements in…

Computation and Language · Computer Science · 2022-06-28 · Bohan Li, Yutai Hou, Wanxiang Che

SwitchOut: an Efficient Data Augmentation Algorithm for Neural Machine Translation

In this work, we examine methods for data augmentation for text-based tasks such as neural machine translation (NMT). We formulate the design of a data augmentation policy with desirable properties as an optimization problem, and derive a…

Computation and Language · Computer Science · 2018-08-29 · Xinyi Wang, Hieu Pham, Zihang Dai, Graham Neubig

Data Augmentation to Address Out-of-Vocabulary Problem in Low-Resource Sinhala-English Neural Machine Translation

Out-of-Vocabulary (OOV) is a problem for Neural Machine Translation (NMT). OOV refers to words with a low occurrence in the training data, or to those that are absent from the training data. To alleviate this, word or phrase-based Data…

Computation and Language · Computer Science · 2022-05-19 · Aloka Fernando, Surangika Ranathunga

N-Best Hypotheses Reranking for Text-To-SQL Systems

Text-to-SQL task maps natural language utterances to structured queries that can be issued to a database. State-of-the-art (SOTA) systems rely on finetuning large, pre-trained language models in conjunction with constrained decoding…

Computation and Language · Computer Science · 2022-10-20 · Lu Zeng, Sree Hari Krishnan Parthasarathi, Dilek Hakkani-Tur

SSMBA: Self-Supervised Manifold Based Data Augmentation for Improving Out-of-Domain Robustness

Models that perform well on a training domain often fail to generalize to out-of-domain (OOD) examples. Data augmentation is a common method used to prevent overfitting and improve OOD generalization. However, in natural language, it is…

Computation and Language · Computer Science · 2020-10-06 · Nathan Ng, Kyunghyun Cho, Marzyeh Ghassemi

SDA: Simple Discrete Augmentation for Contrastive Sentence Representation Learning

Contrastive learning has recently achieved compelling performance in unsupervised sentence representation. As an essential element, data augmentation protocols, however, have not been well explored. The pioneering work SimCSE resorting to a…

Computation and Language · Computer Science · 2024-06-17 · Dongsheng Zhu, Zhenyu Mao, Jinghui Lu, Rui Zhao, Fei Tan

SDA: Improving Text Generation with Self Data Augmentation

Data augmentation has been widely used to improve deep neural networks in many research fields, such as computer vision. However, less work has been done in the context of text, partially due to its discrete nature and the complexity of…

Computation and Language · Computer Science · 2021-01-12 · Ping Yu, Ruiyi Zhang, Yang Zhao, Yizhe Zhang, Chunyuan Li, Changyou Chen