Related papers: Parallel Data Augmentation for Formality Style Tra…

200 papers

Formality style transformation is the task of modifying the formality of a given sentence without changing its content. Its challenge is the lack of large-scale sentence-aligned parallel data. In this paper, we propose an omnivorous model…

Computation and Language · Computer Science · 2019-03-18 · Ruochen Xu, Tao Ge, Furu Wei

Scarcity of parallel data causes formality style transfer models to have scarce success in preserving content. We show that fine-tuning pre-trained language (GPT-2) and sequence-to-sequence (BART) models boosts content preservation, and…

Computation and Language · Computer Science · 2021-07-06 · Huiyuan Lai, Antonio Toral, Malvina Nissim

Formality style transfer (FST) is a task that involves paraphrasing an informal sentence into a formal one without altering its meaning. To address the data-scarcity problem of existing parallel datasets, previous studies tend to adopt a…

Computation and Language · Computer Science · 2022-03-28 · Ao Liu, An Wang, Naoaki Okazaki

Style transfer is the task of automatically transforming a piece of text in one particular style into another. A major barrier to progress in this field has been a lack of training and evaluation datasets, as well as benchmarks and…

Computation and Language · Computer Science · 2018-04-17 · Sudha Rao, Joel Tetreault

We introduce style augmentation, a new form of data augmentation based on random style transfer, for improving the robustness of convolutional neural networks (CNN) over both classification and regression based tasks. During training, our…

Computer Vision and Pattern Recognition · Computer Science · 2019-04-15 · Philip T. Jackson, Amir Atapour-Abarghouei, Stephen Bonner, Toby Breckon, Boguslaw Obara

In the context of neural machine translation, data augmentation (DA) techniques may be used for generating additional training samples when the available parallel data are scarce. Many DA approaches aim at expanding the support of the…

Computation and Language · Computer Science · 2021-09-09 · Víctor M. Sánchez-Cartagena, Miquel Esplà-Gomis, Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez

Despite the evolution of language models, they continue to portray harmful societal biases and stereotypes inadvertently learned from training data. These inherent biases often result in detrimental effects in various applications.…

Computation and Language · Computer Science · 2024-07-24 · Ewoenam Kwaku Tokpo, Toon Calders

In recent years, language models (LMs) have made remarkable progress in advancing the field of natural language processing (NLP). However, the impact of data augmentation (DA) techniques on the fine-tuning (FT) performance of these LMs has…

Computation and Language · Computer Science · 2023-06-14 · Zhengxiang Shi, Aldo Lipani

Despite the rapid growth in model architecture, the scarcity of large parallel corpora remains the main bottleneck in Neural Machine Translation. Data augmentation is a technique that enhances the performance of data-hungry models by…

Computation and Language · Computer Science · 2023-11-14 · Seokjin Oh, Su Ah Lee, Woohwan Jung

One major challenge of translating code between programming languages is that parallel training data is often limited. To overcome this challenge, we present two data augmentation techniques, one that builds comparable corpora (i.e., code…

Computation and Language · Computer Science · 2024-10-07 · Yiqing Xie, Atharva Naik, Daniel Fried, Carolyn Rose

Formality style transfer is the task of converting informal sentences to grammatically-correct formal sentences, which can be used to improve performance of many downstream NLP tasks. In this work, we propose a semi-supervised formality…

Computation and Language · Computer Science · 2020-10-13 · Kunal Chawla, Diyi Yang

The availability of parallel texts is crucial to the performance of machine translation models. However, most of the world's languages face the predominant challenge of data scarcity. In this paper, we propose strategies to synthesize…

Computation and Language · Computer Science · 2024-02-06 · Md Mahfuz Ibn Alam, Sina Ahmadi, Antonios Anastasopoulos

Text-style transfer aims to convert text given in one domain into another by paraphrasing the sentence or substituting the keywords without altering the content. By necessity, state-of-the-art methods have evolved to accommodate nonparallel…

Computation and Language · Computer Science · 2021-06-22 · Xing Han, Jessica Lundin

In this paper, we propose a two-phase training approach where pre-trained large language models are continually pre-trained on parallel data and then supervised fine-tuned with a small amount of high-quality parallel data. To investigate…

Computation and Language · Computer Science · 2024-07-04 · Minato Kondo, Takehito Utsuro, Masaaki Nagata

Large language models (LLMs) have demonstrated impressive translation capabilities even without being explicitly trained on parallel data. This remarkable property has led some to believe that parallel data is no longer necessary for…

Computation and Language · Computer Science · 2025-06-17 · Muhammad Reza Qorib, Junyi Li, Hwee Tou Ng

Text style transfer without parallel data has achieved some practical success. However, in the scenario where less data is available, these methods may yield poor performance. In this paper, we examine domain adaptation for text style…

Computation and Language · Computer Science · 2019-08-27 · Dianqi Li, Yizhe Zhang, Zhe Gan, Yu Cheng, Chris Brockett, Ming-Ting Sun, Bill Dolan

Automatic text simplification systems help to reduce textual information barriers on the internet. However, for languages other than English, only few parallel data to train these systems exists. We propose a two-step approach to overcome…

Computation and Language · Computer Science · 2023-11-08 · Miriam Anschütz, Joshua Oehms, Thomas Wimmer, Bartłomiej Jezierski, Georg Groh

Advances in natural language processing, such as transfer learning from pre-trained language models, have impacted how models are trained for programming language tasks too. Previous research primarily explored code pre-training and…

Computation and Language · Computer Science · 2023-02-08 · Pinzhen Chen, Gerasimos Lampouras

Style transfer aims to rewrite a source text in a different target style while preserving its content. We propose a novel approach to this task that leverages generic resources, and without using any task-specific parallel (source-target)…

Computation and Language · Computer Science · 2021-09-13 · Huiyuan Lai, Antonio Toral, Malvina Nissim

The rapid advancement in Large Language Models has been met with significant challenges in their training processes, primarily due to their considerable computational and memory demands. This research examines parallelization techniques…

Distributed, Parallel, and Cluster Computing · Computer Science · 2024-05-27 · Ishan Patwardhan, Shubham Gandhi, Om Khare, Amit Joshi, Suraj Sawant