Related papers: Text Simplification by Tagging

Text Generation with Text-Editing Models

Text-editing models have recently become a prominent alternative to seq2seq models for monolingual text-generation tasks such as grammatical error correction, simplification, and style transfer. These tasks share a common trait - they…

Computation and Language · Computer Science 2022-06-15 Eric Malmi, Yue Dong, Jonathan Mallinson, Aleksandr Chuklin, Jakub Adamek, Daniil Mirylenka, Felix Stahlberg, Sebastian Krause, Shankar Kumar, Aliaksei Severyn

Efficient Pre-Training with Token Superposition

Pre-training of Large Language Models is often prohibitively expensive and inefficient at scale, requiring complex and invasive modifications in order to achieve high data throughput. In this work, we present Token-Superposition Training…

Computation and Language · Computer Science 2026-05-20 Bowen Peng, Théo Gigant, Jeffrey Quesnelle

On Text Style Transfer via Style Masked Language Models

Text Style Transfer (TST) is performable through approaches such as latent space disentanglement, cycle-consistency losses, prototype editing etc. The prototype editing approach, which is known to be quite successful in TST, involves two…

Computation and Language · Computer Science 2022-10-13 Sharan Narasimhan, Pooja Shekar, Suvodip Dey, Maunendra Sankar Desarkar

Controlling Pre-trained Language Models for Grade-Specific Text Simplification

Text simplification (TS) systems rewrite text to make it more readable while preserving its content. However, what makes a text easy to read depends on the intended readers. Recent work has shown that pre-trained language models can…

Computation and Language · Computer Science 2023-12-01 Sweta Agrawal, Marine Carpuat

Consecutive Decoding for Speech-to-text Translation

Speech-to-text translation (ST), which directly translates the source language speech to the target language text, has attracted intensive attention recently. However, the combination of speech recognition and machine translation in a…

Computation and Language · Computer Science 2022-04-18 Qianqian Dong, Mingxuan Wang, Hao Zhou, Shuang Xu, Bo Xu, Lei Li

Transforming Sequence Tagging Into A Seq2Seq Task

Pretrained, large, generative language models (LMs) have had great success in a wide range of sequence tagging and structured prediction tasks. Casting a sequence tagging task as a Seq2Seq one requires deciding the formats of the input and…

Computation and Language · Computer Science 2022-10-26 Karthik Raman, Iftekhar Naim, Jiecao Chen, Kazuma Hashimoto, Kiran Yalasangi, Krishna Srinivasan

Controllable Text Simplification with Explicit Paraphrasing

Text Simplification improves the readability of sentences through several rewriting transformations, such as lexical paraphrasing, deletion, and splitting. Current simplification systems are predominantly sequence-to-sequence models that…

Computation and Language · Computer Science 2021-04-16 Mounica Maddela, Fernando Alva-Manchego, Wei Xu

LST: Lexicon-Guided Self-Training for Few-Shot Text Classification

Self-training provides an effective means of using an extremely small amount of labeled data to create pseudo-labels for unlabeled data. Many state-of-the-art self-training approaches hinge on different regularization methods to prevent…

Computation and Language · Computer Science 2022-02-08 Hazel Kim, Jaeman Son, Yo-Sub Han

Synchronous Speech Recognition and Speech-to-Text Translation with Interactive Decoding

Speech-to-text translation (ST), which translates source language speech into target language text, has attracted intensive attention in recent years. Compared to the traditional pipeline system, the end-to-end ST model has potential…

Computation and Language · Computer Science 2019-12-17 Yuchen Liu, Jiajun Zhang, Hao Xiong, Long Zhou, Zhongjun He, Hua Wu, Haifeng Wang, Chengqing Zong

(Psycho-)Linguistic Features Meet Transformer Models for Improved Explainable and Controllable Text Simplification

State-of-the-art text simplification (TS) systems adopt end-to-end neural network models to directly generate the simplified version of the input text, and usually function as a blackbox. Moreover, TS is usually treated as an all-purpose…

Computation and Language · Computer Science 2022-12-21 Yu Qiao, Xiaofei Li, Daniel Wiechmann, Elma Kerz

TST$^\mathrm{R}$: Target Similarity Tuning Meets the Real World

Target similarity tuning (TST) is a method of selecting relevant examples in natural language (NL) to code generation through large language models (LLMs) to improve performance. Its goal is to adapt a sentence embedding model to have the…

Artificial Intelligence · Computer Science 2023-10-31 Anirudh Khatry, Sumit Gulwani, Priyanshu Gupta, Vu Le, Ananya Singha, Mukul Singh, Gust Verbruggen

REST: Retrieval-Based Speculative Decoding

We introduce Retrieval-Based Speculative Decoding (REST), a novel algorithm designed to speed up language model generation. The key insight driving the development of REST is the observation that the process of text generation often…

Computation and Language · Computer Science 2024-04-05 Zhenyu He, Zexuan Zhong, Tianle Cai, Jason D. Lee, Di He

Towards Unsupervised Speech-to-Text Translation

We present a framework for building speech-to-text translation (ST) systems using only monolingual speech and text corpora, in other words, speech utterances from a source language and independent text from a target language. As opposed to…

Computation and Language · Computer Science 2018-11-06 Yu-An Chung, Wei-Hung Weng, Schrasing Tong, James Glass

Latent Speech-Text Transformer

Auto-regressive speech-text models pre-trained on interleaved text tokens and discretized speech tokens demonstrate strong speech understanding and generation, yet remain substantially less compute-efficient than text LLMs, partly due to…

Computation and Language · Computer Science 2026-03-11 Yen-Ju Lu, Yashesh Gaur, Wei Zhou, Benjamin Muller, Jesus Villalba, Najim Dehak, Luke Zettlemoyer, Gargi Ghosh, Mike Lewis, Srinivasan Iyer, Duc Le

Semi-Supervised Text Simplification with Back-Translation and Asymmetric Denoising Autoencoders

Text simplification (TS) rephrases long sentences into simplified variants while preserving inherent semantics. Traditional sequence-to-sequence models heavily rely on the quantity and quality of parallel sentences, which limits their…

Computation and Language · Computer Science 2020-05-01 Yanbin Zhao, Lu Chen, Zhi Chen, Kai Yu

Felix: Flexible Text Editing Through Tagging and Insertion

We present Felix, a flexible text-editing approach for generation, designed to derive the maximum benefit from the ideas of decoding with bi-directional contexts and self-supervised pre-training. In contrast to conventional…

Computation and Language · Computer Science 2020-03-25 Jonathan Mallinson, Aliaksei Severyn, Eric Malmi, Guillermo Garrido

Complexity-Weighted Loss and Diverse Reranking for Sentence Simplification

Sentence simplification is the task of rewriting texts so they are easier to understand. Recent research has applied sequence-to-sequence (Seq2Seq) models to this task, focusing largely on training-time improvements via reinforcement…

Computation and Language · Computer Science 2019-04-08 Reno Kriz, João Sedoc, Marianna Apidianaki, Carolina Zheng, Gaurav Kumar, Eleni Miltsakaki, Chris Callison-Burch

Distilling Text Style Transfer With Self-Explanation From LLMs

Text Style Transfer (TST) seeks to alter the style of text while retaining its core content. Given the constraints of limited parallel datasets for TST, we propose CoTeX, a framework that leverages large language models (LLMs) alongside…

Computation and Language · Computer Science 2024-05-07 Chiyu Zhang, Honglong Cai, Yuezhang Li, Yuexin Wu, Le Hou, Muhammad Abdul-Mageed

Towards Weakly-Supervised Text Spotting using a Multi-Task Transformer

Text spotting end-to-end methods have recently gained attention in the literature due to the benefits of jointly optimizing the text detection and recognition components. Existing methods usually have a distinct separation between the…

Computer Vision and Pattern Recognition · Computer Science 2022-02-15 Yair Kittenplon, Inbal Lavi, Sharon Fogel, Yarin Bar, R. Manmatha, Pietro Perona

A General Multi-Task Learning Framework to Leverage Text Data for Speech to Text Tasks

Attention-based sequence-to-sequence modeling provides a powerful and elegant solution for applications that need to map one sequence to a different sequence. Its success heavily relies on the availability of large amounts of training data.…

Computation and Language · Computer Science 2021-02-12 Yun Tang, Juan Pino, Changhan Wang, Xutai Ma, Dmitriy Genzel