Related papers: Sparse Text Generation

Learning Sparse Prototypes for Text Generation

Prototype-driven text generation uses non-parametric models that first choose from a library of sentence "prototypes" and then modify the prototype to generate the output text. While effective, these methods are inefficient at test time as…

Computation and Language · Computer Science 2020-11-05 Junxian He, Taylor Berg-Kirkpatrick, Graham Neubig

SparseGAN: Sparse Generative Adversarial Network for Text Generation

It is still a challenging task to learn a neural text generation model under the framework of generative adversarial networks (GANs) since the entire training process is not differentiable. The existing training strategies either suffer…

Computation and Language · Computer Science 2023-07-25 Liping Yuan, Jiehang Zeng, Xiaoqing Zheng

Smoothing and Shrinking the Sparse Seq2Seq Search Space

Current sequence-to-sequence models are trained to minimize cross-entropy and use softmax to compute the locally normalized probabilities over target sequences. While this setup has led to strong results in a variety of tasks, one…

Computation and Language · Computer Science 2021-03-19 Ben Peters, André F. T. Martins

Locally Typical Sampling

Today's probabilistic language generators fall short when it comes to producing coherent and fluent text despite the fact that the underlying models perform well under standard metrics, e.g., perplexity. This discrepancy has puzzled the…

Computation and Language · Computer Science 2025-06-06 Clara Meister, Tiago Pimentel, Gian Wiher, Ryan Cotterell

DiffSampling: Enhancing Diversity and Accuracy in Neural Text Generation

Despite their growing capabilities, language models still frequently reproduce content from their training data, generate repetitive text, and favor common grammatical patterns and vocabulary. A possible cause is the decoding strategy: the…

Computation and Language · Computer Science 2026-01-15 Giorgio Franceschelli, Mirco Musolesi

Sparse Sequence-to-Sequence Models

Sequence-to-sequence models are a powerful workhorse of NLP. Most variants employ a softmax transformation in both their attention mechanism and output layer, leading to dense alignments and strictly positive output probabilities. This…

Computation and Language · Computer Science 2019-06-14 Ben Peters, Vlad Niculae, André F. T. Martins

Long-Context Generalization with Sparse Attention

Transformer-based architectures traditionally employ softmax to compute attention weights, which produces dense distributions over all tokens in a sequence. While effective in many settings, this density has been shown to be detrimental for…

Computation and Language · Computer Science 2026-03-03 Pavlo Vasylenko, Hugo Pitorro, André F. T. Martins, Marcos Treviso

SESCORE2: Learning Text Generation Evaluation via Synthesizing Realistic Mistakes

Is it possible to train a general metric for evaluating text generation quality without human annotated ratings? Existing learned metrics either perform unsatisfactorily across text generation tasks or require human ratings for training on…

Computation and Language · Computer Science 2023-07-10 Wenda Xu, Xian Qian, Mingxuan Wang, Lei Li, William Yang Wang

Modern Methods for Text Generation

Synthetic text generation is challenging and has limited success. Recently, a new architecture, called Transformers, allows machine learning models to better understand sequential data, such as translation or summarization. BERT and GPT-2,…

Computation and Language · Computer Science 2020-09-11 Dimas Munoz Montesinos

F^2-Softmax: Diversifying Neural Text Generation via Frequency Factorized Softmax

Despite recent advances in neural text generation, encoding the rich diversity in human language remains elusive. We argue that the sub-optimal text generation is mainly attributable to the imbalanced token distribution, which particularly…

Computation and Language · Computer Science 2020-10-06 Byung-Ju Choi, Jimin Hong, David Keetae Park, Sang Wan Lee

On the Efficacy of Sampling Adapters

Sampling is a common strategy for generating text from probabilistic models, yet standard ancestral sampling often results in text that is incoherent or ungrammatical. To alleviate this issue, various modifications to a model's sampling…

Computation and Language · Computer Science 2024-01-08 Clara Meister, Tiago Pimentel, Luca Malagutti, Ethan G. Wilcox, Ryan Cotterell

MaskGAN: Better Text Generation via Filling in the______

Neural text generation models are often autoregressive language models or seq2seq models. These models generate text by sampling words sequentially, with each word conditioned on the previous word, and are state-of-the-art for several…

Machine Learning · Statistics 2018-03-02 William Fedus, Ian Goodfellow, Andrew M. Dai

Correction of Errors in Preference Ratings from Automated Metrics for Text Generation

A major challenge in the field of Text Generation is evaluation: Human evaluations are cost-intensive, and automated metrics often display considerable disagreement with human judgments. In this paper, we propose a statistical model of Text…

Computation and Language · Computer Science 2023-06-07 Jan Deriu, Pius von Däniken, Don Tuggener, Mark Cieliebak

On the Blind Spots of Model-Based Evaluation Metrics for Text Generation

In this work, we explore a useful but often neglected methodology for robustness analysis of text generation evaluation metrics: stress tests with synthetic data. Basically, we design and synthesize a wide range of potential errors and…

Computation and Language · Computer Science 2023-05-22 Tianxing He, Jingyu Zhang, Tianle Wang, Sachin Kumar, Kyunghyun Cho, James Glass, Yulia Tsvetkov

Neural Text Generation: A Practical Guide

Deep learning methods have recently achieved great empirical success on machine translation, dialogue response generation, summarization, and other text generation tasks. At a high level, the technique has been to train end-to-end neural…

Computation and Language · Computer Science 2017-11-28 Ziang Xie

Paraphrasing with Large Language Models

Recently, large language models such as GPT-2 have shown themselves to be extremely adept at text generation and have also been able to achieve high-quality results in many downstream NLP tasks such as text classification, sentiment…

Computation and Language · Computer Science 2019-11-22 Sam Witteveen, Martin Andrews

Jointly Measuring Diversity and Quality in Text Generation Models

Text generation is an important Natural Language Processing task with various applications. Although several metrics have already been introduced to evaluate the text generation methods, each of them has its own shortcomings. The most…

Machine Learning · Computer Science 2019-05-22 Ehsan Montahaei, Danial Alihosseini, Mahdieh Soleymani Baghshah

Speeding Up Entmax

Softmax is the de facto standard in modern neural networks for language processing when it comes to normalizing logits. However, by producing a dense probability distribution, each token in the vocabulary has a nonzero chance of being…

Computation and Language · Computer Science 2022-05-20 Maxat Tezekbayev, Vassilina Nikoulina, Matthias Gallé, Zhenisbek Assylbekov

Language Model Evaluation in Open-ended Text Generation

Although current state-of-the-art language models have achieved impressive results in numerous natural language processing tasks, they still have not solved the problem of producing repetitive, dull and sometimes inconsistent text in…

Computation and Language · Computer Science 2021-08-10 An Nguyen

Improving Language Generation with Sentence Coherence Objective

Conditional story generation and contextual text continuation have become increasingly popular topics in the NLP community. Existing models are often prone to output paragraphs of text that gradually diverge from the given prompt. Although the…

Computation and Language · Computer Science 2020-09-15 Ruixiao Sun, Jie Yang, Mehrdad Yousefzadeh