Related papers: On the Discrepancy between Density Estimation and …

Sequence Level Training with Recurrent Neural Networks

Many natural language processing applications use language models to generate text. These models are typically trained to predict the next word in a sequence, given the previous words and some context such as an image. However, at test time…

Machine Learning · Computer Science 2016-05-10 Marc'Aurelio Ranzato, Sumit Chopra, Michael Auli, Wojciech Zaremba

Predicting Through Generation: Why Generation Is Better for Prediction

This paper argues that generating output tokens is more effective than using pooled representations for prediction tasks because token-level generation retains more mutual information. Since LLMs are trained on massive text corpora using…

Computation and Language · Computer Science 2025-05-28 Md Kowsher, Nusrat Jahan Prottasha, Prakash Bhat, Chun-Nam Yu, Mojtaba Soltanalian, Ivan Garibay, Ozlem Garibay, Chen Chen, Niloofar Yousefi

Improving Maximum Likelihood Training for Text Generation with Density Ratio Estimation

Auto-regressive sequence generative models trained by Maximum Likelihood Estimation suffer the exposure bias problem in practical finite sample scenarios. The crux is that the number of training samples for Maximum Likelihood Estimation is…

Machine Learning · Statistics 2020-07-14 Yuxuan Song, Ning Miao, Hao Zhou, Lantao Yu, Mingxuan Wang, Lei Li

Sequence Modeling with Unconstrained Generation Order

The dominant approach to sequence generation is to produce a sequence in some predefined order, e.g. left to right. In contrast, we propose a more general model that can generate the output sequence by inserting tokens in any arbitrary…

Computation and Language · Computer Science 2019-11-04 Dmitrii Emelianenko, Elena Voita, Pavel Serdyukov

Approximate Distribution Matching for Sequence-to-Sequence Learning

Sequence-to-Sequence models were introduced to tackle many real-life problems like machine translation, summarization, image captioning, etc. The standard optimization algorithms are mainly based on example-to-example matching like maximum…

Computation and Language · Computer Science 2018-09-05 Wenhu Chen, Guanlin Li, Shujie Liu, Zhirui Zhang, Mu Li, Ming Zhou

LED: Latent Variable-based Estimation of Density

Modern generative models are roughly divided into two main categories: (1) models that can produce high-quality random samples, but cannot estimate the exact density of new data points and (2) those that provide exact density estimation, at…

Machine Learning · Computer Science 2022-06-24 Omri Ben-Dov, Pravir Singh Gupta, Victoria Fernandez Abrevaya, Michael J. Black, Partha Ghosh

Jointly Measuring Diversity and Quality in Text Generation Models

Text generation is an important Natural Language Processing task with various applications. Although several metrics have already been introduced to evaluate the text generation methods, each of them has its own shortcomings. The most…

Machine Learning · Computer Science 2019-05-22 Ehsan Montahaei, Danial Alihosseini, Mahdieh Soleymani Baghshah

Sensitivity as a Complexity Measure for Sequence Classification Tasks

We introduce a theoretical framework for understanding and predicting the complexity of sequence classification tasks, using a novel extension of the theory of Boolean function sensitivity. The sensitivity of a function, given a…

Computation and Language · Computer Science 2021-04-22 Michael Hahn, Dan Jurafsky, Richard Futrell

Sequence-to-Sequence Models for Data-to-Text Natural Language Generation: Word- vs. Character-based Processing and Output Diversity

We present a comparison of word-based and character-based sequence-to-sequence models for data-to-text natural language generation, which generate natural language descriptions for structured inputs. On the datasets of two recent generation…

Computation and Language · Computer Science 2018-10-12 Glorianna Jagfeld, Sabrina Jenne, Ngoc Thang Vu

Sequence-to-Sequence Learning with Latent Neural Grammars

Sequence-to-sequence learning with neural networks has become the de facto standard for sequence prediction tasks. This approach typically models the local distribution over the next word with a powerful neural network that can condition on…

Computation and Language · Computer Science 2021-11-17 Yoon Kim

BLEU Neighbors: A Reference-less Approach to Automatic Evaluation

Evaluation is a bottleneck in the development of natural language generation (NLG) models. Automatic metrics such as BLEU rely on references, but for tasks such as open-ended generation, there are no references to draw upon. Although…

Computation and Language · Computer Science 2020-10-14 Kawin Ethayarajh, Dorsa Sadigh

An online sequence-to-sequence model for noisy speech recognition

Generative models have long been the dominant approach for speech recognition. The success of these models however relies on the use of sophisticated recipes and complicated machinery that is not easily accessible to non-practitioners.…

Computation and Language · Computer Science 2017-06-21 Chung-Cheng Chiu, Dieterich Lawson, Yuping Luo, George Tucker, Kevin Swersky, Ilya Sutskever, Navdeep Jaitly

Contextualized Sequence Likelihood: Enhanced Confidence Scores for Natural Language Generation

The advent of large language models (LLMs) has dramatically advanced the state-of-the-art in numerous natural language generation tasks. For LLMs to be applied reliably, it is essential to have an accurate measure of their confidence.…

Computation and Language · Computer Science 2024-06-05 Zhen Lin, Shubhendu Trivedi, Jimeng Sun

Local Explanation of Dialogue Response Generation

In comparison to the interpretation of classification models, the explanation of sequence generation models is also an important problem, however it has seen little attention. In this work, we study model-agnostic explanations of a…

Computation and Language · Computer Science 2022-02-08 Yi-Lin Tuan, Connor Pryor, Wenhu Chen, Lise Getoor, William Yang Wang

Comparison Study Between Token Classification and Sequence Classification In Text Classification

Unsupervised Machine Learning techniques have been applied to Natural Language Processing tasks and surpass benchmarks such as GLUE with great success. Building language models approach achieves good results in one language and it can…

Computation and Language · Computer Science 2022-11-28 Amir Jafari

Constructing a Natural Language Inference Dataset using Generative Neural Networks

Natural Language Inference is an important task for Natural Language Understanding. It is concerned with classifying the logical relation between two sentences. In this paper, we propose several text generative neural networks for…

Artificial Intelligence · Computer Science 2017-03-28 Janez Starc, Dunja Mladenić

Modeling Confidence in Sequence-to-Sequence Models

Recently, significant improvements have been achieved in various natural language processing tasks using neural sequence-to-sequence models. While aiming for the best generation quality is important, ultimately it is also necessary to…

Computation and Language · Computer Science 2019-10-07 Jan Niehues, Ngoc-Quan Pham

Evaluation Metrics of Language Generation Models for Synthetic Traffic Generation Tasks

Many Natural Language Generation (NLG) tasks aim to generate a single output text given an input prompt. Other settings require the generation of multiple texts, e.g., for Synthetic Traffic Generation (STG). This generation task is crucial…

Computation and Language · Computer Science 2023-11-22 Simone Filice, Jason Ingyu Choi, Giuseppe Castellucci, Eugene Agichtein, Oleg Rokhlenko

Bounding the Test Log-Likelihood of Generative Models

Several interesting generative learning algorithms involve a complex probability distribution over many random variables, involving intractable normalization constants or latent variable normalization. Some of them may even not have an…

Machine Learning · Computer Science 2014-05-13 Yoshua Bengio, Li Yao, Kyunghyun Cho

Large-scale cloze evaluation reveals that token prediction tasks are neither lexically nor semantically aligned

In this work we compare the generative behavior at the next token prediction level in several language models by comparing them to human productions in the cloze task. We find that while large models trained for longer are typically better…

Computation and Language · Computer Science 2024-10-29 Cassandra L. Jacobs, Loïc Grobol, Alvin Tsang