Related papers: Mirostat: A Neural Text Decoding Algorithm that Di…

Truncation Sampling as Language Model Desmoothing

Long samples of text from neural language models can be of poor quality. Truncation sampling algorithms -- like top-$p$ or top-$k$ -- address this by setting some words' probabilities to zero at each step. This work provides framing for the…

Computation and Language · Computer Science 2022-10-28 John Hewitt, Christopher D. Manning, Percy Liang

Top-H Decoding: Adapting the Creativity and Coherence with Bounded Entropy in Text Generation

Large language models (LLMs), despite their impressive performance across a wide range of tasks, often struggle to balance two competing objectives in open-ended text generation: fostering diversity and creativity while preserving logical…

Computation and Language · Computer Science 2026-05-12 Erfan Baghaei Potraghloo, Seyedarmin Azizi, Souvik Kundu, Massoud Pedram

DiffSampling: Enhancing Diversity and Accuracy in Neural Text Generation

Despite their growing capabilities, language models still frequently reproduce content from their training data, generate repetitive text, and favor common grammatical patterns and vocabulary. A possible cause is the decoding strategy: the…

Computation and Language · Computer Science 2026-01-15 Giorgio Franceschelli, Mirco Musolesi

Min-$k$ Sampling: Decoupling Truncation from Temperature Scaling via Relative Logit Dynamics

The quality of text generated by large language models depends critically on the decoding sampling strategy. While mainstream methods such as Top-$k$, Top-$p$, and Min-$p$ achieve a balance between diversity and accuracy through…

Artificial Intelligence · Computer Science 2026-04-14 Yuanhao Ding, Meimingwei Li, Esteban Garces Arias, Matthias Aßenmacher, Christian Heumann, Chongsheng Zhang

Turning Up the Heat: Min-p Sampling for Creative and Coherent LLM Outputs

Large Language Models (LLMs) generate text by sampling the next token from a probability distribution over the vocabulary at each decoding step. Popular sampling methods like top-p (nucleus sampling) often struggle to balance quality and…

Computation and Language · Computer Science 2025-11-21 Minh Nhat Nguyen, Andrew Baker, Clement Neo, Allen Roush, Andreas Kirsch, Ravid Shwartz-Ziv

Conformal Nucleus Sampling

Language models generate text based on successively sampling the next word. A decoding procedure based on nucleus (top-$p$) sampling chooses from the smallest possible set of words whose cumulative probability exceeds the probability $p$.…

Computation and Language · Computer Science 2023-05-05 Shauli Ravfogel, Yoav Goldberg, Jacob Goldberger

Entropy-Aligned Decoding of LMs for Better Writing and Reasoning

Language models (LMs) are trained on billions of tokens in an attempt to recover the true language distribution. Still, vanilla random sampling from LMs yields low quality generations. Decoding algorithms attempt to restrict the LM…

Machine Learning · Computer Science 2026-01-06 Kareem Ahmed, Sameer Singh

Language Model Decoding as Direct Metrics Optimization

Despite the remarkable advances in language modeling, current mainstream decoding methods still struggle to generate texts that align with human texts across different aspects. In particular, sampling-based methods produce less-repetitive…

Computation and Language · Computer Science 2024-06-06 Haozhe Ji, Pei Ke, Hongning Wang, Minlie Huang

The Curious Case of Neural Text Degeneration

Despite considerable advancements with deep neural language models, the enigma of neural text degeneration persists when these models are tested as text generators. The counter-intuitive empirical observation is that even though the use of…

Computation and Language · Computer Science 2020-02-18 Ari Holtzman, Jan Buys, Li Du, Maxwell Forbes, Yejin Choi

Top-b: Entropic Regulation of Relative Probability Bands in Autoregressive Language Processes

Probabilistic language generators are theoretically modeled as discrete stochastic processes, yet standard decoding strategies (Top-k, Top-p) impose static truncation rules that fail to accommodate the dynamic information density of natural…

Computation and Language · Computer Science 2026-03-17 Deepon Halder, Raj Dabre

Overfitting Mechanism and Avoidance in Deep Neural Networks

Assisted by the availability of data and high performance computing, deep learning techniques have achieved breakthroughs and surpassed human performance empirically in difficult tasks, including object recognition, speech recognition, and…

Machine Learning · Computer Science 2019-01-23 Shaeke Salman, Xiuwen Liu

A Continuum of Generation Tasks for Investigating Length Bias and Degenerate Repetition

Language models suffer from various degenerate behaviors. These differ between tasks: machine translation (MT) exhibits length bias, while tasks like story generation exhibit excessive repetition. Recent work has attributed the difference…

Computation and Language · Computer Science 2022-10-21 Darcey Riley, David Chiang

REAL Sampling: Boosting Factuality and Diversity of Open-Ended Generation via Asymptotic Entropy

Decoding methods for large language models (LLMs) usually struggle with the tradeoff between ensuring factuality and maintaining diversity. For example, a higher p threshold in the nucleus (top-p) sampling increases the diversity but…

Computation and Language · Computer Science 2024-06-13 Haw-Shiuan Chang, Nanyun Peng, Mohit Bansal, Anil Ramakrishna, Tagyoung Chung

A Theoretical Analysis of the Repetition Problem in Text Generation

Text generation tasks, including translation, summarization, language models, and etc. see rapid growth during recent years. Despite the remarkable achievements, the repetition problem has been observed in nearly all text generation models…

Computation and Language · Computer Science 2021-03-23 Zihao Fu, Wai Lam, Anthony Man-Cho So, Bei Shi

On the Entropy Calibration of Language Models

We study the problem of entropy calibration, which asks whether a language model's entropy over generations matches its log loss on human text. Past work found that models are miscalibrated, with entropy per step increasing as generations…

Computation and Language · Computer Science 2026-01-14 Steven Cao, Gregory Valiant, Percy Liang

Trading Off Diversity and Quality in Natural Language Generation

For open-ended language generation tasks such as storytelling and dialogue, choosing the right decoding algorithm is critical to controlling the tradeoff between generation quality and diversity. However, there presently exists no consensus…

Computation and Language · Computer Science 2020-04-23 Hugh Zhang, Daniel Duckworth, Daphne Ippolito, Arvind Neelakantan

Evade the Trap of Mediocrity: Promoting Diversity and Novelty in Text Generation via Concentrating Attention

Recently, powerful Transformer architectures have proven superior in generating high-quality sentences. Nevertheless, these models tend to produce dull high-frequency phrases, severely hurting the diversity and novelty of generated text. In…

Computation and Language · Computer Science 2022-11-15 Wenhao Li, Xiaoyuan Yi, Jinyi Hu, Maosong Sun, Xing Xie

Improving Diversity of Neural Text Generation via Inverse Probability Weighting

The neural text generation suffers from the text degeneration issue such as repetition. Traditional stochastic sampling methods only focus on truncating the unreliable "tail" of the distribution, and do not address the "head" part, which we…

Computation and Language · Computer Science 2021-08-27 Xinran Zhang, Maosong Sun, Jiafeng Liu, Xiaobing Li

Local Normalization Distortion and the Thermodynamic Formalism of Decoding Strategies for Large Language Models

Advances in hardware and language model architecture have spurred a revolution in natural language generation. However, autoregressive models compute probability distributions over next-token choices, and sampling from these distributions,…

Computation and Language · Computer Science 2025-09-10 Tom Kempton, Stuart Burrell

p-less Sampling: A Robust Hyperparameter-Free Approach for LLM Decoding

Obtaining high-quality outputs from Large Language Models (LLMs) often depends upon the choice of a sampling-based decoding strategy to probabilistically choose the next token at each generation step. While a variety of such sampling…

Artificial Intelligence · Computer Science 2026-03-02 Runyan Tan, Shuang Wu, Phillip Howard