Related papers: Entropy-UID: A Method for Optimizing Information D…

Entropy-Based Data Selection for Language Models

Modern language models (LMs) increasingly require two critical resources: computational resources and data resources. Data selection techniques can effectively reduce the amount of training data required for fine-tuning LMs. However, their…

Computation and Language · Computer Science 2026-02-20 Hongming Li, Yang Liu, Chao Huang

Improving Open-Ended Text Generation via Adaptive Decoding

Current language models decode text token by token according to probabilistic distribution, and determining the appropriate candidates for the next token is crucial to ensure generation quality. This study introduces adaptive decoding, a…

Computation and Language · Computer Science 2024-06-04 Wenhong Zhu, Hongkun Hao, Zhiwei He, Yiming Ai, Rui Wang

Entropy-informed Decoding: Adaptive Information-Driven Branching

Large language models (LLMs) achieve remarkable generative performance, yet their output quality is dependent on the decoding strategy. While sampling-based methods (e.g., top-k, nucleus) and search-and-select based methods (e.g., beam…

Machine Learning · Computer Science 2026-05-12 Benjamin Patrick Evans, Sumitra Ganesh, Leo Ardon

Revisiting Entropy Rate Constancy in Text

The uniform information density (UID) hypothesis states that humans tend to distribute information roughly evenly across an utterance or discourse. Early evidence in support of the UID hypothesis came from Genzel & Charniak (2002), which…

Computation and Language · Computer Science 2023-10-19 Vivek Verma, Nicholas Tomlin, Dan Klein

How do decoding algorithms distribute information in dialogue responses?

Humans tend to follow the Uniform Information Density (UID) principle by distributing information evenly in utterances. We study if decoding algorithms implicitly follow this UID principle, and under what conditions adherence to UID might…

Computation and Language · Computer Science 2023-03-31 Saranya Venkatraman, He He, David Reitter

Revisiting the Uniform Information Density Hypothesis in LLM Reasoning

The Uniform Information Density (UID) hypothesis proposes that effective communication is achieved by maintaining a stable flow of information. In this work, we revisit this principle in the context of Large Language Model (LLM) reasoning,…

Artificial Intelligence · Computer Science 2026-04-20 Minju Gwak, Guijin Son, Jaehyung Kim

REMEDI: Corrective Transformations for Improved Neural Entropy Estimation

Information theoretic quantities play a central role in machine learning. The recent surge in the complexity of data and models has increased the demand for accurate estimation of these quantities. However, as the dimension grows the…

Machine Learning · Statistics 2024-05-21 Viktor Nilsson, Anirban Samaddar, Sandeep Madireddy, Pierre Nyquist

Revisiting the UID Hypothesis in LLM Reasoning Traces

Large language models (LLMs) often solve problems using step-by-step Chain-of-Thought (CoT) reasoning, yet these intermediate steps are frequently unfaithful or hard to interpret. Inspired by the Uniform Information Density (UID) hypothesis…

Computation and Language · Computer Science 2025-10-21 Minju Gwak, Guijin Son, Jaehyung Kim

Auxiliary-Hyperparameter-Free Sampling: Entropy Equilibrium for Text Generation

Token sampling strategies critically influence text generation quality in large language models (LLMs). However, existing methods introduce additional hyperparameters, requiring extensive tuning and complicating deployment. We present…

Computation and Language · Computer Science 2025-12-02 Xiaodong Cai, Hai Lin, Shaoxiong Zhan, Weiqi Luo, Hong-Gee Kim, Hongyan Hao, Yu Yang, Hai-Tao Zheng

GPT-who: An Information Density-based Machine-Generated Text Detector

The Uniform Information Density (UID) principle posits that humans prefer to spread information evenly during language production. We examine if this UID principle can help capture differences between Large Language Models (LLMs)-generated…

Computation and Language · Computer Science 2024-04-05 Saranya Venkatraman, Adaku Uchendu, Dongwon Lee

Think Twice Before You Write -- an Entropy-based Decoding Strategy to Enhance LLM Reasoning

Decoding strategies play a central role in shaping the reasoning ability of large language models (LLMs). Traditional methods such as greedy decoding and beam search often suffer from error propagation, while sampling-based approaches…

Computation and Language · Computer Science 2026-04-02 Jiashu He, Meizhu Liu, Olaitan P Olaleye, Amit Agarwal, M. Avendi, Yassi Abbasi, Matthew Rowe, Hitesh Laxmichand Patel, Paul Li, Tao Sheng, Sujith Ravi, Dan Roth

Entropy Adaptive Decoding: Dynamic Model Switching for Efficient Inference

We present Entropy Adaptive Decoding (EAD), a novel approach for efficient language model inference that dynamically switches between different-sized models based on prediction uncertainty. By monitoring rolling entropy in model logit…

Machine Learning · Computer Science 2025-02-12 Toby Simonds

Entropy-Guided Data-Efficient Training for Multimodal Reasoning Reward Models

Multimodal reward models are crucial for aligning multimodal large language models with human preferences. Recent works have incorporated reasoning capabilities into these models, achieving promising results. However, training these models…

Artificial Intelligence · Computer Science 2026-02-03 Shidong Yang, Tongwen Huang, Hao Wen, Yong Wang, Li Chen, Xiangxiang Chu

Revisiting the Uniform Information Density Hypothesis

The uniform information density (UID) hypothesis posits a preference among language users for utterances structured such that information is distributed uniformly across a signal. While its implications on language production have been well…

Computation and Language · Computer Science 2021-09-27 Clara Meister, Tiago Pimentel, Patrick Haller, Lena Jäger, Ryan Cotterell, Roger Levy

An Information Bottleneck Perspective for Effective Noise Filtering on Retrieval-Augmented Generation

Retrieval-augmented generation integrates the capabilities of large language models with relevant information retrieved from an extensive corpus, yet encounters challenges when confronted with real-world noisy data. One recent solution is…

Computation and Language · Computer Science 2025-09-30 Kun Zhu, Xiaocheng Feng, Xiyuan Du, Yuxuan Gu, Weijiang Yu, Haotian Wang, Qianglong Chen, Zheng Chu, Jingchang Chen, Bing Qin

Entropy Estimation via Uniformization

Entropy estimation is of practical importance in information theory and statistical science. Many existing entropy estimators suffer from fast growing estimation bias with respect to dimensionality, rendering them unsuitable for…

Information Theory · Computer Science 2023-08-22 Ziqiao Ao, Jinglai Li

Information density, structure and entropy in equilibrium and non-equilibrium systems

During a spontaneous change, a macroscopic physical system will evolve towards a macro-state with more realizations. This observation is at the basis of the Statistical Mechanical version of the Second Law of Thermodynamics, and it provides…

Statistical Mechanics · Physics 2020-04-22 Mengjie Zu, Arunkumar Bupathy, Daan Frenkel, Srikanth Sastry

GUARD: Glocal Uncertainty-Aware Robust Decoding for Effective and Efficient Open-Ended Text Generation

Open-ended text generation faces a critical challenge: balancing coherence with diversity in LLM outputs. While contrastive search-based decoding strategies have emerged to address this trade-off, their practical utility is often limited by…

Computation and Language · Computer Science 2025-09-04 Yuanhao Ding, Esteban Garces Arias, Meimingwei Li, Julian Rodemann, Matthias Aßenmacher, Danlu Chen, Gaojuan Fan, Christian Heumann, Chongsheng Zhang

InfoDiffusion: Information Entropy Aware Diffusion Process for Non-Autoregressive Text Generation

Diffusion models have garnered considerable interest in the field of text generation. Several studies have explored text diffusion models with different structures and applied them to various tasks, including named entity recognition and…

Computation and Language · Computer Science 2023-10-19 Renzhi Wang, Jing Li, Piji Li

The Quality of Information: A Weighted Entropy Approach to Near-Optimal Mastermind

This paper presents a novel class of information-theoretic strategies for solving the game of Mastermind, achieving state-of-the-art performance among known heuristic methods. The core contribution is the application of a weighted entropy…

Information Theory · Computer Science 2025-11-26 Serkan Gür