Related papers: Entropy-UID: A Method for Optimizing Information D…

200 papers

Modern language models (LMs) increasingly require two critical resources: computational resources and data resources. Data selection techniques can effectively reduce the amount of training data required for fine-tuning LMs. However, their…

Computation and Language · Computer Science 2026-02-20 Hongming Li, Yang Liu, Chao Huang

Current language models decode text token by token according to probabilistic distribution, and determining the appropriate candidates for the next token is crucial to ensure generation quality. This study introduces adaptive decoding, a…

Computation and Language · Computer Science 2024-06-04 Wenhong Zhu, Hongkun Hao, Zhiwei He, Yiming Ai, Rui Wang

Large language models (LLMs) achieve remarkable generative performance, yet their output quality is dependent on the decoding strategy. While sampling-based methods (e.g., top-k, nucleus) and search-and-select based methods (e.g., beam…

Machine Learning · Computer Science 2026-05-12 Benjamin Patrick Evans, Sumitra Ganesh, Leo Ardon

The uniform information density (UID) hypothesis states that humans tend to distribute information roughly evenly across an utterance or discourse. Early evidence in support of the UID hypothesis came from Genzel & Charniak (2002), which…

Computation and Language · Computer Science 2023-10-19 Vivek Verma, Nicholas Tomlin, Dan Klein

Humans tend to follow the Uniform Information Density (UID) principle by distributing information evenly in utterances. We study if decoding algorithms implicitly follow this UID principle, and under what conditions adherence to UID might…

Computation and Language · Computer Science 2023-03-31 Saranya Venkatraman, He He, David Reitter

The Uniform Information Density (UID) hypothesis proposes that effective communication is achieved by maintaining a stable flow of information. In this work, we revisit this principle in the context of Large Language Model (LLM) reasoning,…

Artificial Intelligence · Computer Science 2026-04-20 Minju Gwak, Guijin Son, Jaehyung Kim

Information theoretic quantities play a central role in machine learning. The recent surge in the complexity of data and models has increased the demand for accurate estimation of these quantities. However, as the dimension grows the…

Machine Learning · Statistics 2024-05-21 Viktor Nilsson, Anirban Samaddar, Sandeep Madireddy, Pierre Nyquist

Large language models (LLMs) often solve problems using step-by-step Chain-of-Thought (CoT) reasoning, yet these intermediate steps are frequently unfaithful or hard to interpret. Inspired by the Uniform Information Density (UID) hypothesis…

Computation and Language · Computer Science 2025-10-21 Minju Gwak, Guijin Son, Jaehyung Kim

Token sampling strategies critically influence text generation quality in large language models (LLMs). However, existing methods introduce additional hyperparameters, requiring extensive tuning and complicating deployment. We present…

Computation and Language · Computer Science 2025-12-02 Xiaodong Cai, Hai Lin, Shaoxiong Zhan, Weiqi Luo, Hong-Gee Kim, Hongyan Hao, Yu Yang, Hai-Tao Zheng

The Uniform Information Density (UID) principle posits that humans prefer to spread information evenly during language production. We examine if this UID principle can help capture differences between Large Language Models (LLMs)-generated…

Computation and Language · Computer Science 2024-04-05 Saranya Venkatraman, Adaku Uchendu, Dongwon Lee

Decoding strategies play a central role in shaping the reasoning ability of large language models (LLMs). Traditional methods such as greedy decoding and beam search often suffer from error propagation, while sampling-based approaches…

We present Entropy Adaptive Decoding (EAD), a novel approach for efficient language model inference that dynamically switches between different-sized models based on prediction uncertainty. By monitoring rolling entropy in model logit…

Machine Learning · Computer Science 2025-02-12 Toby Simonds

Multimodal reward models are crucial for aligning multimodal large language models with human preferences. Recent works have incorporated reasoning capabilities into these models, achieving promising results. However, training these models…

Artificial Intelligence · Computer Science 2026-02-03 Shidong Yang, Tongwen Huang, Hao Wen, Yong Wang, Li Chen, Xiangxiang Chu

The uniform information density (UID) hypothesis posits a preference among language users for utterances structured such that information is distributed uniformly across a signal. While its implications on language production have been well…

Computation and Language · Computer Science 2021-09-27 Clara Meister, Tiago Pimentel, Patrick Haller, Lena Jäger, Ryan Cotterell, Roger Levy

Retrieval-augmented generation integrates the capabilities of large language models with relevant information retrieved from an extensive corpus, yet encounters challenges when confronted with real-world noisy data. One recent solution is…

Computation and Language · Computer Science 2025-09-30 Kun Zhu, Xiaocheng Feng, Xiyuan Du, Yuxuan Gu, Weijiang Yu, Haotian Wang, Qianglong Chen, Zheng Chu, Jingchang Chen, Bing Qin

Entropy estimation is of practical importance in information theory and statistical science. Many existing entropy estimators suffer from fast growing estimation bias with respect to dimensionality, rendering them unsuitable for…

Information Theory · Computer Science 2023-08-22 Ziqiao Ao, Jinglai Li

During a spontaneous change, a macroscopic physical system will evolve towards a macro-state with more realizations. This observation is at the basis of the Statistical Mechanical version of the Second Law of Thermodynamics, and it provides…

Statistical Mechanics · Physics 2020-04-22 Mengjie Zu, Arunkumar Bupathy, Daan Frenkel, Srikanth Sastry

Open-ended text generation faces a critical challenge: balancing coherence with diversity in LLM outputs. While contrastive search-based decoding strategies have emerged to address this trade-off, their practical utility is often limited by…

Diffusion models have garnered considerable interest in the field of text generation. Several studies have explored text diffusion models with different structures and applied them to various tasks, including named entity recognition and…

Computation and Language · Computer Science 2023-10-19 Renzhi Wang, Jing Li, Piji Li

This paper presents a novel class of information-theoretic strategies for solving the game of Mastermind, achieving state-of-the-art performance among known heuristic methods. The core contribution is the application of a weighted entropy…

Information Theory · Computer Science 2025-11-26 Serkan Gür