Related papers: Entropy-Guided Reasoning Compression

200 papers

Large Reasoning Models (LRMs) often suffer from overthinking, generating unnecessarily long reasoning chains even for simple tasks. This leads to substantial computational overhead with limited performance gain, primarily due to redundant…

Artificial Intelligence · Computer Science 2026-01-13 Ruichu Cai, Haopeng Du, Qingwen Lin, Yutong Chen, Zijian Li, Boyan Xu

Large Language Models (LLMs) using Chain-of-Thought (CoT) prompting excel at complex reasoning but generate verbose thought processes with considerable redundancy, leading to increased inference costs and reduced efficiency. We introduce a…

Artificial Intelligence · Computer Science 2026-02-17 Zeju Li, Jianyuan Zhong, Ziyang Zheng, Xiangyu Wen, Zhijian Xu, Yingying Cheng, Fan Zhang, Qiang Xu

Chain-of-thought prompting has emerged as a powerful technique for enabling large language models (LLMs) to solve complex reasoning tasks. However, these reasoning chains can be verbose, raising concerns about efficiency. In response,…

Computation and Language · Computer Science 2025-04-02 Ayeong Lee, Ethan Che, Tianyi Peng

Large Reasoning Models (LRMs) excel at complex reasoning tasks through extended chain-of-thought generation, but their reliance on lengthy intermediate steps incurs substantial computational cost. We find that the entropy of the model's…

Artificial Intelligence · Computer Science 2026-02-02 Hongxi Yan, Qingjie Liu, Yunhong Wang

Reasoning in Large Language Models incurs significant inference-time compute, yet the token-level information structure of reasoning traces remains underexplored. We observe that reasoning tokens split into two functional types: low-entropy…

Computation and Language · Computer Science 2026-05-06 Zhenyu Zhao, Sander Land, Daniel M. Bikel, Waseem Alshikh

Frontier reasoning models are produced by posttraining base language models with reinforcement learning. Recent work has challenged this by showing that sampling from a sharpened version of the base model's distribution, a so-called power…

Machine Learning · Computer Science 2026-05-29 Felix Zhou, Anay Mehrotra, Quanquan C. Liu

Multimodal reward models are crucial for aligning multimodal large language models with human preferences. Recent works have incorporated reasoning capabilities into these models, achieving promising results. However, training these models…

Artificial Intelligence · Computer Science 2026-02-03 Shidong Yang, Tongwen Huang, Hao Wen, Yong Wang, Li Chen, Xiangxiang Chu

Language prediction is constrained by informational entropy intrinsic to language, such that there exists a limit to how accurate any language model can become and equivalently a lower bound to language compression. The most efficient…

Computation and Language · Computer Science 2025-11-14 Benjamin L. Badger, Matthew Neligeorge

Large Reasoning Models (LRMs) perform strongly in complex reasoning tasks via Chain-of-Thought (CoT) prompting, but often suffer from verbose outputs, increasing computational overhead. Existing fine-tuning-based compression methods either…

Machine Learning · Computer Science 2025-09-22 Ziqing Qiao, Yongheng Deng, Jiali Zeng, Dong Wang, Lai Wei, Guanbo Wang, Fandong Meng, Jie Zhou, Ju Ren, Yaoxue Zhang

This paper aims to overcome a major obstacle in scaling RL for reasoning with LLMs, namely the collapse of policy entropy. Such phenomenon is consistently observed across vast RL runs without entropy intervention, where the policy entropy…

Reasoning models often outperform smaller models but at 3–5× higher cost and added latency. We present entropy-guided refinement: a lightweight, test-time loop that uses token-level uncertainty to trigger a single, targeted…

Artificial Intelligence · Computer Science 2025-09-03 Andrew G. A. Correa, Ana C. H de Matos

Chain-of-Thought (CoT) reasoning successfully enhances the reasoning capabilities of Large Language Models (LLMs), yet it incurs substantial computational overhead for inference. Existing CoT compression methods often suffer from a critical…

Machine Learning · Computer Science 2026-05-26 Yuntian Tang, Bohan Jia, Wenxuan Huang, Lianyue Zhang, Jiao Xie, Wenxi Li, Wei Li, Jie Hu, Xinghao Chen, Rongrong Ji, Shaohui Lin

Large reasoning models (LRMs) achieve strong performance via extended chain-of-thought (CoT) reasoning, yet suffer from excessive token consumption and high inference latency. Existing reinforcement learning (RL) approaches for CoT…

Machine Learning · Computer Science 2026-05-19 Tingcheng Bian, Yuzhe Zhang, Jing Jin, Jinchang Luo, MingQuan Cheng, Haiwei Wang, Wenyuan Jiang, Miaohui Wang

Large Reasoning Models (LRMs) have demonstrated impressive capabilities but suffer from cognitive inefficiencies like "overthinking" simple problems and "underthinking" complex ones. While existing methods that use supervised fine-tuning…

Artificial Intelligence · Computer Science 2026-03-24 Tian Liang, Wenxiang Jiao, Zhiwei He, Jiahao Xu, Haitao Mi, Dong Yu

Making decisions freely presupposes that there is some indeterminacy in the environment and in the decision making engine. The former is reflected on the behavioral changes due to communicating: few changes indicate rigid environments;…

Artificial Intelligence · Computer Science 2020-09-23 Luis A. Pineda

Large reasoning models (LRMs) typically solve reasoning-intensive tasks by generating long chain-of-thought (CoT) traces, leading to substantial inference overhead. We identify a reproducible inference-time phenomenon, termed…

Computation and Language · Computer Science 2026-02-03 Jie Deng, Shining Liang, Jun Li, Hongzhi Li, Yutao Xie

Recent advancements in large language models (LLMs) often rely on generating intermediate reasoning steps to enhance accuracy. However, little work has examined how reasoning utility contributes to the final answer's correctness. Due to the…

Computation and Language · Computer Science 2025-08-29 Xu Guo

We introduce a simple, yet novel entropy-based framework to drive token efficiency in large language models during reasoning tasks. Our approach uses Shannon entropy from token-level logprobs as a confidence signal to enable early stopping,…

Machine Learning · Computer Science 2025-10-29 Aman Sharma, Paras Chopra

Large Reasoning Models (LRMs) have achieved impressive performance on complex reasoning tasks by generating detailed chain-of-thought (CoT) explanations. However, these responses are often excessively long, containing redundant reasoning…

Artificial Intelligence · Computer Science 2025-10-13 Chen Huang, Wei Lu, Wenxuan Zhang

Decoding strategies play a central role in shaping the reasoning ability of large language models (LLMs). Traditional methods such as greedy decoding and beam search often suffer from error propagation, while sampling-based approaches…
