Related papers: Entropy-Tree: Tree-Based Decoding with Entropy-Gui…

200 papers

Large language models (LLMs) achieve remarkable generative performance, yet their output quality is dependent on the decoding strategy. While sampling-based methods (e.g., top-k, nucleus) and search-and-select based methods (e.g., beam…

Machine Learning · Computer Science 2026-05-12 Benjamin Patrick Evans , Sumitra Ganesh , Leo Ardon

Decoding strategies play a central role in shaping the reasoning ability of large language models (LLMs). Traditional methods such as greedy decoding and beam search often suffer from error propagation, while sampling-based approaches…

Multi-step processes via large language models (LLMs) have proven effective for solving complex reasoning tasks. However, the depth of exploration of the reasoning procedure can significantly affect the task performance. Existing methods to…

Artificial Intelligence · Computer Science 2025-06-19 Jinghan Zhang , Xiting Wang , Fengran Mo , Yeyang Zhou , Wanfu Gao , Kunpeng Liu

Neural networks with tree-based sentence encoders have shown better results on many downstream tasks. Most of existing tree-based encoders adopt syntactic parsing trees as the explicit structure prior. To study the effectiveness of…

Computation and Language · Computer Science 2018-08-30 Haoyue Shi , Hao Zhou , Jiaze Chen , Lei Li

Using Machine Learning systems in the real world can often be problematic, with inexplicable black-box models, the assumed certainty of imperfect measurements, or providing a single classification instead of a probability distribution. This…

Machine Learning · Computer Science 2023-07-11 Jonathan S. Kent , David H. Menager

Chain-of-Thought (CoT) prompting has significantly enhanced the mathematical reasoning capabilities of Large Language Models. We find existing fine-tuning datasets frequently suffer from the "answer right but reasoning wrong" problem, where…

Artificial Intelligence · Computer Science 2026-01-13 Zihang Li , Yuhang Wang , Yikun Zong , Wenhan Yu , Xiaokun Yuan , Runhan Jiang , Zirui Liu , Tong Yang , Arthur Jiang

Chain-of-thought (CoT) reasoning improves large language model performance on complex tasks, but often produces excessively long and inefficient reasoning traces. Existing methods shorten CoTs using length penalties or global entropy…

Artificial Intelligence · Computer Science 2026-04-08 Xuan Xiong , Huan Liu , Li Gu , Zhixiang Chi , Yue Qiu , Yuanhao Yu , Yang Wang

Measuring the complexity of tree structures can be beneficial in areas that use tree data structures for storage, communication, and processing purposes. This complexity can then be used to compress tree data structures to their…

Information Theory · Computer Science 2023-09-19 Amirmohammad Farzaneh , Mihai-Alin Badiu , Justin P. Coon

Autoregressive language models demonstrate excellent performance in various scenarios. However, the inference efficiency is limited by its one-step-one-word generation mode, which has become a pressing problem recently as the models become…

Computation and Language · Computer Science 2025-04-25 Jikai Wang , Yi Su , Juntao Li , Qingrong Xia , Zi Ye , Xinyu Duan , Zhefeng Wang , Min Zhang

Augmenting Large Language Models (LLMs) with retrieved external knowledge has proven effective for improving the factual accuracy of generated responses. Despite their success, retrieval-augmented LLMs still face the distractibility issue,…

Computation and Language · Computer Science 2025-02-18 Zexuan Qiu , Zijing Ou , Bin Wu , Jingjing Li , Aiwei Liu , Irwin King

With recent advancements in large language models, methods like chain-of-thought prompting to elicit reasoning chains have been shown to improve results on reasoning tasks. However, tasks that require multiple steps of reasoning still pose…

Computation and Language · Computer Science 2023-12-13 Olga Golovneva , Sean O'Brien , Ramakanth Pasunuru , Tianlu Wang , Luke Zettlemoyer , Maryam Fazel-Zarandi , Asli Celikyilmaz

Recent advances in large language models (LLMs) have shown that test-time scaling can substantially improve model performance on complex tasks, particularly in the coding domain. Under this paradigm, models use a larger token budget during…

Artificial Intelligence · Computer Science 2026-04-21 Jiaxin Fang , Runyuan He , Sahil Bhatia , Neel Gajare , Alvin Cheung

Since excitable media with a tree structure have been shown to perform better than other network topologies, it is natural to consider neural networks defined on Cayley trees. The investigation of a symbolic space called tree-shift…

Dynamical Systems · Mathematics 2018-02-28 Jung-Chao Ban , Chih-Hung Chang , Nai-Zhu Huang

Balancing exploration and exploitation is a central goal in reinforcement learning (RL). Despite recent advances in enhancing large language model (LLM) reasoning, most methods lean toward exploitation, and increasingly encounter…

Computation and Language · Computer Science 2025-11-11 Daixuan Cheng , Shaohan Huang , Xuekai Zhu , Bo Dai , Wayne Xin Zhao , Zhenliang Zhang , Furu Wei

Neural networks have dramatically increased our capacity to learn from large, high-dimensional datasets across innumerable disciplines. However, their decisions are not easily interpretable, their computational costs are high, and building…

Computer Vision and Pattern Recognition · Computer Science 2024-07-08 Mackenzie J. Meni , Ryan T. White , Michael Mayo , Kevin Pilkiewicz

Rooted trees with probabilities are used to analyze properties of a variable length code. A bound is derived on the difference between the entropy rates of the code and a memoryless source. The bound is in terms of normalized informational…

Information Theory · Computer Science 2013-10-11 Georg Böcherer , Rana Ali Amjad

The definition of $k^{th}$-order empirical entropy of strings is extended to node labelled binary trees. A suitable binary encoding of tree straight-line programs (that have been used for grammar-based tree compression before) is shown to…

Data Structures and Algorithms · Computer Science 2020-05-21 Danny Hucke , Markus Lohrey , Louisa Seelbach Benkner

Language models generate reasoning sequentially, preventing them from decoupling irrelevant exploration paths during search. We introduce Tree-Structured Language Modeling (TSLM), which uses special tokens to encode branching structure,…

Computation and Language · Computer Science 2026-02-02 Doyoung Kim , Jaehyeok Doo , Minjoon Seo

Large reasoning models have demonstrated remarkable performance on complex reasoning tasks, yet the excessive length of their chain-of-thought outputs remains a major practical bottleneck due to high computation cost and poor deployability.…

Computation and Language · Computer Science 2025-11-25 Hourun Zhu , Yang Gao , Wenlong Fei , Jiawei Li , Huashan Sun

Language prediction is constrained by informational entropy intrinsic to language, such that there exists a limit to how accurate any language model can become and equivalently a lower bound to language compression. The most efficient…

Computation and Language · Computer Science 2025-11-14 Benjamin L. Badger , Matthew Neligeorge