Related papers: Learning Adaptive LLM Decoding

200 papers

Decoding strategies largely determine the quality of Large Language Model (LLM) outputs, yet widely used heuristics such as greedy or fixed temperature/top-p decoding are static and often task-agnostic, leading to suboptimal or inconsistent…

Computation and Language · Computer Science 2026-03-20 Asmita Bhardwaj, Yuya Jeremy Ong, Eelaaf Zahid, Basel Shbita

Recently, Large Language Models (LLMs) have shown impressive abilities in code generation. However, existing LLMs' decoding strategies are designed for Natural Language (NL) generation, overlooking the differences between NL and programming…

Software Engineering · Computer Science 2023-12-29 Yuqi Zhu, Jia Li, Ge Li, YunFei Zhao, Jia Li, Zhi Jin, Hong Mei

Computationally intensive decoding procedures--including search, reranking, and self-critique--can improve the quality of language model (LM) outputs in problems spanning code generation, numerical reasoning, and dialog. Existing work…

Machine Learning · Computer Science 2024-10-08 Mehul Damani, Idan Shenfeld, Andi Peng, Andreea Bobu, Jacob Andreas

Speculative decoding accelerates large language model (LLM) inference by using a small draft model to generate candidate tokens for a larger target model to verify. The efficacy of this technique hinges on the trade-off between the time…

Computation and Language · Computer Science 2026-03-03 Jiebin Zhang, Zhenghan Yu, Liang Wang, Nan Yang, Eugene J. Yu, Zheng Li, Yifan Song, Dawei Zhu, Xingxing Zhang, Furu Wei, Sujian Li

During language model decoding, it is known that using higher temperature sampling gives more creative responses, while lower temperatures are more factually accurate. However, such models are commonly applied to general instruction…

Large language model (LLM) decoding involves generating a sequence of tokens based on a given context, where each token is predicted one at a time using the model's learned probabilities. The typical autoregressive decoding method requires…

Computation and Language · Computer Science 2024-08-20 Xukun Liu, Bowen Lei, Ruqi Zhang, Dongkuan Xu

Large Language Models (LLMs) have demonstrated exceptional abilities across a broad range of language-related tasks, including generating solutions to complex reasoning problems. An effective technique to enhance LLM performance is…

Computation and Language · Computer Science 2024-12-25 Shuzhang Cai, Twumasi Mensah-Boateng, Xander Kuksov, Jing Yuan, Shaojie Tang

Large language models (LLMs) have achieved remarkable performance across a wide range of tasks, but their increasing parameter sizes significantly slow down inference. Speculative decoding mitigates this issue by leveraging a smaller draft…

Computation and Language · Computer Science 2026-05-27 Kuan-Wei Lu, Ding-Yong Hong, Pangfeng Liu, Jan-Jan Wu

Generally, the decoder-only large language models (LLMs) are adapted to context-aware neural machine translation (NMT) in a concatenating way, where LLMs take the concatenation of the source sentence (i.e., intra-sentence context) and the…

Computation and Language · Computer Science 2024-09-24 Xinglin Lyu, Junhui Li, Yanqing Zhao, Min Zhang, Daimeng Wei, Shimin Tao, Hao Yang, Min Zhang

Despite their widespread adoption, large language models (LLMs) remain prohibitive to use under resource constraints, with their ever growing sizes only increasing the barrier for use. One noted issue is the high latency associated with…

Machine Learning · Computer Science 2024-12-17 Jerry Huang, Prasanna Parthasarathi, Mehdi Rezagholizadeh, Sarath Chandar

Large Language Models (LLMs) have shown outstanding performance across a variety of tasks, partly due to advanced prompting techniques. However, these techniques often require lengthy prompts, which increase computational costs and can…

Computation and Language · Computer Science 2025-04-16 Jinwu Hu, Wei Zhang, Yufeng Wang, Yu Hu, Bin Xiao, Mingkui Tan, Qing Du

Recent reasoning Large Language Models (LLMs) demonstrate remarkable problem-solving abilities but often generate long thinking traces whose utility is unclear. Our work aims to improve their efficiency, enabling them to reach high…

Computation and Language · Computer Science 2026-05-11 Xiang Liu, Xuming Hu, Xiaowen Chu, Eunsol Choi

Autoregressive decoding in large language models (LLMs) requires $\mathcal{O}(n)$ sequential steps for $n$ tokens, fundamentally limiting inference throughput. Recent diffusion-based LLMs (dLLMs) enable parallel token generation through…

Computation and Language · Computer Science 2025-10-06 Wenrui Bao, Zhiben Chen, Dan Xu, Yuzhang Shang

In-context learning has established itself as an important learning paradigm for Large Language Models (LLMs). In this paper, we demonstrate that LLMs can learn encoding keys in-context and perform analysis directly on encoded…

Computation and Language · Computer Science 2026-04-16 Andresa Rodrigues de Campos, David Lee, Imry Kissos, Piyush Paritosh

Pretrained large language models (LLMs) are increasingly utilized across a wide range of natural language processing (NLP) tasks due to their impressive capabilities as few-shot learners. Recent techniques, such as chain-of-thought (CoT)…

Machine Learning · Computer Science 2024-12-02 Kamesh R

Recent advances in large language models (LLMs) have made reasoning a central benchmark for evaluating intelligence. While prior surveys focus on efficiency by examining how to shorten reasoning chains or reduce computation, this view…

Artificial Intelligence · Computer Science 2026-04-01 Chao Wu, Baoheng Li, Mingchen Gao, Yu Tian, Zhenyi Wang

To bridge the gap between vision and language modalities, Multimodal Large Language Models (MLLMs) usually learn an adapter that converts visual inputs to understandable tokens for Large Language Models (LLMs). However, most adapters…

Computer Vision and Pattern Recognition · Computer Science 2024-05-27 Yue Zhang, Hehe Fan, Yi Yang

Large Language Models (LLMs) are increasingly applied to complex tasks that require extended reasoning. In such settings, models often benefit from diverse chains-of-thought to arrive at multiple candidate solutions. This requires two…

Machine Learning · Computer Science 2025-10-08 Xueyan Li, Guinan Su, Mrinmaya Sachan, Jonas Geiping

Reasoning Large Language Models (LLMs) enable test-time scaling, with dataset-level accuracy improving as the token budget increases, motivating adaptive reasoning -- spending tokens when they improve reliability and stopping early when…

The detectability of failures of linear programming (LP) decoding and its potential for improvement by adding new constraints motivate the use of an adaptive approach in selecting the constraints for the LP problem. In this paper, we make a…

Information Theory · Computer Science 2007-07-13 Mohammad H. Taghavi N., Paul H. Siegel