Related papers: Entropy-Based Decoding for Retrieval-Augmented Lar…

200 papers

Large language models (LLMs) exhibit impressive natural language capabilities but suffer from hallucination -- generating content ungrounded in the realities of training data. Recent work has focused on decoding techniques to improve…

Computation and Language · Computer Science · 2024-04-16 · Souvik Das, Lifeng Jin, Linfeng Song, Haitao Mi, Baolin Peng, Dong Yu

Large language models (LLMs) tend to inadequately integrate input context during text generation, relying excessively on encoded prior knowledge in model parameters, potentially resulting in generated text with factual inconsistencies or…

Computation and Language · Computer Science · 2024-05-07 · Zheng Zhao, Emilio Monti, Jens Lehmann, Haytham Assem

Large Vision-Language Models (LVLMs) have demonstrated remarkable multimodal capabilities, but they inherit the tendency to hallucinate from their underlying language models. While visual contrastive decoding has been proposed to mitigate…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-04 · Eun Woo Im, Muhammad Kashif Ali, Vivek Gupta

Decoding strategies play a central role in shaping the reasoning ability of large language models (LLMs). Traditional methods such as greedy decoding and beam search often suffer from error propagation, while sampling-based approaches…

Large language models (LLMs) achieve remarkable generative performance, yet their output quality is dependent on the decoding strategy. While sampling-based methods (e.g., top-k, nucleus) and search-and-select based methods (e.g., beam…

Machine Learning · Computer Science · 2026-05-12 · Benjamin Patrick Evans, Sumitra Ganesh, Leo Ardon

Modern large language model (LLM) inference engines optimize throughput and latency under fixed decoding rules, treating generation as a linear progression in token time. We propose a fundamentally different paradigm: entropic-time…

Computation and Language · Computer Science · 2026-03-05 · Andrew Kiruluta

Large Language Models (LLMs) often hallucinate, producing unfaithful or factually incorrect outputs by misrepresenting the provided context or incorrectly recalling internal knowledge. Recent studies have identified specific attention heads…

Computation and Language · Computer Science · 2024-10-25 · Aryo Pradipta Gema, Chen Jin, Ahmed Abdulaal, Tom Diethe, Philip Teare, Beatrice Alex, Pasquale Minervini, Amrutha Saseendran

Large Language Models (LLMs) are known to hallucinate, whereby they generate plausible but inaccurate text. This phenomenon poses significant risks in critical applications, such as medicine or law, necessitating robust hallucination…

Computation and Language · Computer Science · 2024-10-23 · Benedict Aaron Tjandra, Muhammed Razzak, Jannik Kossen, Kunal Handa, Yarin Gal

Uncertainty decomposition refers to the task of decomposing the total uncertainty of a predictive model into aleatoric (data) uncertainty, resulting from inherent randomness in the data-generating process, and epistemic (model) uncertainty,…

Computation and Language · Computer Science · 2024-06-12 · Bairu Hou, Yujian Liu, Kaizhi Qian, Jacob Andreas, Shiyu Chang, Yang Zhang

When using large language models (LLMs) in knowledge-intensive tasks, such as open-domain question answering, external context can bridge the gap between external knowledge and the LLMs' parametric knowledge. Recent research has been…

Computation and Language · Computer Science · 2024-10-08 · Youna Kim, Hyuhng Joon Kim, Cheonbok Park, Choonghyun Park, Hyunsoo Cho, Junyeob Kim, Kang Min Yoo, Sang-goo Lee, Taeuk Kim

Diffusion language models (DLMs) are promising alternatives to autoregressive language models (ARMs), yet the intrinsic differences in their generated text remain underexplored. We first find empirically that off-the-shelf DLMs exhibit…

Computation and Language · Computer Science · 2026-05-14 · Zeyang Zhang, Chengwei Liang, Xingyan Chen, Meiqi Gu, Minrui Luo, Jingzhao Zhang, Tianxing He

Language models (LMs) are trained on billions of tokens in an attempt to recover the true language distribution. Still, vanilla random sampling from LMs yields low quality generations. Decoding algorithms attempt to restrict the LM…

Machine Learning · Computer Science · 2026-01-06 · Kareem Ahmed, Sameer Singh

Text-image alignment constitutes a foundational challenge in multimedia content understanding, where effective modeling of cross-modal semantic correspondences critically enhances retrieval system performance through joint embedding space…

Computer Vision and Pattern Recognition · Computer Science · 2025-10-16 · Rongjun Chen, Chengsi Yao, Jinchang Ren, Xianxian Zeng, Peixian Wang, Jun Yuan, Jiawen Li, Huimin Zhao, Xu Lu

Diffusion language models (DLMs) have emerged as a promising alternative to autoregressive (AR) models for language modeling, allowing flexible generation order and parallel generation of multiple tokens. However, this flexibility…

Machine Learning · Computer Science · 2026-03-24 · Changxiao Cai, Gen Li

Large Language Models (LLMs)-based text retrieval retrieves documents relevant to search queries based on vector similarities. Documents are pre-encoded offline, while queries arrive in real-time, necessitating an efficient online query…

Information Retrieval · Computer Science · 2026-02-02 · Guangyuan Ma, Yongliang Ma, Xuanrui Gou, Zhenpeng Su, Ming Zhou, Songlin Hu

Retrieval-augmented generation improves the factual accuracy of Large Language Models (LLMs) by incorporating external context, but often suffers from irrelevant retrieved content that hinders effectiveness. Context compression addresses…

Computation and Language · Computer Science · 2025-09-23 · Lvzhou Luo, Yixuan Cao, Ping Luo

The objective of drug discovery is to identify chemical compounds that possess specific pharmaceutical properties toward a binding target. Existing large language models (LLMs) can achieve high token matching scores in terms of likelihood…

Machine Learning · Computer Science · 2025-04-01 · Xuefeng Liu, Chih-chan Tien, Peng Ding, Songhao Jiang, Rick L. Stevens

The development of LLMs has greatly enhanced the intelligence and fluency of question answering, while the emergence of retrieval enhancement has enabled models to better utilize external information. However, the presence of noise and…

Computation and Language · Computer Science · 2024-09-19 · Xingyun Hong, Yan Shao, Zhilin Wang, Manni Duan, Jin Xiongnan

This chapter explores advancements in decoding strategies for large language models (LLMs), focusing on enhancing the Locally Typical Sampling (LTS) algorithm. Traditional decoding methods, such as top-k and nucleus sampling, often struggle…

Computation and Language · Computer Science · 2025-06-12 · Jaydip Sen, Saptarshi Sengupta, Subhasis Dasgupta

The remarkable performance of Large language models (LLMs) relies heavily on the availability of abundant high-quality training data. However, the high cost of acquiring annotated data often prevents models from obtaining capabilities to…

Computation and Language · Computer Science · 2025-06-05 · Mingxu Tao, Jie Hu, Mingchuan Yang, Yunhuai Liu, Dongyan Zhao, Yansong Feng