Related papers: LayerCake: Token-Aware Contrastive Decoding within…

200 papers

Large language models (LLMs) tend to inadequately integrate input context during text generation, relying excessively on encoded prior knowledge in model parameters, potentially resulting in generated text with factual inconsistencies or…

Computation and Language · Computer Science 2024-05-07 Zheng Zhao, Emilio Monti, Jens Lehmann, Haytham Assem

Recent decoding methods improve the factuality of large language models (LLMs) by refining how the next token is selected during generation. These methods typically operate at the token level, leveraging internal representations to suppress…

Computation and Language · Computer Science 2025-09-16 Hongxiang Zhang, Hao Chen, Muhao Chen, Tianyi Zhang

Despite their impressive capabilities, large language models (LLMs) are prone to hallucinations, i.e., generating content that deviates from facts seen during pretraining. We propose a simple decoding strategy for reducing hallucinations…

Computation and Language · Computer Science 2024-03-12 Yung-Sung Chuang, Yujia Xie, Hongyin Luo, Yoon Kim, James Glass, Pengcheng He

Despite their impressive capacities, large language models (LLMs) often struggle with the hallucination issue of generating inaccurate or fabricated content even when they possess correct knowledge. In this paper, we extend the exploration…

Computation and Language · Computer Science 2025-02-06 Jialiang Wu, Yi Shen, Sijia Liu, Yi Tang, Sen Song, Xiaoyi Wang, Longjun Cai

Large language models (LLMs) exhibit impressive natural language capabilities but suffer from hallucination -- generating content ungrounded in the realities of training data. Recent work has focused on decoding techniques to improve…

Computation and Language · Computer Science 2024-04-16 Souvik Das, Lifeng Jin, Linfeng Song, Haitao Mi, Baolin Peng, Dong Yu

Large Language Models (LLMs) have demonstrated exceptional performance across various natural language processing tasks. However, they occasionally generate inaccurate and counterfactual outputs, a phenomenon commonly referred to as…

Computation and Language · Computer Science 2025-06-04 Dingwei Chen, Feiteng Fang, Shiwen Ni, Feng Liang, Xiping Hu, Ahmadreza Argha, Hamid Alinejad-Rokny, Min Yang, Chengming Li

Large Language Models (LLMs) based on autoregressive, decoder-only Transformers generate text one token at a time, where a token represents a discrete unit of text. As each newly produced token is appended to the partial output sequence,…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-05-06 Dimitrios Kafetzis, Ramin Khalili, Iordanis Koutsopoulos

Faithful generation in large language models (LLMs) is challenged by knowledge conflicts between parametric memory and external context. Existing contrastive decoding methods tuned specifically to handle conflict often lack adaptability and…

Computation and Language · Computer Science 2025-08-28 Anant Khandelwal, Manish Gupta, Puneet Agrawal

Multimodal Large Language Models (MLLMs) are increasingly deployed in real-world applications, yet their ability to make context-aware safety decisions remains limited. Existing methods often fail to balance oversensitivity (unjustified…

Computation and Language · Computer Science 2025-09-24 Zheyuan Liu, Zhangchen Xu, Guangyao Dou, Xiangchi Yuan, Zhaoxuan Tan, Radha Poovendran, Meng Jiang

Large Language Models (LLMs) demonstrate an impressive capacity to recall a vast range of factual knowledge. However, understanding their underlying reasoning and internal mechanisms in exploiting this knowledge remains a key research area.…

Computation and Language · Computer Science 2024-08-07 Marco Bronzini, Carlo Nicolini, Bruno Lepri, Jacopo Staiano, Andrea Passerini

We propose a method to teach multiple large language models (LLM) to collaborate by interleaving their generations at the token level. We model the decision of which LLM generates the next token as a latent variable. By optimizing the…

Computation and Language · Computer Science 2024-08-28 Shannon Zejiang Shen, Hunter Lang, Bailin Wang, Yoon Kim, David Sontag

Large Language Models (LLMs) are widely deployed in real-world applications, yet little is known about their training dynamics at the token level. Evaluation typically relies on aggregated training loss, measured at the batch level, which…

Computation and Language · Computer Science 2024-10-17 Andrea Pinto, Tomer Galanti, Randall Balestriero

Large Language Models (LLMs) demonstrate exceptional performance across diverse tasks by leveraging pre-trained (i.e., parametric) and external (i.e., contextual) knowledge. While substantial efforts have been made to enhance the…

Computation and Language · Computer Science 2025-05-19 Hyuhng Joon Kim, Youna Kim, Sang-goo Lee, Taeuk Kim

Large language models (LLMs) have emerged as strong contenders in machine translation. Yet, they still struggle to adequately handle discourse phenomena, such as pronoun resolution and lexical cohesion at the document level. In this study,…

Computation and Language · Computer Science 2025-10-09 Wafaa Mohammed, Vlad Niculae, Chrysoula Zerva

Logical reasoning is a pivotal component in the field of artificial intelligence. Proof planning, particularly in contexts requiring the validation of explanation accuracy, continues to present challenges. The recent advancement of large…

Computation and Language · Computer Science 2025-10-31 Ying Su, Mingwen Liu, Zhijiang Guo

Diffusion Large Language Models (dLLMs) have demonstrated promising generative capabilities and are increasingly used to produce formal languages defined by context-free grammars, such as source code and chemical expressions. However, as…

Computation and Language · Computer Science 2026-02-10 Yitong Zhang, Yongmin Li, Yuetong Liu, Jia Li, Xiaoran Jia, Zherui Li, Ge Li

Large Vision Language Models (LVLMs) have demonstrated remarkable capabilities in understanding and describing visual content, achieving state-of-the-art performance across various vision-language tasks. However, these models often generate…

Computer Vision and Pattern Recognition · Computer Science 2025-03-27 Kazi Hasan Ibn Arif, Sajib Acharjee Dip, Khizar Hussain, Lang Zhang, Chris Thomas

The most common training pipeline for large language models includes pretraining, finetuning and aligning phases, with their respective resulting models, such as the pretrained model and the finetuned model. Finetuned and aligned models…

Computation and Language · Computer Science 2024-02-29 Lifeng Jin, Baolin Peng, Linfeng Song, Haitao Mi, Ye Tian, Dong Yu

Large language models (LLMs) have demonstrated significant improvements in contextual understanding. However, their ability to attend to truly critical information during long-context reasoning and generation still falls behind the pace.…

Computation and Language · Computer Science 2025-10-27 Yiju Guo, Wenkai Yang, Zexu Sun, Ning Ding, Zhiyuan Liu, Yankai Lin

Large Vision-Language Models (LVLMs) have recently shown promising results on various multimodal tasks, even achieving human-comparable performance in certain cases. Nevertheless, LVLMs remain prone to hallucinations -- they often rely…

Computer Vision and Pattern Recognition · Computer Science 2025-10-17 Kyungryul Back, Seongbeom Park, Milim Kim, Mincheol Kwon, SangHyeok Lee, Hyunyoung Lee, Junhee Cho, Seunghyun Park, Jinkyu Kim