Related papers: Piculet: Specialized Models-Guided Hallucination D…

Hallucination Augmented Contrastive Learning for Multimodal Large Language Model

Multi-modal large language models (MLLMs) have been shown to efficiently integrate natural language with visual information to handle multi-modal tasks. However, MLLMs still face a fundamental limitation of hallucinations, where they tend…

Computer Vision and Pattern Recognition · Computer Science 2024-02-27 Chaoya Jiang, Haiyang Xu, Mengfan Dong, Jiaxing Chen, Wei Ye, Ming Yan, Qinghao Ye, Ji Zhang, Fei Huang, Shikun Zhang

Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs

Existing Large Vision-Language Models (LVLMs) primarily align image features of vision encoder with Large Language Models (LLMs) to leverage their superior text generation capabilities. However, the scale disparity between vision encoder…

Computer Vision and Pattern Recognition · Computer Science 2024-08-01 Shi Liu, Kecheng Zheng, Wei Chen

A Survey of Hallucination in Large Visual Language Models

Large Visual Language Models (LVLMs) enhance user interaction and enrich user experience by integrating the visual modality on the basis of Large Language Models (LLMs). They have demonstrated their powerful information processing and…

Artificial Intelligence · Computer Science 2024-10-22 Wei Lan, Wenyi Chen, Qingfeng Chen, Shirui Pan, Huiyu Zhou, Yi Pan

Hallucination of Multimodal Large Language Models: A Survey

This survey presents a comprehensive analysis of the phenomenon of hallucination in multimodal large language models (MLLMs), also known as Large Vision-Language Models (LVLMs), which have demonstrated significant advancements and…

Computer Vision and Pattern Recognition · Computer Science 2025-04-03 Zechen Bai, Pichao Wang, Tianjun Xiao, Tong He, Zongbo Han, Zheng Zhang, Mike Zheng Shou

Hallucination at a Glance: Controlled Visual Edits and Fine-Grained Multimodal Learning

Multimodal large language models (MLLMs) have achieved strong performance on vision-language tasks but still struggle with fine-grained visual differences, leading to hallucinations or missed semantic shifts. We attribute this to…

Computer Vision and Pattern Recognition · Computer Science 2025-06-10 Tianyi Bai, Yuxuan Fan, Jiantao Qiu, Fupeng Sun, Jiayi Song, Junlin Han, Zichen Liu, Conghui He, Wentao Zhang, Binhang Yuan

FiVL: A Framework for Improved Vision-Language Alignment through the Lens of Training, Evaluation and Explainability

Large Vision Language Models (LVLMs) have achieved significant progress in integrating visual and textual inputs for multimodal reasoning. However, a recurring challenge is ensuring these models utilize visual information as effectively as…

Computer Vision and Pattern Recognition · Computer Science 2025-03-20 Estelle Aflalo, Gabriela Ben Melech Stan, Tiep Le, Man Luo, Shachar Rosenman, Sayak Paul, Shao-Yen Tseng, Vasudev Lal

Combating Multimodal LLM Hallucination via Bottom-Up Holistic Reasoning

Recent advancements in multimodal large language models (MLLMs) have shown unprecedented capabilities in advancing various vision-language tasks. However, MLLMs face significant challenges with hallucinations, and misleading outputs that do…

Computer Vision and Pattern Recognition · Computer Science 2024-12-24 Shengqiong Wu, Hao Fei, Liangming Pan, William Yang Wang, Shuicheng Yan, Tat-Seng Chua

Causal-LLaVA: Causal Disentanglement for Mitigating Hallucination in Multimodal Large Language Models

Multimodal Large Language Models (MLLMs) have demonstrated strong performance in visual understanding tasks, yet they often suffer from object hallucinations--generating descriptions of objects that are inconsistent with or entirely absent…

Artificial Intelligence · Computer Science 2025-05-27 Xinmiao Hu, Chun Wang, Ruihe An, ChenYu Shao, Xiaojun Ye, Sheng Zhou, Liangcheng Li

Preemptive Hallucination Reduction: An Input-Level Approach for Multimodal Language Model

Visual hallucinations in Large Language Models (LLMs), where the model generates responses that are inconsistent with the visual input, pose a significant challenge to their reliability, particularly in contexts where precise and…

Computer Vision and Pattern Recognition · Computer Science 2025-06-30 Nokimul Hasan Arif, Shadman Rabby, Md Hefzul Hossain Papon, Sabbir Ahmed

IBD: Alleviating Hallucinations in Large Vision-Language Models via Image-Biased Decoding

Despite achieving rapid developments and with widespread applications, Large Vision-Language Models (LVLMs) confront a serious challenge of being prone to generating hallucinations. An over-reliance on linguistic priors has been identified…

Computer Vision and Pattern Recognition · Computer Science 2024-02-29 Lanyun Zhu, Deyi Ji, Tianrun Chen, Peng Xu, Jieping Ye, Jun Liu

Watch Closely: Mitigating Object Hallucinations in Large Vision-Language Models with Disentangled Decoding

Large Vision-Language Models (LVLMs) bridge the gap between visual and linguistic modalities, demonstrating strong potential across a variety of domains. However, despite significant progress, LVLMs still suffer from severe hallucination…

Computer Vision and Pattern Recognition · Computer Science 2025-12-23 Ruiqi Ma, Yu Yan, Chunhong Zhang, Minghao Yin, XinChao Liu, Zhihong Jin, Zheng Hu

When Images Speak Louder: Mitigating Language Bias-induced Hallucinations in VLMs through Cross-Modal Guidance

Vision-Language Models (VLMs) have shown solid ability for multimodal understanding of both visual and language contexts. However, existing VLMs often face severe challenges of hallucinations, meaning that VLMs tend to generate responses…

Computer Vision and Pattern Recognition · Computer Science 2025-10-14 Jinjin Cao, Zhiyang Chen, Zijun Wang, Liyuan Ma, Weijian Luo, Guojun Qi

ResNetVLLM-2: Addressing ResNetVLLM's Multi-Modal Hallucinations

Large Language Models (LLMs) have transformed natural language processing (NLP) tasks, but they suffer from hallucination, generating plausible yet factually incorrect content. This issue extends to Video-Language Models (VideoLLMs), where…

Computer Vision and Pattern Recognition · Computer Science 2025-04-22 Ahmad Khalil, Mahmoud Khalil, Alioune Ngom

NoiseBoost: Alleviating Hallucination with Noise Perturbation for Multimodal Large Language Models

Multimodal large language models (MLLMs) contribute a powerful mechanism to understanding visual information building on large language models. However, MLLMs are notorious for suffering from hallucinations, especially when generating…

Computer Vision and Pattern Recognition · Computer Science 2024-06-03 Kai Wu, Boyuan Jiang, Zhengkai Jiang, Qingdong He, Donghao Luo, Shengzhi Wang, Qingwen Liu, Chengjie Wang

Exploring Causes and Mitigation of Hallucinations in Large Vision Language Models

Large Vision-Language Models (LVLMs) integrate image encoders with Large Language Models (LLMs) to process multi-modal inputs and perform complex visual tasks. However, they often generate hallucinations by describing non-existent objects…

Computer Vision and Pattern Recognition · Computer Science 2025-02-25 Yaqi Sun, Kyohei Atarashi, Koh Takeuchi, Hisashi Kashima

MIRAGE: Assessing Hallucination in Multimodal Reasoning Chains of MLLM

Multimodal hallucination in multimodal large language models (MLLMs) restricts the correctness of MLLMs. However, multimodal hallucinations are multi-sourced and arise from diverse causes. Existing benchmarks fail to adequately distinguish…

Computer Vision and Pattern Recognition · Computer Science 2025-06-03 Bowen Dong, Minheng Ni, Zitong Huang, Guanglei Yang, Wangmeng Zuo, Lei Zhang

Pensieve: Retrospect-then-Compare Mitigates Visual Hallucination

Multi-modal Large Language Models (MLLMs) demonstrate remarkable success across various vision-language tasks. However, they suffer from visual hallucination, where the generated responses diverge from the provided image. Are MLLMs…

Computer Vision and Pattern Recognition · Computer Science 2024-09-04 Dingchen Yang, Bowen Cao, Guang Chen, Changjun Jiang

See Different, Think Better: Visual Variations Mitigating Hallucinations in LVLMs

Large Vision-Language Models (LVLMs) have demonstrated remarkable capabilities in visual understanding and multimodal reasoning. However, LVLMs frequently exhibit hallucination phenomena, manifesting as the generated textual responses that…

Computer Vision and Pattern Recognition · Computer Science 2025-07-31 Ziyun Dai, Xiaoqiang Li, Shaohua Zhang, Yuanchen Wu, Jide Li

HaloQuest: A Visual Hallucination Dataset for Advancing Multimodal Reasoning

Hallucination has been a major problem for large language models and remains a critical challenge when it comes to multimodality in which vision-language models (VLMs) have to deal with not just textual but also visual inputs. Despite rapid…

Computer Vision and Pattern Recognition · Computer Science 2024-07-23 Zhecan Wang, Garrett Bingham, Adams Yu, Quoc Le, Thang Luong, Golnaz Ghiasi

Beyond Fine-Tuning: Effective Strategies for Mitigating Hallucinations in Large Language Models for Data Analytics

Large Language Models (LLMs) have become increasingly important in natural language processing, enabling advanced data analytics through natural language queries. However, these models often generate "hallucinations" - inaccurate or…

Computation and Language · Computer Science 2024-10-29 Mikhail Rumiantsau, Aliaksei Vertsel, Ilya Hrytsuk, Isaiah Ballah