Related papers: ReasonEdit: Editing Vision-Language Models using H…

200 papers

Editing complex visual content from ambiguous or partially specified instructions remains a core challenge in vision-language modeling. Existing models can contextualize content but often fail to infer the underlying intent within a…

Computer Vision and Pattern Recognition · Computer Science 2025-11-11 Umar Khalid, Kashif Munir, Hasan Iqbal, Azib Farooq, Jing Hua, Nazanin Rahnavard, Chen Chen, Victor Zhu, Zhengping Ji

Model editing aims to correct outdated or erroneous knowledge in large models without costly retraining. Recent research discovered that the mid-layer representation of the subject's final token in a prompt has a strong influence on factual…

Computer Vision and Pattern Recognition · Computer Science 2025-01-24 Qizhou Chen, Taolin Zhang, Chengyu Wang, Xiaofeng He, Dakan Wang, Tingting Liu

Model editing aims to correct inaccurate knowledge, update outdated information, and incorporate new data into Large Language Models (LLMs) without the need for retraining. This task poses challenges in lifelong scenarios where edits must…

Computation and Language · Computer Science 2025-03-17 Qizhou Chen, Chengyu Wang, Dakan Wang, Taolin Zhang, Wangyue Li, Xiaofeng He

Recent text-guided image editing (TIE) models have achieved remarkable progress; however, many edited results still suffer from artifacts, unintended modifications, and suboptimal aesthetics. Although several benchmarks and evaluation…

Computer Vision and Pattern Recognition · Computer Science 2026-05-11 Honghua Chen, Zitong Xu, Huiyu Duan, Xinyun Zhang, Xiongkuo Min, Guangtao Zhai

Model editing aims to efficiently update a pre-trained model's knowledge without the need for time-consuming full retraining. While existing pioneering editing methods achieve promising results, they primarily focus on editing single-modal…

Computer Vision and Pattern Recognition · Computer Science 2025-09-22 Zhiyi Shi, Binjie Wang, Chongjie Si, Yichen Wu, Junsik Kim, Hanspeter Pfister

Recent advances in image editing models have shown remarkable progress. A common architectural design couples a multimodal large language model (MLLM) encoder with a diffusion decoder, as seen in systems such as Step1X-Edit and…

Computer Vision and Pattern Recognition · Computer Science 2025-12-02 Fukun Yin, Shiyu Liu, Yucheng Han, Zhibo Wang, Peng Xing, Rui Wang, Wei Cheng, Yingming Wang, Aojie Li, Zixin Yin, Pengtao Chen, Xiangyu Zhang, Daxin Jiang, Xianfang Zeng, Gang Yu

Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication. However, a primary limitation lies in the significant computational demands during training,…

The model editing problem concerns how language models should learn new facts about the world over time. While empirical research on model editing has drawn widespread attention, the conceptual foundations of model editing remain shaky --…

Computation and Language · Computer Science 2024-06-28 Peter Hase, Thomas Hofweber, Xiang Zhou, Elias Stengel-Eskin, Mohit Bansal

Instruction-based image editing (IIE) has advanced rapidly with the success of diffusion models. However, existing efforts primarily focus on simple and explicit instructions to execute editing operations such as adding, deleting, moving,…

Computer Vision and Pattern Recognition · Computer Science 2026-05-14 Qingdong He, Xueqin Chen, Chaoyi Wang, Yanjie Pan, Xiaobin Hu, Zhenye Gan, Yabiao Wang, Chengjie Wang, Xiangtai Li, Jiangning Zhang

Reasoning is central to human intelligence, enabling structured problem-solving across diverse tasks. Recent advances in large language models (LLMs) have greatly enhanced their reasoning abilities in arithmetic, commonsense, and symbolic…

Large Multi-modality Models (LMMs) have made significant progress in visual understanding and generation, but they still face challenges in General Visual Editing, particularly in following complex instructions, preserving appearance…

Computer Vision and Pattern Recognition · Computer Science 2025-05-28 Xiangyu Zhao, Peiyuan Zhang, Kexian Tang, Xiaorong Zhu, Hao Li, Wenhao Chai, Zicheng Zhang, Renqiu Xia, Guangtao Zhai, Junchi Yan, Hua Yang, Xue Yang, Haodong Duan

Despite the ability to train capable LLMs, the methodology for maintaining their relevancy and rectifying errors remains elusive. To this end, the past few years have witnessed a surge in techniques for editing LLMs, the objective of which…

Computation and Language · Computer Science 2023-12-01 Yunzhi Yao, Peng Wang, Bozhong Tian, Siyuan Cheng, Zhoubo Li, Shumin Deng, Huajun Chen, Ningyu Zhang

Large language models (LLMs) often exhibit flawed reasoning ability that undermines reliability. Existing approaches to improving reasoning typically treat it as a general and monolithic skill, applying broad training which is inefficient…

Computation and Language · Computer Science 2026-03-10 Zhenyu Lei, Qiong Wu, Jianxiong Dong, Yinhan He, Emily Dodwell, Yushun Dong, Jundong Li

Recently, knowledge editing on large language models (LLMs) has received considerable attention. Compared to this, editing Large Vision-Language Models (LVLMs) faces extra challenges from diverse data modalities and complicated model…

Computation and Language · Computer Science 2024-10-30 Han Huang, Haitian Zhong, Tao Yu, Qiang Liu, Shu Wu, Liang Wang, Tieniu Tan

Text rendering has recently emerged as one of the most challenging frontiers in visual generation, drawing significant attention from large-scale diffusion and multimodal models. However, text editing within images remains largely…

Computer Vision and Pattern Recognition · Computer Science 2025-12-19 Rui Gui, Yang Wan, Haochen Han, Dongxing Mao, Fangming Liu, Min Li, Alex Jinpeng Wang

Large Language Models (LLMs) demonstrate exceptional capabilities in factual question answering, yet they sometimes provide incorrect responses. To address this issue, knowledge editing techniques have emerged as effective methods for…

Human-Computer Interaction · Computer Science 2026-04-01 Zhenning Chen, Hanbei Zhan, Yanwei Huang, Xin Wu, Dazhen Deng, Di Weng, Yingcai Wu

The increasing demand for intelligent systems capable of interpreting and reasoning about visual content requires the development of large Vision-and-Language Models (VLMs) that are not only accurate but also have explicit reasoning…

Computer Vision and Pattern Recognition · Computer Science 2024-07-19 Kohei Uehara, Nabarun Goswami, Hanqin Wang, Toshiaki Baba, Kohtaro Tanaka, Tomohiro Hashimoto, Kai Wang, Rei Ito, Takagi Naoya, Ryo Umagami, Yingyi Wen, Tanachai Anakewat, Tatsuya Harada

In this article, we investigate vision-language models (VLM) as reasoners. The ability to form abstractions underlies mathematical reasoning, problem-solving, and other Math AI tasks. Several formalisms have been given to these underlying…

Artificial Intelligence · Computer Science 2024-07-08 Denisa Roberts, Lucas Roberts

Vision-language models (VLMs) show promise for autonomous driving but often lack transparent reasoning capabilities that are critical for safety. We investigate whether explicitly modeling reasoning during fine-tuning enhances VLM…

Computer Vision and Pattern Recognition · Computer Science 2025-04-16 Amirhosein Chahe, Lifeng Zhou

Recent advances in AI-generated content (AIGC) have significantly accelerated image editing techniques, driving increasing demand for diverse and fine-grained edits. Despite these advances, existing image editing methods still face…

Computer Vision and Pattern Recognition · Computer Science 2025-05-27 Shuyu Wang, Weiqi Li, Qian Wang, Shijie Zhao, Jian Zhang