Related papers: Diffexplainer: Towards Cross-modal Global Explanat…

200 papers

Classifiers are important components in many computer vision tasks, serving as the foundational backbone of a wide variety of models employed across diverse applications. However, understanding the decision-making process of classifiers…

Computer Vision and Pattern Recognition · Computer Science 2024-12-25 Tahira Kazimi, Ritika Allada, Pinar Yanardag

Understanding and explaining the behavior of machine learning models is essential for building transparent and trustworthy AI systems. We introduce DEXTER, a data-free framework that employs diffusion models and large language models to…

Computer Vision and Pattern Recognition · Computer Science 2025-11-18 Simone Carnemolla, Matteo Pennisi, Sarinda Samarasinghe, Giovanni Bellitto, Simone Palazzo, Daniela Giordano, Mubarak Shah, Concetto Spampinato

Diffusion-based generative models' impressive ability to create convincing images has garnered global attention. However, their complex structures and operations often pose challenges for non-experts to grasp. We present Diffusion…

Diffusion models have gained tremendous success in text-to-image generation, yet still lag behind with visual understanding tasks, an area dominated by autoregressive vision-language models. We propose a large-scale and fully end-to-end…

Computer Vision and Pattern Recognition · Computer Science 2025-04-03 Zijie Li, Henry Li, Yichun Shi, Amir Barati Farimani, Yuval Kluger, Linjie Yang, Peng Wang

While recent Multimodal Large Language Models (MLLMs) have attained significant strides in multimodal reasoning, their reasoning processes remain predominantly text-centric, leading to suboptimal performance in complex long-horizon,…

Computer Vision and Pattern Recognition · Computer Science 2026-01-01 Zefeng He, Xiaoye Qu, Yafu Li, Tong Zhu, Siyuan Huang, Yu Cheng

In recent years, deep learning models have been extensively applied to biological data across various modalities. Discriminative deep learning models have excelled at classifying images into categories (e.g., healthy versus diseased,…

Computer Vision and Pattern Recognition · Computer Science 2025-02-17 Anis Bourou, Saranga Kingkor Mahanta, Thomas Boyer, Valérie Mezger, Auguste Genovesio

Diffusion-based generative models' impressive ability to create convincing images has garnered global attention. However, their complex internal structures and operations often pose challenges for non-experts to grasp. We introduce…

Human-Computer Interaction · Computer Science 2024-04-26 Seongmin Lee, Benjamin Hoover, Hendrik Strobelt, Zijie J. Wang, ShengYun Peng, Austin Wright, Kevin Li, Haekyu Park, Haoyang Yang, Polo Chau

Multimodal recommendation systems integrate diverse multimodal information into the feature representations of both items and users, thereby enabling a more comprehensive modeling of user preferences. However, existing methods are hindered…

Multimedia · Computer Science 2025-01-03 Qiya Song, Jiajun Hu, Lin Xiao, Bin Sun, Xieping Gao, Shutao Li

Beyond high-fidelity image synthesis, diffusion models have recently exhibited promising results in dense visual perception tasks. However, most existing work treats diffusion models as a standalone component for perception tasks, employing…

Computer Vision and Pattern Recognition · Computer Science 2025-12-18 Shuhong Zheng, Zhipeng Bao, Ruoyu Zhao, Martial Hebert, Yu-Xiong Wang

Recent advancements in Language Models (LMs) have demonstrated strong semantic reasoning capabilities, enabling their application in high-level decision-making for autonomous driving (AD). However, LMs operate over discrete token spaces and…

Robotics · Computer Science 2026-04-02 Fan Ding, Xuewen Luo, Fengze Yang, Bo Yu, HwaHui Tew, Ganesh Krishnasamy, Junn Yong Loo

Discriminative classifiers have become a foundational tool in deep learning for medical imaging, excelling at learning separable features of complex data distributions. However, these models often need careful design, augmentation, and…

Computer Vision and Pattern Recognition · Computer Science 2025-08-11 Gian Mario Favero, Parham Saremi, Emily Kaczmarek, Brennan Nichyporuk, Tal Arbel

Diffusion models have made significant strides in language-driven and layout-driven image generation. However, most diffusion models are limited to visible RGB image generation. In fact, human perception of the world is enriched by diverse…

Computer Vision and Pattern Recognition · Computer Science 2024-10-22 Zeyu Wang, Jingyu Lin, Yifei Qian, Yi Huang, Shicen Tian, Bosong Chai, Juncan Deng, Qu Yang, Lan Du, Cunjian Chen, Kejie Huang

Visual counterfactual explanations are ideal hypothetical images that change the decision-making of the classifier with high confidence toward the desired class while remaining visually plausible and close to the initial image. In this…

Computer Vision and Pattern Recognition · Computer Science 2025-04-15 Tung Luu, Nam Le, Duc Le, Bac Le

Deep generative models like VAEs and diffusion models have advanced various generation tasks by leveraging latent variables to learn data distributions and generate high-quality samples. Despite the field of explainable AI making strides in…

Machine Learning · Computer Science 2025-12-22 Mengdan Zhu, Raasikh Kanjiani, Jiahui Lu, Andrew Choi, Qirui Ye, Liang Zhao

Diffusion models have proven to be powerful generative models in recent years, yet generating visual text remains a challenge for them. Several methods have alleviated this issue by incorporating explicit text position and content as guidance on…

Computer Vision and Pattern Recognition · Computer Science 2023-11-29 Jingye Chen, Yupan Huang, Tengchao Lv, Lei Cui, Qifeng Chen, Furu Wei

In the field of medical imaging, particularly in tasks related to early disease detection and prognosis, understanding the reasoning behind AI model predictions is imperative for assessing their reliability. Conventional explanation methods…

Computer Vision and Pattern Recognition · Computer Science 2024-06-28 Yingying Fang, Shuang Wu, Zihao Jin, Caiwen Xu, Shiyi Wang, Simon Walsh, Guang Yang

Image generation has recently seen tremendous advances, with diffusion models making it possible to synthesize convincing images for a large variety of text prompts. In this article, we propose DiffEdit, a method to take advantage of text-conditioned…

Computer Vision and Pattern Recognition · Computer Science 2022-10-21 Guillaume Couairon, Jakob Verbeek, Holger Schwenk, Matthieu Cord

Generating high-quality and person-generic visual dubbing remains a challenge. Recent innovation has seen the advent of a two-stage paradigm, decoupling the rendering and lip synchronization process facilitated by intermediate…

Computer Vision and Pattern Recognition · Computer Science 2024-01-15 Tao Liu, Chenpeng Du, Shuai Fan, Feilong Chen, Kai Yu

While many unsupervised learning models focus on one family of tasks, either generative or discriminative, we explore the possibility of a unified representation learner: a model which addresses both families of tasks simultaneously. We…

Computer Vision and Pattern Recognition · Computer Science 2024-09-25 Soumik Mukhopadhyay, Matthew Gwilliam, Yosuke Yamaguchi, Vatsal Agarwal, Namitha Padmanabhan, Archana Swaminathan, Tianyi Zhou, Jun Ohya, Abhinav Shrivastava

Diffusion models have demonstrated remarkable performance in generation tasks. Nevertheless, explaining the diffusion process remains challenging due to it being a sequence of denoising noisy images that are difficult for experts to…

Computer Vision and Pattern Recognition · Computer Science 2024-02-19 Ji-Hoon Park, Yeong-Joon Ju, Seong-Whan Lee