English
Related papers

Related papers: Diffusion Explainer: Visual Explanation for Text-t…

200 papers

Diffusion-based generative models' impressive ability to create convincing images has garnered global attention. However, their complex internal structures and operations often pose challenges for non-experts to grasp. We introduce…

Human-Computer Interaction · Computer Science 2024-04-26 Seongmin Lee , Benjamin Hoover , Hendrik Strobelt , Zijie J. Wang , ShengYun Peng , Austin Wright , Kevin Li , Haekyu Park , Haoyang Yang , Polo Chau

The Stable Diffusion model is a prominent text-to-image generation model that relies on a text prompt as its input, which is encoded using the Contrastive Language-Image Pre-Training (CLIP). However, text prompts have limitations when it…

Computer Vision and Pattern Recognition · Computer Science 2024-02-16 Yuxuan Ding , Chunna Tian , Haoxuan Ding , Lingqiao Liu

Text-to-image diffusion models have demonstrated an unparalleled ability to generate high-quality, diverse images from a textual prompt. However, the internal representations learned by these models remain an enigma. In this work, we…

Computer Vision and Pattern Recognition · Computer Science 2023-10-06 Hila Chefer , Oran Lang , Mor Geva , Volodymyr Polosukhin , Assaf Shocher , Michal Irani , Inbar Mosseri , Lior Wolf

Diffusion models, such as Stable Diffusion, have shown incredible performance on text-to-image generation. Since text-to-image generation often requires models to generate visual concepts with fine-grained details and attributes specified…

Computer Vision and Pattern Recognition · Computer Science 2024-04-26 Xuehai He , Weixi Feng , Tsu-Jui Fu , Varun Jampani , Arjun Akula , Pradyumna Narayana , Sugato Basu , William Yang Wang , Xin Eric Wang

Diffusion models have gained increasing attention for their impressive generation abilities but currently struggle with rendering accurate and coherent text. To address this issue, we introduce TextDiffuser, focusing on generating images…

Computer Vision and Pattern Recognition · Computer Science 2023-10-31 Jingye Chen , Yupan Huang , Tengchao Lv , Lei Cui , Qifeng Chen , Furu Wei

The astonishing growth of generative tools in recent years has empowered many exciting applications in text-to-image generation and text-to-video generation. The underlying principle behind these generative tools is the concept of…

Machine Learning · Computer Science 2025-01-09 Stanley H. Chan

Diffusion-based models, such as the Stable Diffusion model, have revolutionized text-to-image synthesis with their ability to produce high-quality, high-resolution images. These advancements have prompted significant progress in image…

Cryptography and Security · Computer Science 2023-12-07 Ali Naseh , Jaechul Roh , Amir Houmansadr

With recent advancements in diffusion models, users can generate high-quality images by writing text prompts in natural language. However, generating images with desired details requires proper prompts, and it is often unclear how a model…

Computer Vision and Pattern Recognition · Computer Science 2023-07-07 Zijie J. Wang , Evan Montoya , David Munechika , Haoyang Yang , Benjamin Hoover , Duen Horng Chau

We generate synthetic images with the "Stable Diffusion" image generation model using the Wordnet taxonomy and the definitions of concepts it contains. This synthetic image database can be used as training data for data augmentation in…

Computer Vision and Pattern Recognition · Computer Science 2022-11-07 Andreas Stöckl

GUI (graphical user interface) prototyping is a widely-used technique in requirements engineering for gathering and refining requirements, reducing development risks and increasing stakeholder engagement. However, GUI prototyping can be a…

Software Engineering · Computer Science 2023-10-05 Jialiang Wei , Anne-Lise Courbis , Thomas Lambolais , Binbin Xu , Pierre Louis Bernard , Gérard Dray

Text-to-image diffusion models have recently attracted the interest of many researchers, and inverting the diffusion process can play an important role in better understanding the generative process and how to engineer prompts in order to…

Computer Vision and Pattern Recognition · Computer Science 2024-10-22 Florinel-Alin Croitoru , Vlad Hondru , Radu Tudor Ionescu , Mubarak Shah

We present DiffExplainer, a novel framework that, leveraging language-vision models, enables multimodal global explainability. DiffExplainer employs diffusion models conditioned on optimized text prompts, synthesizing images that maximize…

Computer Vision and Pattern Recognition · Computer Science 2024-04-04 Matteo Pennisi , Giovanni Bellitto , Simone Palazzo , Mubarak Shah , Concetto Spampinato

Diffusion models have demonstrated remarkable performance in generation tasks. Nevertheless, explaining the diffusion process remains challenging due to it being a sequence of denoising noisy images that are difficult for experts to…

Computer Vision and Pattern Recognition · Computer Science 2024-02-19 Ji-Hoon Park , Yeong-Joon Ju , Seong-Whan Lee

The text-to-image synthesis by diffusion models has recently shown remarkable performance in generating high-quality images. Although performs well for simple texts, the models may get confused when faced with complex texts that contain…

Computer Vision and Pattern Recognition · Computer Science 2024-01-15 Chang Yu , Junran Peng , Xiangyu Zhu , Zhaoxiang Zhang , Qi Tian , Zhen Lei

Diffusion-based methods can generate realistic images and videos, but they struggle to edit existing objects in a video while preserving their appearance over time. This prevents diffusion models from being applied to natural video editing…

Computer Vision and Pattern Recognition · Computer Science 2023-08-21 Wenhao Chai , Xun Guo , Gaoang Wang , Yan Lu

Diffusion models have been central to the development of recent image, video, and even text generation systems. They posses striking geometric properties that can be faithfully portrayed in low-dimensional settings. However, existing…

Machine Learning · Computer Science 2025-07-08 Alec Helbling , Duen Horng Chau

Text-to-image diffusion models (T2I) use a latent representation of a text prompt to guide the image generation process. However, the process by which the encoder produces the text representation is unknown. We propose the Diffusion Lens, a…

Computer Vision and Pattern Recognition · Computer Science 2025-03-04 Michael Toker , Hadas Orgad , Mor Ventura , Dana Arad , Yonatan Belinkov

Recent progress in text-to-image (TTI) systems, such as StableDiffusion, Imagen, and DALL-E 2, have made it possible to create realistic images with simple text prompts. It is tempting to use these systems to eliminate the manual task of…

Computer Vision and Pattern Recognition · Computer Science 2023-11-02 David Marwood , Shumeet Baluja , Yair Alon

Diffusion-based text-to-image generative models, e.g., Stable Diffusion, have revolutionized the field of content generation, enabling significant advancements in areas like image editing and video synthesis. Despite their formidable…

Computer Vision and Pattern Recognition · Computer Science 2024-03-29 Yanyu Li , Xian Liu , Anil Kag , Ju Hu , Yerlan Idelbayev , Dhritiman Sagar , Yanzhi Wang , Sergey Tulyakov , Jian Ren

This paper explores the task of detecting images generated by text-to-image diffusion models. To evaluate this, we consider images generated from captions in the MSCOCO and Wikimedia datasets using two state-of-the-art models: Stable…

Computer Vision and Pattern Recognition · Computer Science 2023-04-24 Davide Alessandro Coccomini , Andrea Esuli , Fabrizio Falchi , Claudio Gennaro , Giuseppe Amato
‹ Prev 1 2 3 10 Next ›