Related papers: Compositional Inversion for Stable Diffusion Model…

200 papers

Style transfer, a pivotal task in image processing, synthesizes visually compelling images by seamlessly blending realistic content with artistic styles, enabling applications in photo editing and creative design. While mainstream…

Computer Vision and Pattern Recognition · Computer Science 2025-11-27 Yingying Deng , Xiangyu He , Fan Tang , Weiming Dong , Xucheng Yin

Text-to-image generative models have enabled high-resolution image synthesis across different domains, but require users to specify the content they wish to generate. In this paper, we consider the inverse problem -- given a collection of…

Computer Vision and Pattern Recognition · Computer Science 2023-08-04 Nan Liu , Yilun Du , Shuang Li , Joshua B. Tenenbaum , Antonio Torralba

We propose CatVersion, an inversion-based method that learns the personalized concept through a handful of examples. Subsequently, users can utilize text prompts to generate images that embody the personalized concept, thereby achieving…

Computer Vision and Pattern Recognition · Computer Science 2023-12-01 Ruoyu Zhao , Mingrui Zhu , Shiyin Dong , Nannan Wang , Xinbo Gao

Recent advances in text-to-image diffusion models have enabled the photorealistic generation of images from text prompts. Despite the great progress, existing models still struggle to generate compositional multi-concept images naturally,…

Computer Vision and Pattern Recognition · Computer Science 2023-10-12 Hazarapet Tunanyan , Dejia Xu , Shant Navasardyan , Zhangyang Wang , Humphrey Shi

Text-to-image generation models (e.g., Stable Diffusion) have achieved significant advancements, enabling the creation of high-quality and realistic images based on textual descriptions. Prompt inversion, the task of identifying the textual…

Computer Vision and Pattern Recognition · Computer Science 2026-03-06 Mingzhe Li , Kejing Xia , Gehao Zhang , Zhenting Wang , Guanhong Tao , Siqi Pan , Juan Zhai , Shiqing Ma

Text-to-image diffusion models sometimes depict blended concepts in the generated images. One promising use case of this effect would be the nonword-to-image generation task which attempts to generate images intuitively imaginable from a…

Multimedia · Computer Science 2024-11-07 Chihaya Matsuhira , Marc A. Kastner , Takahiro Komamizu , Takatsugu Hirayama , Ichiro Ide

The recent demand for customized image generation raises a need for techniques that effectively extract the common concept from small sets of images. Existing methods typically rely on additional guidance, such as text prompts or spatial…

Computer Vision and Pattern Recognition · Computer Science 2025-08-12 Minseo Kim , Minchan Kwon , Dongyeun Lee , Yunho Jeon , Junmo Kim

Text-to-image (T2I) customization aims to create images that embody specific visual concepts delineated in textual descriptions. However, existing works still face a main challenge, concept overfitting. To tackle this challenge, we first…

Computer Vision and Pattern Recognition · Computer Science 2024-04-23 Weili Zeng , Yichao Yan , Qi Zhu , Zhuo Chen , Pengzhi Chu , Weiming Zhao , Xiaokang Yang

A significant research effort is focused on exploiting the amazing capacities of pretrained diffusion models for the editing of images. They either finetune the model, or invert the image in the latent space of the pretrained model. However,…

Computer Vision and Pattern Recognition · Computer Science 2024-12-09 Senmao Li , Joost van de Weijer , Taihang Hu , Fahad Shahbaz Khan , Qibin Hou , Yaxing Wang , Jian Yang , Ming-Ming Cheng

Text-to-image diffusion models have shown impressive capabilities in generating realistic visuals from natural-language prompts, yet they often struggle with accurately binding attributes to corresponding objects, especially in prompts…

Computer Vision and Pattern Recognition · Computer Science 2025-05-05 Do Huu Dat , Nam Hyeonu , Po-Yuan Mao , Tae-Hyun Oh

While generative models produce high-quality images of concepts learned from a large-scale database, a user often wishes to synthesize instantiations of their own concepts (for example, their family, pets, or items). Can we teach a model to…

Computer Vision and Pattern Recognition · Computer Science 2023-06-21 Nupur Kumari , Bingliang Zhang , Richard Zhang , Eli Shechtman , Jun-Yan Zhu

Customization techniques for text-to-image models have paved the way for a wide range of previously unattainable applications, enabling the generation of specific concepts across diverse contexts and styles. While existing methods…

Computer Vision and Pattern Recognition · Computer Science 2024-12-06 Ryan Po , Guandao Yang , Kfir Aberman , Gordon Wetzstein

Diffusion models have shown superior performance in image generation and manipulation, but the inherent stochasticity presents challenges in preserving and manipulating image content and identity. While previous approaches like DreamBooth…

Computer Vision and Pattern Recognition · Computer Science 2023-04-20 Inhwa Han , Serin Yang , Taesung Kwon , Jong Chul Ye

We explore the oscillatory behavior observed in inversion methods applied to large-scale text-to-image diffusion models, with a focus on the "Flux" model. By employing a fixed-point-inspired iterative approach to invert real-world images,…

Computer Vision and Pattern Recognition · Computer Science 2024-11-19 Yan Zheng , Zhenxiao Liang , Xiaoyan Cong , Lanqing Guo , Yuehao Wang , Peihao Wang , Zhangyang Wang

In this paper, we provide a modern synthesis of the classic inverse compositional algorithm for dense image alignment. We first discuss the assumptions made by this well-established technique, and subsequently propose to relax these…

Computer Vision and Pattern Recognition · Computer Science 2019-04-09 Zhaoyang Lv , Frank Dellaert , James M. Rehg , Andreas Geiger

Text-to-image diffusion models have recently attracted the interest of many researchers, and inverting the diffusion process can play an important role in better understanding the generative process and how to engineer prompts in order to…

Computer Vision and Pattern Recognition · Computer Science 2024-10-22 Florinel-Alin Croitoru , Vlad Hondru , Radu Tudor Ionescu , Mubarak Shah

Subject-driven text-to-image diffusion models empower users to tailor the model to new concepts absent in the pre-training dataset using a few sample images. However, prevalent subject-driven models primarily rely on single-concept input…

Computer Vision and Pattern Recognition · Computer Science 2024-02-16 Junjie Shentu , Matthew Watson , Noura Al Moubayed

Recent text-guided diffusion models provide powerful image generation capabilities. Currently, a massive effort is given to enable the modification of these images using text only as means to offer intuitive and versatile editing. To edit a…

Computer Vision and Pattern Recognition · Computer Science 2022-11-18 Ron Mokady , Amir Hertz , Kfir Aberman , Yael Pritch , Daniel Cohen-Or

Diffusion models have enabled high-quality, conditional image editing capabilities. We propose to expand their arsenal, and demonstrate that off-the-shelf diffusion models can be used for a wide range of cross-domain compositing tasks.…

Computer Vision and Pattern Recognition · Computer Science 2023-05-26 Roy Hachnochi , Mingrui Zhao , Nadav Orzech , Rinon Gal , Ali Mahdavi-Amiri , Daniel Cohen-Or , Amit Haim Bermano

Text-driven diffusion models have significantly advanced the image editing performance by using text prompts as inputs. One crucial step in text-driven image editing is to invert the original image into a latent noise code conditioned on…

Computer Vision and Pattern Recognition · Computer Science 2024-07-08 Ruibin Li , Ruihuang Li , Song Guo , Lei Zhang