Related papers: MultiDiffusion: Fusing Diffusion Paths for Control…

200 papers

While modern diffusion models excel at generating high-quality and diverse images, they still struggle with high-fidelity compositional and multimodal control, particularly when users simultaneously specify text prompts, subject references,…

Computer Vision and Pattern Recognition · Computer Science 2025-11-27 Yusuf Dalva, Guocheng Gordon Qian, Maya Goldenberg, Tsai-Shien Chen, Kfir Aberman, Sergey Tulyakov, Pinar Yanardag, Kuan-Chieh Jackson Wang

The recent popularity of text-to-image diffusion models (DM) can largely be attributed to the intuitive interface they provide to users. The intended generation can be expressed in natural language, with the model producing faithful…

Recently, the multimedia community has witnessed the rise of diffusion models trained on large-scale multi-modal data for visual content creation, particularly in the field of text-to-image generation. In this paper, we propose a new task…

Computer Vision and Pattern Recognition · Computer Science 2023-11-10 Jingwen Chen, Yingwei Pan, Ting Yao, Tao Mei

In recent years, significant progress has been made in the development of text-to-image generation models. However, these models still face limitations when it comes to achieving full controllability during the generation process. Often,…

Computer Vision and Pattern Recognition · Computer Science 2024-03-05 Salaheldin Mohamed

Text-guided image editing has recently experienced rapid development. However, simultaneously performing multiple editing actions on a single image, such as background replacement and specific subject attribute changes, while maintaining…

Computer Vision and Pattern Recognition · Computer Science 2024-04-09 Pengzhi Li, Qinxuan Huang, Yikang Ding, Zhiheng Li

Large-scale text-to-image generative models have been a revolutionary breakthrough in the evolution of generative AI, allowing us to synthesize diverse images that convey highly complex visual concepts. However, a pivotal challenge in…

Computer Vision and Pattern Recognition · Computer Science 2022-11-24 Narek Tumanyan, Michal Geyer, Shai Bagon, Tali Dekel

Diffusion models have emerged as frontrunners in text-to-image generation, but their fixed image resolution during training often leads to challenges in high-resolution image generation, such as semantic deviations and object replication.…

Computer Vision and Pattern Recognition · Computer Science 2024-11-19 Haoning Wu, Shaocheng Shen, Qiang Hu, Xiaoyun Zhang, Ya Zhang, Yanfeng Wang

Large diffusion-based Text-to-Image (T2I) models have shown impressive generative powers for text-to-image generation as well as spatially conditioned image generation. For most applications, we can train the model end-to-end with paired…

Computer Vision and Pattern Recognition · Computer Science 2024-04-16 Nithin Gopalakrishnan Nair, Jeya Maria Jose Valanarasu, Vishal M Patel

Recently, diffusion models have achieved great success in image synthesis. However, when it comes to the layout-to-image generation where an image often has a complex scene of multiple objects, how to make strong control over both the…

Computer Vision and Pattern Recognition · Computer Science 2024-03-13 Guangcong Zheng, Xianpan Zhou, Xuewei Li, Zhongang Qi, Ying Shan, Xi Li

Traditional 3D content creation tools empower users to bring their imagination to life by giving them direct control over a scene's geometry, appearance, motion, and camera path. Creating computer-generated videos, however, is a tedious…

Computer Vision and Pattern Recognition · Computer Science 2023-12-05 Shengqu Cai, Duygu Ceylan, Matheus Gadelha, Chun-Hao Paul Huang, Tuanfeng Yang Wang, Gordon Wetzstein

Large-scale generative models, such as text-to-image diffusion models, have garnered widespread attention across diverse domains due to their creative and high-fidelity image generation. Nonetheless, existing large-scale diffusion models…

Computer Vision and Pattern Recognition · Computer Science 2024-08-28 Younghyun Kim, Geunmin Hwang, Junyu Zhang, Eunbyung Park

Diffusion methods have been proven to be very effective to generate images while conditioning on a text prompt. However, and although the quality of the generated images is unprecedented, these methods seem to struggle when trying to…

Computer Vision and Pattern Recognition · Computer Science 2023-02-07 Álvaro Barbero Jiménez

This paper introduces MVDiffusion, a simple yet effective method for generating consistent multi-view images from text prompts given pixel-to-pixel correspondences (e.g., perspective crops from a panorama or multi-view images given depth…

Computer Vision and Pattern Recognition · Computer Science 2023-12-27 Shitao Tang, Fuyang Zhang, Jiacheng Chen, Peng Wang, Yasutaka Furukawa

Latent diffusion models excel at producing high-quality images from text. Yet, concerns appear about the lack of diversity in the generated imagery. To tackle this, we introduce Diverse Diffusion, a method for boosting image diversity…

Computer Vision and Pattern Recognition · Computer Science 2023-10-20 Mariia Zameshina, Olivier Teytaud, Laurent Najman

Generating high-resolution images with generative models has recently been made widely accessible by leveraging diffusion models pre-trained on large-scale datasets. Various techniques, such as MultiDiffusion and SyncDiffusion, have further…

Computer Vision and Pattern Recognition · Computer Science 2025-01-08 Stanislav Frolov, Brian B. Moser, Andreas Dengel

While generative models produce high-quality images of concepts learned from a large-scale database, a user often wishes to synthesize instantiations of their own concepts (for example, their family, pets, or items). Can we teach a model to…

Computer Vision and Pattern Recognition · Computer Science 2023-06-21 Nupur Kumari, Bingliang Zhang, Richard Zhang, Eli Shechtman, Jun-Yan Zhu

Diffusion generative models have recently greatly improved the power of text-conditioned image generation. Existing image generation models mainly include text conditional diffusion model and cross-modal guided diffusion model, which are…

Computer Vision and Pattern Recognition · Computer Science 2022-11-04 Wei Li, Xue Xu, Xinyan Xiao, Jiachen Liu, Hu Yang, Guohao Li, Zhanpeng Wang, Zhifan Feng, Qiaoqiao She, Yajuan Lyu, Hua Wu

Image composition targets at synthesizing a realistic composite image from a pair of foreground and background images. Recently, generative composition methods are built on large pretrained diffusion models to generate composite images,…

Computer Vision and Pattern Recognition · Computer Science 2023-08-22 Bo Zhang, Yuxuan Duan, Jun Lan, Yan Hong, Huijia Zhu, Weiqiang Wang, Li Niu

State-of-the-art diffusion models can generate highly realistic images based on various conditioning like text, segmentation, and depth. However, an essential aspect often overlooked is the specific camera geometry used during image…

Computer Vision and Pattern Recognition · Computer Science 2024-07-16 Andrey Voynov, Amir Hertz, Moab Arar, Shlomi Fruchter, Daniel Cohen-Or

Textual image generation spans diverse fields like advertising, education, product packaging, social media, information visualization, and branding. Despite recent strides in language-guided image synthesis using diffusion models, current…

Computer Vision and Pattern Recognition · Computer Science 2024-05-22 Shubham Paliwal, Arushi Jain, Monika Sharma, Vikram Jamwal, Lovekesh Vig