Related papers

Related papers: ClickDiffusion: Harnessing LLMs for Interactive Pr…

200 papers

Natural language offers a highly intuitive interface for image editing. In this paper, we introduce the first solution for performing local (region-based) edits in generic natural images, based on a natural language description along with…

Computer Vision and Pattern Recognition · Computer Science 2023-03-22 Omri Avrahami, Dani Lischinski, Ohad Fried

Text-guided image editing has recently experienced rapid development. However, simultaneously performing multiple editing actions on a single image, such as background replacement and specific subject attribute changes, while maintaining…

Computer Vision and Pattern Recognition · Computer Science 2024-04-09 Pengzhi Li, Qinxuan Huang, Yikang Ding, Zhiheng Li

Machine learning has enabled the development of powerful systems capable of editing images from natural language instructions. However, in many common scenarios it is difficult for users to specify precise image transformations with text…

Artificial Intelligence · Computer Science 2024-02-14 Alec Helbling, Seongmin Lee, Polo Chau

Instruction-based image editing has made great progress in using natural human language to manipulate the visual content of images. However, existing models are limited by the quality of the dataset and cannot accurately localize editing…

Computer Vision and Pattern Recognition · Computer Science 2024-06-17 Tiancheng Li, Jinxiu Liu, Huajun Chen, Qi Liu

Despite the ability of existing large-scale text-to-image (T2I) models to generate high-quality images from detailed textual descriptions, they often lack the ability to precisely edit the generated or real images. In this paper, we propose…

Computer Vision and Pattern Recognition · Computer Science 2023-11-21 Chong Mou, Xintao Wang, Jiechong Song, Ying Shan, Jian Zhang

Recent large-scale text-guided diffusion models provide powerful image-generation capabilities. Currently, significant effort is devoted to enabling the modification of these images using text only, as a means to offer intuitive and versatile…

Computer Vision and Pattern Recognition · Computer Science 2023-07-04 Linoy Tsaban, Apolinário Passos

Research in vision-language models has seen rapid developments of late, enabling natural language-based interfaces for image generation and manipulation. Many existing text-guided manipulation techniques are restricted to specific classes…

Computer Vision and Pattern Recognition · Computer Science 2024-05-07 Paramanand Chandramouli, Kanchana Vaishnavi Gandikota

In the text-to-image generation field, recent remarkable progress in Stable Diffusion makes it possible to generate rich kinds of novel photorealistic images. However, current models still face misalignment issues (e.g., problematic spatial…

Computer Vision and Pattern Recognition · Computer Science 2023-08-15 Leigang Qu, Shengqiong Wu, Hao Fei, Liqiang Nie, Tat-Seng Chua

Diffusion model based language-guided image editing has achieved great success recently. However, existing state-of-the-art diffusion models struggle with rendering correct text and text style during generation. To tackle this problem, we…

Computer Vision and Pattern Recognition · Computer Science 2023-10-19 Haoxing Chen, Zhuoer Xu, Zhangxuan Gu, Jun Lan, Xing Zheng, Yaohui Li, Changhua Meng, Huijia Zhu, Weiqiang Wang

Recent advances in diffusion models enable many powerful instruments for image editing. One of these instruments is text-driven image manipulation: editing semantic attributes of an image according to the provided text description.…

Computer Vision and Pattern Recognition · Computer Science 2023-04-11 Nikita Starodubcev, Dmitry Baranchuk, Valentin Khrulkov, Artem Babenko

We introduce the new task of generating Illustrated Instructions, i.e., visual instructions customized to a user's needs. We identify desiderata unique to this task, and formalize it through a suite of automatic and human evaluation…

Computer Vision and Pattern Recognition · Computer Science 2024-04-16 Sachit Menon, Ishan Misra, Rohit Girdhar

While language-guided image manipulation has made remarkable progress, the challenge of how to instruct the manipulation process faithfully reflecting human intentions persists. An accurate and comprehensive description of a manipulation…

Computer Vision and Pattern Recognition · Computer Science 2023-08-03 Yasheng Sun, Yifan Yang, Houwen Peng, Yifei Shen, Yuqing Yang, Han Hu, Lili Qiu, Hideki Koike

Text-driven person image generation is an emerging and challenging task in cross-modality image generation. Controllable person image generation promotes a wide range of applications such as digital human interaction and virtual try-on.…

Computer Vision and Pattern Recognition · Computer Science 2022-11-14 Kaiduo Zhang, Muyi Sun, Jianxin Sun, Binghao Zhao, Kunbo Zhang, Zhenan Sun, Tieniu Tan

Generative models have enabled intuitive image creation and manipulation using natural language. In particular, diffusion models have recently shown remarkable results for natural image editing. In this work, we propose to apply diffusion…

Text-based semantic image editing assumes the manipulation of an image using a natural language instruction. Although recent works are capable of generating creative and qualitative images, the problem is still mostly approached as a black…

Computer Vision and Pattern Recognition · Computer Science 2024-04-30 Maria Mihaela Trusca, Tinne Tuytelaars, Marie-Francine Moens

Text-to-image generative models have made remarkable advancements in generating high-quality images. However, generated images often contain undesirable artifacts or other errors due to model limitations. Existing techniques to fine-tune…

Computer Vision and Pattern Recognition · Computer Science 2023-10-31 Peyman Gholami, Robert Xiao

Text-to-image generation has witnessed significant progress with the advent of diffusion models. Despite the ability to generate photorealistic images, current text-to-image diffusion models still often struggle to accurately interpret and…

Computer Vision and Pattern Recognition · Computer Science 2023-11-28 Tsung-Han Wu, Long Lian, Joseph E. Gonzalez, Boyi Li, Trevor Darrell

With the rise of large, publicly-available text-to-image diffusion models, text-guided real image editing has garnered much research attention recently. Existing methods tend to either rely on some form of per-instance or per-task…

Computer Vision and Pattern Recognition · Computer Science 2022-11-16 Adham Elarabawy, Harish Kamath, Samuel Denton

The advancement of text-to-image synthesis has introduced powerful generative models capable of creating realistic images from textual prompts. However, precise control over image attributes remains challenging, especially at the instance…

Computer Vision and Pattern Recognition · Computer Science 2025-01-27 Andrey Palaev, Adil Khan, Syed M. Ahsan Kazmi

Thanks to the powerful language comprehension capabilities of Large Language Models (LLMs), existing instruction-based image editing methods have introduced Multimodal Large Language Models (MLLMs) to promote information exchange between…

Computer Vision and Pattern Recognition · Computer Science 2026-01-06 Yujie Hu, Zecheng Tang, Xu Jiang, Weiqi Li, Jian Zhang