Related papers: LogoSticker: Inserting Logos into Diffusion Models…

200 papers

Text-to-image diffusion models produce impressive results but are frustrating tools for artists who desire fine-grained control. For example, a common use case is to create images of a specific instance in novel contexts, i.e.,…

Computer Vision and Pattern Recognition · Computer Science 2024-11-28 Shengqu Cai, Eric Chan, Yunzhi Zhang, Leonidas Guibas, Jiajun Wu, Gordon Wetzstein

Recent advances in text-to-image generation have been remarkable, but generating multilingual design logos that harmoniously integrate visual and textual elements remains a challenging task. Existing methods often distort character geometry…

Computer Vision and Pattern Recognition · Computer Science 2026-03-11 Mingyu Kang, Hyein Seo, Yuna Jeong, Junhyeong Park, Yong Suk Choi

We introduce Calligrapher, a novel diffusion-based framework that innovatively integrates advanced text customization with artistic typography for digital calligraphy and design applications. Addressing the challenges of precise style…

Computer Vision and Pattern Recognition · Computer Science 2025-07-01 Yue Ma, Qingyan Bai, Hao Ouyang, Ka Leong Cheng, Qiuyu Wang, Hongyu Liu, Zichen Liu, Haofan Wang, Jingye Chen, Yujun Shen, Qifeng Chen

Many applications can benefit from personalized image generation models, including image enhancement, video conferences, just to name a few. Existing works achieved personalization by fine-tuning one model for each person. While being…

Computer Vision and Pattern Recognition · Computer Science 2023-04-18 Yu-Chuan Su, Kelvin C. K. Chan, Yandong Li, Yang Zhao, Han Zhang, Boqing Gong, Huisheng Wang, Xuhui Jia

Recently, large-scale diffusion models, e.g., Stable diffusion and DallE2, have shown remarkable results on image synthesis. On the other hand, large-scale cross-modal pre-trained models (e.g., CLIP, ALIGN, and FILIP) are competent for…

Computer Vision and Pattern Recognition · Computer Science 2023-08-21 Runhui Huang, Jianhua Han, Guansong Lu, Xiaodan Liang, Yihan Zeng, Wei Zhang, Hang Xu

Advanced diffusion-based Text-to-Image (T2I) models, such as the Stable Diffusion Model, have made significant progress in generating diverse and high-quality images using text prompts alone. However, when non-famous users require…

Computer Vision and Pattern Recognition · Computer Science 2024-03-25 Yang Li, Songlin Yang, Wei Wang, Jing Dong

Recent text-to-image personalization methods have shown great promise in teaching a diffusion model user-specified concepts given a few images for reusing the acquired concepts in a novel context. With massive efforts being dedicated to…

Computer Vision and Pattern Recognition · Computer Science 2024-10-31 Zhengyang Yu, Zhaoyuan Yang, Jing Zhang

Pre-trained text-image discriminative models, such as CLIP, have been explored for open-vocabulary semantic segmentation with unsatisfactory results due to the loss of crucial localization information and awareness of object shapes.…

Computer Vision and Pattern Recognition · Computer Science 2024-01-23 Jinglong Wang, Xiawei Li, Jing Zhang, Qingyuan Xu, Qin Zhou, Qian Yu, Lu Sheng, Dong Xu

Recent progress in text-to-image (TTI) systems, such as StableDiffusion, Imagen, and DALL-E 2, has made it possible to create realistic images with simple text prompts. It is tempting to use these systems to eliminate the manual task of…

Computer Vision and Pattern Recognition · Computer Science 2023-11-02 David Marwood, Shumeet Baluja, Yair Alon

Taking advantage of the many recent advances in deep learning, text-to-image generative models currently have the merit of attracting the general public attention. Two of these models, DALL-E 2 and Imagen, have demonstrated that highly…

Computer Vision and Pattern Recognition · Computer Science 2022-09-23 Robin Zbinden

Large-scale graphs with node attributes are increasingly common in various real-world applications. Creating synthetic, attribute-rich graphs that mirror real-world examples is crucial, especially for sharing graph data for analysis and…

Machine Learning · Computer Science 2024-10-17 Mufei Li, Eleonora Kreačić, Vamsi K. Potluru, Pan Li

Modern diffusion models have set the state-of-the-art in AI image generation. Their success is due, in part, to training on Internet-scale data which often includes copyrighted work. This prompts questions about the extent to which these…

Computer Vision and Pattern Recognition · Computer Science 2023-07-11 Stephen Casper, Zifan Guo, Shreya Mogulothu, Zachary Marinov, Chinmay Deshpande, Rui-Jie Yew, Zheng Dai, Dylan Hadfield-Menell

Diffusion models excel at generating photo-realistic images but come with significant computational costs in both training and sampling. While various techniques address these computational challenges, a less-explored issue is designing an…

Computer Vision and Pattern Recognition · Computer Science 2025-09-16 Huangjie Zheng, Zhendong Wang, Jianbo Yuan, Guanghan Ning, Pengcheng He, Quanzeng You, Hongxia Yang, Mingyuan Zhou

Diffusion models have demonstrated exceptional capabilities in generating a broad spectrum of visual content, yet their proficiency in rendering text is still limited: they often generate inaccurate characters or words that fail to blend…

Computer Vision and Pattern Recognition · Computer Science 2024-12-03 Jianyi Zhang, Yufan Zhou, Jiuxiang Gu, Curtis Wigington, Tong Yu, Yiran Chen, Tong Sun, Ruiyi Zhang

Tokenizing images into compact visual representations is a key step in learning efficient and high-quality image generative models. We present a simple diffusion tokenizer (DiTo) that learns compact visual representations for image…

Computer Vision and Pattern Recognition · Computer Science 2025-01-31 Yinbo Chen, Rohit Girdhar, Xiaolong Wang, Sai Saketh Rambhatla, Ishan Misra

We introduce Diff-Tracker, a novel approach for the challenging unsupervised visual tracking task leveraging the pre-trained text-to-image diffusion model. Our main idea is to leverage the rich knowledge encapsulated within the pre-trained…

Computer Vision and Pattern Recognition · Computer Science 2024-07-17 Zhengbo Zhang, Li Xu, Duo Peng, Hossein Rahmani, Jun Liu

Recent advancements in image synthesis are fueled by the advent of large-scale diffusion models. Yet, integrating realistic object visualizations seamlessly into new or existing backgrounds without extensive training remains a challenge.…

Computer Vision and Pattern Recognition · Computer Science 2024-07-16 Phillip Mueller, Jannik Wiese, Ioan Craciun, Lars Mikelsons

Exquisite demand exists for customizing the pretrained large text-to-image model, e.g., Stable Diffusion, to generate innovative concepts, such as the users themselves. However, the newly-added concept from previous customization…

Computer Vision and Pattern Recognition · Computer Science 2023-06-02 Ge Yuan, Xiaodong Cun, Yong Zhang, Maomao Li, Chenyang Qi, Xintao Wang, Ying Shan, Huicheng Zheng

Generative models have enabled intuitive image creation and manipulation using natural language. In particular, diffusion models have recently shown remarkable results for natural image editing. In this work, we propose to apply diffusion…

Despite the rapid adoption of text-to-image (T2I) diffusion models, causal and representation-level analysis remains fragmented and largely limited to isolated probing techniques. To address this gap, we introduce DreamReader: a unified…
