Related papers: DiffuseKronA: A Parameter Efficient Fine-tuning Me…

200 papers

While diffusion model fine-tuning offers a powerful approach for customizing pre-trained models to generate specific objects, it frequently suffers from overfitting when training samples are limited, compromising both generalization…

Computer Vision and Pattern Recognition · Computer Science · 2026-01-26 · Vera Soboleva, Aibek Alanov, Andrey Kuznetsov, Konstantin Sobolev

Personalized text-to-image generation has gained significant attention for its capability to generate high-fidelity portraits of specific identities conditioned on user-defined prompts. Existing methods typically involve test-time…

Computer Vision and Pattern Recognition · Computer Science · 2024-11-18 · Yujia Wu, Yiming Shi, Jiwei Wei, Chengwei Sun, Yang Yang, Heng Tao Shen

Personalized diffusion models have shown remarkable success in Text-to-Image (T2I) generation by enabling the injection of user-defined concepts into diverse contexts. However, balancing concept fidelity with contextual alignment remains a…

Computer Vision and Pattern Recognition · Computer Science · 2025-05-28 · Shamil Ayupov, Maksim Nakhodnov, Anastasia Yaschenko, Andrey Kuznetsov, Aibek Alanov

Diffusion models have revolutionized text-to-image (T2I) synthesis, producing high-quality, photorealistic images. However, they still struggle to properly render the spatial relationships described in text prompts. To address the lack of…

Computer Vision and Pattern Recognition · Computer Science · 2026-01-21 · Andrea Rigo, Luca Stornaiuolo, Mauro Martino, Bruno Lepri, Nicu Sebe

Personalizing a large-scale pretrained Text-to-Image (T2I) diffusion model is challenging as it typically struggles to make an appropriate trade-off between its training data distribution and the target distribution, i.e., learning a novel…

Computer Vision and Pattern Recognition · Computer Science · 2024-06-11 · Shangyu Chen, Zizheng Pan, Jianfei Cai, Dinh Phung

With the advance of text-to-image (T2I) diffusion models (e.g., Stable Diffusion) and corresponding personalization techniques such as DreamBooth and LoRA, everyone can manifest their imagination into high-quality images at an affordable…

Computer Vision and Pattern Recognition · Computer Science · 2024-02-09 · Yuwei Guo, Ceyuan Yang, Anyi Rao, Zhengyang Liang, Yaohui Wang, Yu Qiao, Maneesh Agrawala, Dahua Lin, Bo Dai

Large-scale Text-to-Image (T2I) diffusion models have revolutionized image generation over the last few years. Although owning diverse and high-quality generation capabilities, translating these abilities to fine-grained image editing…

Computer Vision and Pattern Recognition · Computer Science · 2024-02-06 · Chong Mou, Xintao Wang, Jiechong Song, Ying Shan, Jian Zhang

Large text-to-image models achieved a remarkable leap in the evolution of AI, enabling high-quality and diverse synthesis of images from a given text prompt. However, these models lack the ability to mimic the appearance of subjects in a…

Computer Vision and Pattern Recognition · Computer Science · 2023-03-16 · Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Yael Pritch, Michael Rubinstein, Kfir Aberman

The objective of personalization and stylization in text-to-image is to instruct a pre-trained diffusion model to analyze new concepts introduced by users and incorporate them into expected styles. Recently, parameter-efficient fine-tuning…

Computer Vision and Pattern Recognition · Computer Science · 2024-03-13 · Likun Li, Haoqi Zeng, Changpeng Yang, Haozhe Jia, Di Xu

We introduce ProLoRA, enabling zero-shot adaptation of parameter-efficient fine-tuning in text-to-image diffusion models. ProLoRA transfers pre-trained low-rank adjustments (e.g., LoRA) from a source to a target model without additional…

Artificial Intelligence · Computer Science · 2025-06-06 · Farzad Farhadzadeh, Debasmit Das, Shubhankar Borse, Fatih Porikli

Personalizing text-to-image diffusion models has traditionally relied on subject-specific fine-tuning approaches such as DreamBooth (Ruiz et al., 2023), which are computationally expensive and slow at inference. Recent adapter- and…

Computer Vision and Pattern Recognition · Computer Science · 2025-11-06 · Sagar Shrestha, Gopal Sharma, Luowei Zhou, Suren Kumar

Despite recent significant strides achieved by diffusion-based Text-to-Image (T2I) models, current systems are still less capable of ensuring decent compositional generation aligned with text prompts, particularly for the multi-object…

Computer Vision and Pattern Recognition · Computer Science · 2024-02-01 · Zhipeng Bao, Yijun Li, Krishna Kumar Singh, Yu-Xiong Wang, Martial Hebert

Low-rank adaptation (LoRA) is a fine-tuning technique that can be applied to conditional generative diffusion models. LoRA utilizes a small number of context examples to adapt the model to a specific domain, character, style, or concept.…

Computer Vision and Pattern Recognition · Computer Science · 2024-10-08 · Artur Kasymov, Marcin Sendera, Michał Stypułkowski, Maciej Zięba, Przemysław Spurek

Text-to-image (T2I) diffusion models, when fine-tuned on a few personal images, can generate visuals with a high degree of consistency. However, such fine-tuned models are not robust; they often fail to compose with concepts of pretrained…

Computer Vision and Pattern Recognition · Computer Science · 2024-12-13 · Kyungmin Lee, Sangkyung Kwak, Kihyuk Sohn, Jinwoo Shin

The popularization of Text-to-Image (T2I) diffusion models enables the generation of high-quality images from text descriptions. However, generating diverse customized images with reference visual attributes remains challenging. This work…

Computer Vision and Pattern Recognition · Computer Science · 2025-04-22 · Brian Nlong Zhao, Yuhang Xiao, Jiashu Xu, Xinyang Jiang, Yifan Yang, Dongsheng Li, Laurent Itti, Vibhav Vineet, Yunhao Ge

Text-to-image (T2I) generation using diffusion models has become a blockbuster service in today's AI cloud. A production T2I service typically involves a serving workflow where a base diffusion model is augmented with various "add-on"…

Distributed, Parallel, and Cluster Computing · Computer Science · 2024-12-09 · Suyi Li, Lingyun Yang, Xiaoxiao Jiang, Hanfeng Lu, Dakai An, Zhipeng Di, Weiyi Lu, Jiawei Chen, Kan Liu, Yinghao Yu, Tao Lan, Guodong Yang, Lin Qu, Liping Zhang, Wei Wang

Diffusion models have proven to be highly effective in generating high-quality images. However, adapting large pre-trained diffusion models to new domains remains an open challenge, which is critical for real-world applications. This paper…

Computer Vision and Pattern Recognition · Computer Science · 2023-07-28 · Enze Xie, Lewei Yao, Han Shi, Zhili Liu, Daquan Zhou, Zhaoqiang Liu, Jiawei Li, Zhenguo Li

Recent personalization methods for diffusion models, such as DreamBooth and LoRA, allow fine-tuning pre-trained models to generate new concepts. However, applying these techniques across consecutive tasks in order to include, e.g., new…

Machine Learning · Computer Science · 2025-03-14 · Łukasz Staniszewski, Katarzyna Zaleska, Kamil Deja

In recent years, the development of diffusion models has led to significant progress in image and video generation tasks, with pre-trained models like the Stable Diffusion series playing a crucial role. Inspired by model pruning which…

Computer Vision and Pattern Recognition · Computer Science · 2025-04-03 · Teng Hu, Jiangning Zhang, Ran Yi, Hongrui Huang, Yabiao Wang, Lizhuang Ma

While Low-Rank Adaptation (LoRA) has proven beneficial for efficiently fine-tuning large models, LoRA fine-tuned text-to-image diffusion models lack diversity in the generated images, as the model tends to copy data from the observed…
