
Related papers: SwiftDiffusion: Efficient Diffusion Model Serving …

200 papers

Recently, diffusion models have made remarkable progress in text-to-image (T2I) generation, synthesizing images with high fidelity and diverse contents. Despite this advancement, latent space smoothness within diffusion models remains…

Computer Vision and Pattern Recognition · Computer Science 2023-12-08 Jiayi Guo, Xingqian Xu, Yifan Pu, Zanlin Ni, Chaofei Wang, Manushree Vasu, Shiji Song, Gao Huang, Humphrey Shi

Diffusion models have emerged as the prevailing approach for text-to-image (T2I) and text-to-video (T2V) generation, yet production platforms must increasingly serve both modalities on shared GPU clusters while meeting stringent latency…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-04-10 Fanjiang Ye, Zhangke Li, Xinrui Zhong, Ethan Ma, Russell Chen, Kaijian Wang, Jingwei Zuo, Desen Sun, Ye Cao, Triston Cao, Myungjin Lee, Arvind Krishnamurthy, Yuke Wang

Artificial Intelligence-Generated Content (AIGC) has made significant strides, with high-resolution text-to-image (T2I) generation becoming increasingly critical for improving users' Quality of Experience (QoE). Although…

Computer Vision and Pattern Recognition · Computer Science 2026-01-22 Chongbin Yi, Yuxin Liang, Ziqi Zhou, Peng Yang

In this paper, we aim to enhance the performance of SwiftBrush, a prominent one-step text-to-image diffusion model, to be competitive with its multi-step Stable Diffusion counterpart. Initially, we explore the quality-diversity trade-off…

Computer Vision and Pattern Recognition · Computer Science 2024-08-28 Trung Dao, Thuan Hoang Nguyen, Thanh Le, Duc Vu, Khoi Nguyen, Cuong Pham, Anh Tran

As text-to-image (T2I) synthesis models increase in size, they demand higher inference costs due to the need for more expensive GPUs with larger memory, which makes it challenging to reproduce these models in addition to the restricted…

Computer Vision and Pattern Recognition · Computer Science 2024-11-26 Youngwan Lee, Kwanyong Park, Yoorhim Cho, Yong-Ju Lee, Sung Ju Hwang

Diffusion models have emerged as a dominant paradigm for generative modeling across a wide range of domains, including prompt-conditional generation. The vast majority of samplers, however, rely on forward discretization of the reverse…

Computer Vision and Pattern Recognition · Computer Science 2025-12-01 Zhenghan Fang, Jian Zheng, Qiaozi Gao, Xiaofeng Gao, Jeremias Sulam

Text-to-image (T2I) models are well known for their ability to produce highly realistic images, while multimodal large language models (MLLMs) are renowned for their proficiency in understanding and integrating multiple modalities. However,…

Computer Vision and Pattern Recognition · Computer Science 2025-08-12 Jian Ma, Qirong Peng, Xu Guo, Chen Chen, Haonan Lu, Zhenyu Yang

Text-to-image (T2I) generative diffusion models have demonstrated outstanding performance in synthesizing diverse, high-quality visuals from text captions. Several layout-to-image models have been developed to control the generation process…

Computer Vision and Pattern Recognition · Computer Science 2025-02-11 Ahmad Süleyman, Göksel Biricik

Large-scale diffusion models have achieved state-of-the-art results on text-to-image synthesis (T2I) tasks. Despite their ability to generate high-quality yet creative images, we observe that attribute-binding and compositional…

Computer Vision and Pattern Recognition · Computer Science 2023-03-02 Weixi Feng, Xuehai He, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Pradyumna Narayana, Sugato Basu, Xin Eric Wang, William Yang Wang

The Stable Diffusion Model (SDM) is a prevalent and effective model for text-to-image (T2I) and image-to-image (I2I) generation. Despite various attempts at sampler optimization, model distillation, and network quantization, these…

Computer Vision and Pattern Recognition · Computer Science 2024-06-18 Jinchao Zhu, Yuxuan Wang, Siyuan Pan, Pengfei Wan, Di Zhang, Gao Huang

Large diffusion-based Text-to-Image (T2I) models have shown impressive generative powers for text-to-image generation as well as spatially conditioned image generation. For most applications, we can train the model end-to-end with paired…

Computer Vision and Pattern Recognition · Computer Science 2024-04-16 Nithin Gopalakrishnan Nair, Jeya Maria Jose Valanarasu, Vishal M Patel

The most advanced text-to-image (T2I) models require significant training costs (e.g., millions of GPU hours), seriously hindering the fundamental innovation for the AIGC community while increasing CO2 emissions. This paper introduces…

Computer Vision and Pattern Recognition · Computer Science 2024-01-01 Junsong Chen, Jincheng Yu, Chongjian Ge, Lewei Yao, Enze Xie, Yue Wu, Zhongdao Wang, James Kwok, Ping Luo, Huchuan Lu, Zhenguo Li

The Diffusion Model (DM) has emerged as the SOTA approach for image synthesis. However, the existing DM cannot perform well on some image-to-image translation (I2I) tasks. Different from image synthesis, some I2I tasks, such as…

Computer Vision and Pattern Recognition · Computer Science 2023-08-29 Bin Xia, Yulun Zhang, Shiyin Wang, Yitong Wang, Xinglong Wu, Yapeng Tian, Wenming Yang, Radu Timofte, Luc Van Gool

The Stable Diffusion Model (SDM) is a popular and efficient model for text-to-image (T2I) and image-to-image (I2I) generation. Although there have been some attempts to reduce sampling steps, model distillation, and network…

Computer Vision and Pattern Recognition · Computer Science 2024-03-06 Jinchao Zhu, Yuxuan Wang, Xiaobing Tu, Siyuan Pan, Pengfei Wan, Gao Huang

In layout-to-image (L2I) synthesis, controlled complex scenes are generated from coarse information like bounding boxes. Such a task is exciting to many downstream applications because the input layouts offer strong guidance to the…

Computer Vision and Pattern Recognition · Computer Science 2025-03-18 Ruyu Wang, Xuefeng Hou, Sabrina Schmedding, Marco F. Huber

The Text-to-Image (T2I) diffusion model has emerged as one of the most widely adopted generative models. However, serving diffusion models at the granularity of entire images introduces significant challenges, particularly under…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-10-07 Desen Sun, Zepeng Zhao, Yuke Wang

With the advance of text-to-image (T2I) diffusion models (e.g., Stable Diffusion) and corresponding personalization techniques such as DreamBooth and LoRA, everyone can manifest their imagination into high-quality images at an affordable…

Computer Vision and Pattern Recognition · Computer Science 2024-02-09 Yuwei Guo, Ceyuan Yang, Anyi Rao, Zhengyang Liang, Yaohui Wang, Yu Qiao, Maneesh Agrawala, Dahua Lin, Bo Dai

Recently, large-scale text-to-image (T2I) diffusion models have emerged as a powerful tool for image-to-image translation (I2I), allowing open-domain image translation via user-provided text prompts. This paper proposes frequency-controlled…

Computer Vision and Pattern Recognition · Computer Science 2025-03-28 Xiang Gao, Zhengbo Xu, Junhan Zhao, Jiaying Liu

Text-to-Image (T2I) diffusion models have achieved remarkable success in image generation. Despite their progress, challenges remain in prompt-following ability, image quality, and the lack of high-quality datasets, which are essential for…

Computer Vision and Pattern Recognition · Computer Science 2025-01-03 Jingkun An, Yinghao Zhu, Zongjian Li, Enshen Zhou, Haoran Feng, Xijie Huang, Bohua Chen, Yemin Shi, Chengwei Pan

The diffusion model has provided a strong tool for implementing text-to-image (T2I) and image-to-image (I2I) generation. Recently, topology and texture control are popular explorations, e.g., ControlNet, IP-Adapter, Ctrl-X, and DSG. These…

Computer Vision and Pattern Recognition · Computer Science 2025-05-20 Jia Li, Nan Gao, Huaibo Huang, Ran He