Related papers: Squeezing Large-Scale Diffusion Models for Mobile

200 papers

The deployment of large-scale text-to-image diffusion models on mobile devices is impeded by their substantial model size and slow inference speed. In this paper, we propose MobileDiffusion, a highly efficient text-to-image…

Computer Vision and Pattern Recognition · Computer Science 2024-06-13 Yang Zhao , Yanwu Xu , Zhisheng Xiao , Haolin Jia , Tingbo Hou

Text-to-image diffusion models can create stunning images from natural language descriptions that rival the work of professional artists and photographers. However, these models are large, with complex network architectures and tens of…

Computer Vision and Pattern Recognition · Computer Science 2023-10-17 Yanyu Li , Huan Wang , Qing Jin , Ju Hu , Pavlo Chemerys , Yun Fu , Yanzhi Wang , Sergey Tulyakov , Jian Ren

The rapid development and application of foundation models have revolutionized the field of artificial intelligence. Large diffusion models have gained significant attention for their ability to generate photorealistic images and support…

Computer Vision and Pattern Recognition · Computer Science 2023-06-19 Yu-Hui Chen , Raman Sarokin , Juhyun Lee , Jiuqiang Tang , Chuo-Ling Chang , Andrei Kulik , Matthias Grundmann

GUI (graphical user interface) prototyping is a widely-used technique in requirements engineering for gathering and refining requirements, reducing development risks and increasing stakeholder engagement. However, GUI prototyping can be a…

Software Engineering · Computer Science 2023-10-05 Jialiang Wei , Anne-Lise Courbis , Thomas Lambolais , Binbin Xu , Pierre Louis Bernard , Gérard Dray

The intensive computational burden of Stable Diffusion (SD) for text-to-image generation poses a significant hurdle for its practical application. To tackle this challenge, recent research focuses on methods to reduce sampling steps, such…

We present Stable Video Diffusion - a latent video diffusion model for high-resolution, state-of-the-art text-to-video and image-to-video generation. Recently, latent diffusion models trained for 2D image synthesis have been turned into…

Computer Vision and Pattern Recognition · Computer Science 2023-11-28 Andreas Blattmann , Tim Dockhorn , Sumith Kulal , Daniel Mendelevitch , Maciej Kilian , Dominik Lorenz , Yam Levi , Zion English , Vikram Voleti , Adam Letts , Varun Jampani , Robin Rombach

Discrete diffusion models have emerged as a powerful generative modeling framework for discrete data with successful applications spanning from text generation to image synthesis. However, their deployment faces challenges due to the high…

Machine Learning · Computer Science 2025-12-01 Yinuo Ren , Haoxuan Chen , Yuchen Zhu , Wei Guo , Yongxin Chen , Grant M. Rotskoff , Molei Tao , Lexing Ying

The Diffusion Transformer (DiT) architecture is the state-of-the-art paradigm for high-fidelity image generation, underpinning models like Stable Diffusion-3 and FLUX.1. However, deploying these models on resource-constrained mobile devices…

Computer Vision and Pattern Recognition · Computer Science 2026-05-18 Kunpeng Du , Haizhen Xie , Sen Lu , Lei Yu , Binglei Bao , Huaao Tang , Chuntao Liu , Hao Wu , Yang Zhao , Zhicai Huang , Heyuan Gao , Zhijun Tu , Jie Hu , Xinghao Chen

Deploying large language models (LLMs) on mobile devices is an emerging trend to enable data privacy and offline accessibility of LLM applications. Modern mobile neural processing units (NPUs) make such deployment increasingly feasible.…

Operating Systems · Computer Science 2026-04-13 Yongsheng Yan , Jiacheng Shen , Xuchuan Luo , Yangfan Zhou

Video diffusion models have achieved impressive realism and controllability but are limited by high computational demands, restricting their use on mobile devices. This paper introduces the first mobile-optimized video diffusion model.…

Computer Vision and Pattern Recognition · Computer Science 2024-12-11 Haitam Ben Yahia , Denis Korzhenkov , Ioannis Lelekas , Amir Ghodrati , Amirhossein Habibian

Recently, there has been significant progress in the development of large models. Following the success of ChatGPT, numerous language models have been introduced, demonstrating remarkable performance. Similar advancements have also been…

Computer Vision and Pattern Recognition · Computer Science 2023-08-28 Tianyi Zhang , Zheng Wang , Jing Huang , Mohiuddin Muhammad Tasnim , Wei Shi

The Distribution Matching Distillation (DMD) has been successfully applied to text-to-image diffusion models such as Stable Diffusion (SD) 1.5. However, vanilla DMD suffers from convergence difficulties on large-scale flow-based…

Computer Vision and Pattern Recognition · Computer Science 2026-03-03 Xingtong Ge , Xin Zhang , Tongda Xu , Yi Zhang , Xinjie Zhang , Yan Wang , Jun Zhang

Stable Diffusion Models (SDMs) have shown remarkable proficiency in image synthesis. However, their broad application is impeded by their large model sizes and intensive computational requirements, which typically require expensive cloud…

Computer Vision and Pattern Recognition · Computer Science 2024-10-31 Chenqian Yan , Songwei Liu , Hongjian Liu , Xurui Peng , Xiaojian Wang , Fangmin Chen , Lean Fu , Xing Mei

Text-to-image generation via Stable Diffusion models (SDM) has demonstrated remarkable capabilities. However, its computational intensity, particularly in the iterative denoising process, hinders real-time deployment in latency-sensitive…

Computer Vision and Pattern Recognition · Computer Science 2025-05-08 Shuaiting Li , Juncan Deng , Zeyu Wang , Kedong Xu , Rongtao Deng , Hong Gu , Haibin Shen , Kejie Huang

Recent advances in diffusion transformers (DiTs) have set new standards in image generation, yet remain impractical for on-device deployment due to their high computational and memory costs. In this work, we present an efficient DiT…

Stable Diffusion is a popular Transformer-based model for image generation from text; it applies an image information creator to the input text and the visual knowledge is added in a step-by-step fashion to create an image that corresponds…

Image and Video Processing · Electrical Eng. & Systems 2024-04-02 Zhen Gao , Lini Yuan , Pedro Reviriego , Shanshan Liu , Fabrizio Lombardi

Data attribution for text-to-image models aims to identify the training images that most significantly influenced a generated output. Existing attribution methods involve considerable computational resources for each query, making them…

Computer Vision and Pattern Recognition · Computer Science 2025-11-17 Sheng-Yu Wang , Aaron Hertzmann , Alexei A Efros , Richard Zhang , Jun-Yan Zhu

Diffusion models represent a powerful family of generative models widely used for image and video generation. However, the time-consuming deployment, long inference time, and requirements on large memory hinder their applications on…

Machine Learning · Computer Science 2025-04-18 Kafeng Wang , Jianfei Chen , He Li , Zhenpeng Mi , Jun Zhu

Latent diffusion models have become the popular choice for scaling up diffusion models for high resolution image synthesis. Compared to pixel-space models that are trained end-to-end, latent models are perceived to be more efficient and to…

Computer Vision and Pattern Recognition · Computer Science 2025-03-25 Emiel Hoogeboom , Thomas Mensink , Jonathan Heek , Kay Lamerigts , Ruiqi Gao , Tim Salimans

Stable diffusion models have ushered in a new era of advancements in image generation, currently reigning as the state-of-the-art approach, exhibiting unparalleled performance. The process of diffusion, accompanied by denoising through…

Computer Vision and Pattern Recognition · Computer Science 2024-10-29 Andras Horvath