Related papers: Squeezing Large-Scale Diffusion Models for Mobile

MobileDiffusion: Instant Text-to-Image Generation on Mobile Devices

The deployment of large-scale text-to-image diffusion models on mobile devices is impeded by their substantial model size and slow inference speed. In this paper, we propose MobileDiffusion, a highly efficient text-to-image…

Computer Vision and Pattern Recognition · Computer Science 2024-06-13 Yang Zhao, Yanwu Xu, Zhisheng Xiao, Haolin Jia, Tingbo Hou

SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds

Text-to-image diffusion models can create stunning images from natural language descriptions that rival the work of professional artists and photographers. However, these models are large, with complex network architectures and tens of…

Computer Vision and Pattern Recognition · Computer Science 2023-10-17 Yanyu Li, Huan Wang, Qing Jin, Ju Hu, Pavlo Chemerys, Yun Fu, Yanzhi Wang, Sergey Tulyakov, Jian Ren

Speed Is All You Need: On-Device Acceleration of Large Diffusion Models via GPU-Aware Optimizations

The rapid development and application of foundation models have revolutionized the field of artificial intelligence. Large diffusion models have gained significant attention for their ability to generate photorealistic images and support…

Computer Vision and Pattern Recognition · Computer Science 2023-06-19 Yu-Hui Chen, Raman Sarokin, Juhyun Lee, Jiuqiang Tang, Chuo-Ling Chang, Andrei Kulik, Matthias Grundmann

Boosting GUI Prototyping with Diffusion Models

GUI (graphical user interface) prototyping is a widely-used technique in requirements engineering for gathering and refining requirements, reducing development risks and increasing stakeholder engagement. However, GUI prototyping can be a…

Software Engineering · Computer Science 2023-10-05 Jialiang Wei, Anne-Lise Courbis, Thomas Lambolais, Binbin Xu, Pierre Louis Bernard, Gérard Dray

EdgeFusion: On-Device Text-to-Image Generation

The intensive computational burden of Stable Diffusion (SD) for text-to-image generation poses a significant hurdle for its practical application. To tackle this challenge, recent research focuses on methods to reduce sampling steps, such…

Machine Learning · Computer Science 2024-04-19 Thibault Castells, Hyoung-Kyu Song, Tairen Piao, Shinkook Choi, Bo-Kyeong Kim, Hanyoung Yim, Changgwun Lee, Jae Gon Kim, Tae-Ho Kim

Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets

We present Stable Video Diffusion - a latent video diffusion model for high-resolution, state-of-the-art text-to-video and image-to-video generation. Recently, latent diffusion models trained for 2D image synthesis have been turned into…

Computer Vision and Pattern Recognition · Computer Science 2023-11-28 Andreas Blattmann, Tim Dockhorn, Sumith Kulal, Daniel Mendelevitch, Maciej Kilian, Dominik Lorenz, Yam Levi, Zion English, Vikram Voleti, Adam Letts, Varun Jampani, Robin Rombach

Fast Solvers for Discrete Diffusion Models: Theory and Applications of High-Order Algorithms

Discrete diffusion models have emerged as a powerful generative modeling framework for discrete data with successful applications spanning from text generation to image synthesis. However, their deployment faces challenges due to the high…

Machine Learning · Computer Science 2025-12-01 Yinuo Ren, Haoxuan Chen, Yuchen Zhu, Wei Guo, Yongxin Chen, Grant M. Rotskoff, Molei Tao, Lexing Ying

ElasticDiT: Efficient Diffusion Transformers via Elastic Architecture and Sparse Attention for High-Resolution Image Generation on Mobile Devices

The Diffusion Transformer (DiT) architecture is the state-of-the-art paradigm for high-fidelity image generation, underpinning models like Stable Diffusion-3 and FLUX.1. However, deploying these models on resource-constrained mobile devices…

Computer Vision and Pattern Recognition · Computer Science 2026-05-18 Kunpeng Du, Haizhen Xie, Sen Lu, Lei Yu, Binglei Bao, Huaao Tang, Chuntao Liu, Hao Wu, Yang Zhao, Zhicai Huang, Heyuan Gao, Zhijun Tu, Jie Hu, Xinghao Chen

EdgeFlow: Fast Cold Starts for LLMs on Mobile Devices

Deploying large language models (LLMs) on mobile devices is an emerging trend to enable data privacy and offline accessibility of LLM applications. Modern mobile neural processing units (NPUs) make such deployment increasingly feasible.…

Operating Systems · Computer Science 2026-04-13 Yongsheng Yan, Jiacheng Shen, Xuchuan Luo, Yangfan Zhou

Mobile Video Diffusion

Video diffusion models have achieved impressive realism and controllability but are limited by high computational demands, restricting their use on mobile devices. This paper introduces the first mobile-optimized video diffusion model.…

Computer Vision and Pattern Recognition · Computer Science 2024-12-11 Haitam Ben Yahia, Denis Korzhenkov, Ioannis Lelekas, Amir Ghodrati, Amirhossein Habibian

A Survey of Diffusion Based Image Generation Models: Issues and Their Solutions

Recently, there has been significant progress in the development of large models. Following the success of ChatGPT, numerous language models have been introduced, demonstrating remarkable performance. Similar advancements have also been…

Computer Vision and Pattern Recognition · Computer Science 2023-08-28 Tianyi Zhang, Zheng Wang, Jing Huang, Mohiuddin Muhammad Tasnim, Wei Shi

SenseFlow: Scaling Distribution Matching for Flow-based Text-to-Image Distillation

The Distribution Matching Distillation (DMD) has been successfully applied to text-to-image diffusion models such as Stable Diffusion (SD) 1.5. However, vanilla DMD suffers from convergence difficulties on large-scale flow-based…

Computer Vision and Pattern Recognition · Computer Science 2026-03-03 Xingtong Ge, Xin Zhang, Tongda Xu, Yi Zhang, Xinjie Zhang, Yan Wang, Jun Zhang

Hybrid SD: Edge-Cloud Collaborative Inference for Stable Diffusion Models

Stable Diffusion Models (SDMs) have shown remarkable proficiency in image synthesis. However, their broad application is impeded by their large model sizes and intensive computational requirements, which typically require expensive cloud…

Computer Vision and Pattern Recognition · Computer Science 2024-10-31 Chenqian Yan, Songwei Liu, Hongjian Liu, Xurui Peng, Xiaojian Wang, Fangmin Chen, Lean Fu, Xing Mei

Efficiency Meets Fidelity: A Novel Quantization Framework for Stable Diffusion

Text-to-image generation via Stable Diffusion models (SDM) has demonstrated remarkable capabilities. However, their computational intensity, particularly in the iterative denoising process, hinders real-time deployment in latency-sensitive…

Computer Vision and Pattern Recognition · Computer Science 2025-05-08 Shuaiting Li, Juncan Deng, Zeyu Wang, Kedong Xu, Rongtao Deng, Hong Gu, Haibin Shen, Kejie Huang

SnapGen++: Unleashing Diffusion Transformers for Efficient High-Fidelity Image Generation on Edge Devices

Recent advances in diffusion transformers (DiTs) have set new standards in image generation, yet remain impractical for on-device deployment due to their high computational and memory costs. In this work, we present an efficient DiT…

Computer Vision and Pattern Recognition · Computer Science 2026-02-12 Dongting Hu, Aarush Gupta, Magzhan Gabidolla, Arpit Sahni, Huseyin Coskun, Yanyu Li, Yerlan Idelbayev, Ahsan Mahmood, Aleksei Lebedev, Dishani Lahiri, Anujraaj Goyal, Ju Hu, Mingming Gong, Sergey Tulyakov, Anil Kag

Dependability Evaluation of Stable Diffusion with Soft Errors on the Model Parameters

Stable Diffusion is a popular Transformer-based model for image generation from text; it applies an image information creator to the input text and the visual knowledge is added in a step-by-step fashion to create an image that corresponds…

Image and Video Processing · Electrical Eng. & Systems 2024-04-02 Zhen Gao, Lini Yuan, Pedro Reviriego, Shanshan Liu, Fabrizio Lombardi

Fast Data Attribution for Text-to-Image Models

Data attribution for text-to-image models aims to identify the training images that most significantly influenced a generated output. Existing attribution methods involve considerable computational resources for each query, making them…

Computer Vision and Pattern Recognition · Computer Science 2025-11-17 Sheng-Yu Wang, Aaron Hertzmann, Alexei A Efros, Richard Zhang, Jun-Yan Zhu

SparseDM: Toward Sparse Efficient Diffusion Models

Diffusion models represent a powerful family of generative models widely used for image and video generation. However, the time-consuming deployment, long inference time, and requirements on large memory hinder their applications on…

Machine Learning · Computer Science 2025-04-18 Kafeng Wang, Jianfei Chen, He Li, Zhenpeng Mi, Jun Zhu

Simpler Diffusion (SiD2): 1.5 FID on ImageNet512 with pixel-space diffusion

Latent diffusion models have become the popular choice for scaling up diffusion models for high resolution image synthesis. Compared to pixel-space models that are trained end-to-end, latent models are perceived to be more efficient and to…

Computer Vision and Pattern Recognition · Computer Science 2025-03-25 Emiel Hoogeboom, Thomas Mensink, Jonathan Heek, Kay Lamerigts, Ruiqi Gao, Tim Salimans

Stable Diffusion with Continuous-time Neural Network

Stable diffusion models have ushered in a new era of advancements in image generation, currently reigning as the state-of-the-art approach, exhibiting unparalleled performance. The process of diffusion, accompanied by denoising through…

Computer Vision and Pattern Recognition · Computer Science 2024-10-29 Andras Horvath