Related papers: BinaryDM: Accurate Weight Binarization for Efficie…

BiDM: Pushing the Limit of Quantization for Diffusion Models

Diffusion models (DMs) have been significantly developed and widely used in various applications due to their excellent generative qualities. However, the expensive computation and massive parameters of DMs hinder their practical use in…

Computer Vision and Pattern Recognition · Computer Science 2024-12-10 Xingyu Zheng, Xianglong Liu, Yichen Bian, Xudong Ma, Yulun Zhang, Jiakai Wang, Jinyang Guo, Haotong Qin

Binarized Diffusion Model for Image Super-Resolution

Advanced diffusion models (DMs) perform impressively in image super-resolution (SR), but the high memory and computational costs hinder their deployment. Binarization, an ultra-compression algorithm, offers the potential for effectively…

Computer Vision and Pattern Recognition · Computer Science 2024-11-01 Zheng Chen, Haotong Qin, Yong Guo, Xiongfei Su, Xin Yuan, Linghe Kong, Yulun Zhang

EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Diffusion Models

Diffusion models have demonstrated remarkable capabilities in image synthesis and related generative tasks. Nevertheless, their practicality for real-world applications is constrained by substantial computational costs and latency issues.…

Computer Vision and Pattern Recognition · Computer Science 2024-04-16 Yefei He, Jing Liu, Weijia Wu, Hong Zhou, Bohan Zhuang

DB-LLM: Accurate Dual-Binarization for Efficient LLMs

Large language models (LLMs) have significantly advanced the field of natural language processing, while the expensive memory and computation consumption impede their practical deployment. Quantization emerges as one of the most effective…

Machine Learning · Computer Science 2024-02-20 Hong Chen, Chengtao Lv, Liang Ding, Haotong Qin, Xiabin Zhou, Yifu Ding, Xuebo Liu, Min Zhang, Jinyang Guo, Xianglong Liu, Dacheng Tao

MPQ-DM: Mixed Precision Quantization for Extremely Low Bit Diffusion Models

Diffusion models have received wide attention in generation tasks. However, the expensive computation cost prevents the application of diffusion models in resource-constrained scenarios. Quantization emerges as a practical solution that…

Computer Vision and Pattern Recognition · Computer Science 2024-12-17 Weilun Feng, Haotong Qin, Chuanguang Yang, Zhulin An, Libo Huang, Boyu Diao, Fei Wang, Renshuai Tao, Yongjun Xu, Michele Magno

Bimodal Distributed Binarized Neural Networks

Binary Neural Networks (BNNs) are an extremely promising method to reduce deep neural networks' complexity and power consumption massively. Binarization techniques, however, suffer from ineligible performance degradation compared to their…

Machine Learning · Computer Science 2022-04-06 Tal Rozen, Moshe Kimhi, Brian Chmiel, Avi Mendelson, Chaim Baskin

BiMaCoSR: Binary One-Step Diffusion Model Leveraging Flexible Matrix Compression for Real Super-Resolution

While super-resolution (SR) methods based on diffusion models (DM) have demonstrated inspiring performance, their deployment is impeded due to the heavy request of memory and computation. Recent researchers apply two kinds of methods to…

Computer Vision and Pattern Recognition · Computer Science 2025-02-05 Kai Liu, Kaicheng Yang, Zheng Chen, Zhiteng Li, Yong Guo, Wenbo Li, Linghe Kong, Yulun Zhang

Binary Diffusion Probabilistic Model

We propose the Binary Diffusion Probabilistic Model (BDPM), a generative framework specifically designed for data representations in binary form. Conventional denoising diffusion probabilistic models (DDPMs) assume continuous inputs, use…

Computer Vision and Pattern Recognition · Computer Science 2025-10-01 Vitaliy Kinakh, Slava Voloshynovskiy

Distribution-sensitive Information Retention for Accurate Binary Neural Network

Model binarization is an effective method of compressing neural networks and accelerating their inference process. However, a significant performance gap still exists between the 1-bit model and the 32-bit one. The empirical study shows…

Computer Vision and Pattern Recognition · Computer Science 2022-09-26 Haotong Qin, Xiangguo Zhang, Ruihao Gong, Yifu Ding, Yi Xu, Xianglong Liu

Hybrid Binary Networks: Optimizing for Accuracy, Efficiency and Memory

Binarization is an extreme network compression approach that provides large computational speedups along with energy and memory savings, albeit at significant accuracy costs. We investigate the question of where to binarize inputs at…

Computer Vision and Pattern Recognition · Computer Science 2018-04-12 Ameya Prabhu, Vishal Batchu, Rohit Gajawada, Sri Aurobindo Munagala, Anoop Namboodiri

Low-Bitwidth Floating Point Quantization for Efficient High-Quality Diffusion Models

Diffusion models are emerging models that generate images by iteratively denoising random Gaussian noise using deep neural networks. These models typically exhibit high computational and memory demands, necessitating effective post-training…

Computer Vision and Pattern Recognition · Computer Science 2024-08-14 Cheng Chen, Christina Giannoula, Andreas Moshovos

Diffusion Model Based Signal Recovery Under 1-Bit Quantization

Diffusion models (DMs) have demonstrated to be powerful priors for signal recovery, but their application to 1-bit quantization tasks, such as 1-bit compressed sensing and logistic regression, remains a challenge. This difficulty stems from…

Machine Learning · Computer Science 2026-01-13 Youming Chen, Zhaoqiang Liu

Addition is almost all you need: Compressing large language models with double binary factorization

Binary quantization approaches, which replace weight matrices with binary matrices and substitute costly multiplications with cheaper additions, offer a computationally efficient approach to address the increasing computational and storage…

Machine Learning · Computer Science 2026-03-03 Vladimír Boža, Vladimír Macko

Exact Backpropagation in Binary Weighted Networks with Group Weight Transformations

Quantization based model compression serves as high performing and fast approach for inference that yields models which are highly compressed when compared to their full-precision floating point counterparts. The most extreme quantization…

Machine Learning · Computer Science 2021-11-09 Yaniv Shulman

LBLLM: Lightweight Binarization of Large Language Models via Three-Stage Distillation

Deploying large language models (LLMs) in resource-constrained environments is hindered by heavy computational and memory requirements. We present LBLLM, a lightweight binarization framework that achieves effective W(1+1)A4 quantization…

Machine Learning · Computer Science 2026-04-22 Siqing Song, Chuang Wang, Yong Lang, Yi Yang, Xu-Yao Zhang

ARB-LLM: Alternating Refined Binarizations for Large Language Models

Large Language Models (LLMs) have greatly pushed forward advancements in natural language processing, yet their high memory and computational demands hinder practical deployment. Binarization, as an effective compression technique, can…

Computer Vision and Pattern Recognition · Computer Science 2026-02-02 Zhiteng Li, Xianglong Yan, Tianao Zhang, Haotong Qin, Dong Xie, Jiang Tian, Zhongchao Shi, Linghe Kong, Yulun Zhang, Xiaokang Yang

BiFSMNv2: Pushing Binary Neural Networks for Keyword Spotting to Real-Network Performance

Deep neural networks, such as the Deep-FSMN, have been widely studied for keyword spotting (KWS) applications while suffering expensive computation and storage. Therefore, network compression technologies like binarization are studied to…

Computation and Language · Computer Science 2023-02-07 Haotong Qin, Xudong Ma, Yifu Ding, Xiaoyang Li, Yang Zhang, Zejun Ma, Jiakai Wang, Jie Luo, Xianglong Liu

QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning

The practical deployment of diffusion models is still hindered by the high memory and computational overhead. Although quantization paves a way for model compression and acceleration, existing methods face challenges in achieving low-bit…

Computer Vision and Pattern Recognition · Computer Science 2025-07-16 Haoxuan Wang, Yuzhang Shang, Zhihang Yuan, Junyi Wu, Junchi Yan, Yan Yan

STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs

In this paper, we present the first structural binarization method for LLM compression to less than 1-bit precision. Although LLMs have achieved remarkable performance, their memory-bound nature during the inference stage hinders the…

Machine Learning · Computer Science 2024-10-10 Peijie Dong, Lujun Li, Yuedong Zhong, Dayou Du, Ruibo Fan, Yuhan Chen, Zhenheng Tang, Qiang Wang, Wei Xue, Yike Guo, Xiaowen Chu

Distribution-Aware Binarization of Neural Networks for Sketch Recognition

Deep neural networks are highly effective at a range of computational tasks. However, they tend to be computationally expensive, especially in vision-related problems, and also have large memory requirements. One of the most effective…

Computer Vision and Pattern Recognition · Computer Science 2018-04-10 Ameya Prabhu, Vishal Batchu, Sri Aurobindo Munagala, Rohit Gajawada, Anoop Namboodiri