Related papers: More Than Bits: Multi-Envelope Double Binary Facto…

Addition is almost all you need: Compressing large language models with double binary factorization

Binary quantization approaches, which replace weight matrices with binary matrices and substitute costly multiplications with cheaper additions, offer a computationally efficient approach to address the increasing computational and storage…

Machine Learning · Computer Science 2026-03-03 Vladimír Boža, Vladimír Macko

Fast And Efficient Boolean Matrix Factorization By Geometric Segmentation

Boolean matrix has been used to represent digital information in many fields, including bank transaction, crime records, natural language processing, protein-protein interaction, etc. Boolean matrix factorization (BMF) aims to find an…

Machine Learning · Computer Science 2020-02-12 Changlin Wan, Wennan Chang, Tong Zhao, Mengya Li, Sha Cao, Chi Zhang

DB-LLM: Accurate Dual-Binarization for Efficient LLMs

Large language models (LLMs) have significantly advanced the field of natural language processing, while the expensive memory and computation consumption impede their practical deployment. Quantization emerges as one of the most effective…

Machine Learning · Computer Science 2024-02-20 Hong Chen, Chengtao Lv, Liang Ding, Haotong Qin, Xiabin Zhou, Yifu Ding, Xuebo Liu, Min Zhang, Jinyang Guo, Xianglong Liu, Dacheng Tao

Discrete Factorization Machines for Fast Feature-based Recommendation

User and item features of side information are crucial for accurate recommendation. However, the large number of feature dimensions, e.g., usually larger than 10^7, results in expensive storage and computational cost. This prohibits fast…

Information Retrieval · Computer Science 2018-09-20 Han Liu, Xiangnan He, Fuli Feng, Liqiang Nie, Rui Liu, Hanwang Zhang

Memory-Efficient Factorization Machines via Binarizing both Data and Model Coefficients

Factorization Machines (FM), a general predictor that can efficiently model feature interactions in linear time, was primarily proposed for collaborative recommendation and have been broadly used for regression, classification and ranking…

Machine Learning · Computer Science 2021-08-18 Yu Geng, Liang Lan

Asymmetric Multiresolution Matrix Factorization

Multiresolution Matrix Factorization (MMF) was recently introduced as an alternative to the dominant low-rank paradigm in order to capture structure in matrices at multiple different scales. Using ideas from multiresolution analysis (MRA),…

Numerical Analysis · Mathematics 2019-10-14 Pramod Kaushik Mudrakarta, Shubhendu Trivedi, Risi Kondor

Efficient Layered New Bit-Flipping QC-MDPC Decoder for BIKE Post-Quantum Cryptography

The medium-density parity-check (MDPC) code-based Bit Flipping Key Encapsulation (BIKE) mechanism remains a candidate of post-quantum cryptography standardization. The latest version utilizes a new bit-flipping (BF) decoding algorithm,…

Cryptography and Security · Computer Science 2024-12-17 Jiaxuan Cai, Xinmiao Zhang

LittleBit: Ultra Low-Bit Quantization via Latent Factorization

The deployment of large language models (LLMs) is frequently hindered by prohibitive memory and computational requirements. While quantization mitigates these bottlenecks, maintaining model fidelity in the sub-1-bit regime remains a…

Machine Learning · Computer Science 2026-02-06 Banseok Lee, Dongkyu Kim, Youngcheon You, Youngmin Kim

An Improved WBF Algorithm for Higher-Speed Decoding of LDPC Codes

Due to the speed limitation of the conventional bit-chosen strategy in the existing weighted bit flipping algorithms, a high-speed LDPC decoder cannot be realized. To solve this problem, we propose a fast weighted bit flipping (FWBF)…

Information Theory · Computer Science 2012-06-18 Kexiang Ma, Yongzhao Li, Caizhi Zhu, Hailin Zhang, Feng Qi

LBLLM: Lightweight Binarization of Large Language Models via Three-Stage Distillation

Deploying large language models (LLMs) in resource-constrained environments is hindered by heavy computational and memory requirements. We present LBLLM, a lightweight binarization framework that achieves effective W(1+1)A4 quantization…

Machine Learning · Computer Science 2026-04-22 Siqing Song, Chuang Wang, Yong Lang, Yi Yang, Xu-Yao Zhang

Learning Multiresolution Matrix Factorization and its Wavelet Networks on Graphs

Multiresolution Matrix Factorization (MMF) is unusual amongst fast matrix factorization algorithms in that it does not make a low rank assumption. This makes MMF especially well suited to modeling certain types of graphs with complex…

Machine Learning · Computer Science 2021-11-04 Truong Son Hy, Risi Kondor

Binary Matrix Factorisation and Completion via Integer Programming

Binary matrix factorisation is an essential tool for identifying discrete patterns in binary data. In this paper we consider the rank-k binary matrix factorisation problem (k-BMF) under Boolean arithmetic: we are given an n x m binary…

Optimization and Control · Mathematics 2021-08-05 Reka A. Kovacs, Oktay Gunluk, Raphael A. Hauser

Collective Bit Flipping-Based Decoding of Quantum LDPC Codes

Quantum low-density parity-check (QLDPC) codes have been proven to achieve higher minimum distances at higher code rates than surface codes. However, this family of codes imposes stringent latency requirements and poor performance under…

Information Theory · Computer Science 2024-06-26 Dimitris Chytas, Nithin Raveendran, Bane Vasić

BinaryDM: Accurate Weight Binarization for Efficient Diffusion Models

With the advancement of diffusion models (DMs) and the substantially increased computational requirements, quantization emerges as a practical solution to obtain compact and efficient low-bit DMs. However, the highly discrete representation…

Computer Vision and Pattern Recognition · Computer Science 2025-02-03 Xingyu Zheng, Xianglong Liu, Haotong Qin, Xudong Ma, Mingyuan Zhang, Haojie Hao, Jiakai Wang, Zixiang Zhao, Jinyang Guo, Michele Magno

Calibration and Transformation-Free Weight-Only LLMs Quantization via Dynamic Grouping

Large Language Models (LLMs) deliver strong performance but are difficult to deploy under tight memory and compute constraints. Low-bit post-training quantization (PTQ) is a promising direction; however, it typically relies on calibration…

Machine Learning · Computer Science 2026-02-09 Xinzhe Zheng, Zhen-Qun Yang, Zishan Liu, Haoran Xie, S. Joe Qin, Arlene Chen, Fangzhen Lin

Learning Based Hybrid Beamforming for Millimeter Wave Multi-User MIMO Systems

Hybrid beamforming (HBF) design is a crucial stage in millimeter wave (mmWave) multi-user multi-input multi-output (MU-MIMO) systems. However, conventional HBF methods are still with high complexity and strongly rely on the quality of…

Signal Processing · Electrical Eng. & Systems 2020-04-28 Shaocheng Huang, Yu Ye, Ming Xiao

Gradual Binary Search and Dimension Expansion: A general method for activation quantization in LLMs

Large language models (LLMs) have become pivotal in artificial intelligence, demonstrating strong capabilities in reasoning, understanding, and generating data. However, their deployment on edge devices is hindered by their substantial…

Machine Learning · Computer Science 2025-05-14 Lucas Maisonnave, Cyril Moineau, Olivier Bichler, Fabrice Rastello

Two-Bit Bit Flipping Decoding of LDPC Codes

In this paper, we propose a new class of bit flipping algorithms for low-density parity-check (LDPC) codes over the binary symmetric channel (BSC). Compared to the regular (parallel or serial) bit flipping algorithms, the proposed…

Information Theory · Computer Science 2016-11-17 Dung Viet Nguyen, Bane Vasic, Michael W. Marcellin

MX+: Pushing the Limits of Microscaling Formats for Efficient Large Language Model Serving

Reduced-precision data formats are crucial for cost-effective serving of large language models (LLMs). While numerous reduced-precision formats have been introduced thus far, they often require intrusive modifications to the software…

Machine Learning · Computer Science 2025-10-17 Jungi Lee, Junyong Park, Soohyun Cha, Jaehoon Cho, Jaewoong Sim

SFMP: Fine-Grained, Hardware-Friendly and Search-Free Mixed-Precision Quantization for Large Language Models

Mixed-precision quantization is a promising approach for compressing large language models under tight memory budgets. However, existing mixed-precision methods typically suffer from one of two limitations: they either rely on expensive…

Machine Learning · Computer Science 2026-02-03 Xin Nie, Haicheng Zhang, Liang Dong, Beining Feng, Jinhong Weng, Guiling Sun