Related papers: A Machine Learning Perspective on Predictive Codin…

APQ: Joint Search for Network Architecture, Pruning and Quantization Policy

We present APQ for efficient deep learning inference on resource-constrained hardware. Unlike previous methods that separately search the neural architecture, pruning policy, and quantization policy, we optimize them in a joint manner. To…

Machine Learning · Computer Science 2020-06-16 Tianzhe Wang, Kuan Wang, Han Cai, Ji Lin, Zhijian Liu, Song Han

An Introduction to Neural Data Compression

Neural compression is the application of neural networks and other machine learning methods to data compression. Recent advances in statistical machine learning have opened up new possibilities for data compression, allowing compression…

Machine Learning · Computer Science 2023-08-22 Yibo Yang, Stephan Mandt, Lucas Theis

A Comprehensive Survey of Compression Algorithms for Language Models

How can we compress language models without sacrificing accuracy? The number of compression algorithms for language models is rapidly growing to benefit from remarkable advances of recent language models without side effects due to the…

Computation and Language · Computer Science 2024-01-30 Seungcheol Park, Jaehyeon Choi, Sojin Lee, U Kang

TuPAQ: An Efficient Planner for Large-scale Predictive Analytic Queries

The proliferation of massive datasets combined with the development of sophisticated analytical techniques have enabled a wide variety of novel applications such as improved product recommendations, automatic image tagging, and improved…

Databases · Computer Science 2015-03-10 Evan R. Sparks, Ameet Talwalkar, Michael J. Franklin, Michael I. Jordan, Tim Kraska

COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training

FP8 training has emerged as a promising method for improving training efficiency. Existing frameworks accelerate training by applying FP8 computation to linear layers while leaving optimizer states and activations in higher precision, which…

Machine Learning · Computer Science 2025-02-14 Haocheng Xi, Han Cai, Ligeng Zhu, Yao Lu, Kurt Keutzer, Jianfei Chen, Song Han

Synchronizing Probabilities in Model-Driven Lossless Compression

It is well-known in the field of lossless data compression that probabilistic next-symbol prediction can be used to compress sequences of symbols. Deep neural networks are able to capture rich dependencies in data, offering a powerful means…

Information Theory · Computer Science 2026-03-10 Aviv Adler, Jennifer Tang

COMQ: A Backpropagation-Free Algorithm for Post-Training Quantization

Post-training quantization (PTQ) has emerged as a practical approach to compress large neural networks, making them highly efficient for deployment. However, effectively reducing these models to their low-bit counterparts without…

Machine Learning · Computer Science 2024-10-22 Aozhong Zhang, Zi Yang, Naigang Wang, Yingyong Qi, Jack Xin, Xin Li, Penghang Yin

Best Subset Selection: Statistical Computing Meets Quantum Computing

With the rapid development of quantum computers, quantum algorithms have been studied extensively. However, quantum algorithms tackling statistical problems are still lacking. In this paper, we propose a novel non-oracular quantum adaptive…

Methodology · Statistics 2021-07-20 Wenxuan Zhong, Yuan Ke, Ye Wang, Yongkai Chen, Jinyang Chen, Ping Ma

High-Ratio Compression for Machine-Generated Data

Machine-generated data is rapidly growing and poses challenges for data-intensive systems, especially as the growth of data outpaces the growth of storage space. To cope with the storage issue, compression plays a critical role in storage…

Databases · Computer Science 2023-11-27 Jiujing Zhang, Zhitao Shen, Shiyu Yang, Lingkai Meng, Chuan Xiao, Wei Jia, Yue Li, Qinhui Sun, Wenjie Zhang, Xuemin Lin

BAQ: Efficient Bit Allocation Quantization for Large Language Models

Post-training model quantization is a widely adopted technique for reducing the memory and computational costs of large language models (LLMs). However, most existing methods rely on uniform or heuristic bitwidth assignments, failing to…

Machine Learning · Computer Science 2025-06-09 Chao Zhang, Li Wang, Samson Lasaulce, Merouane Debbah

Model Compression Techniques in Biometrics Applications: A Survey

The development of deep learning algorithms has extensively empowered humanity's task automatization capacity. However, the huge improvement in the performance of these models is highly correlated with their increasing level of complexity,…

Computer Vision and Pattern Recognition · Computer Science 2024-01-19 Eduarda Caldeira, Pedro C. Neto, Marco Huber, Naser Damer, Ana F. Sequeira

Adaptive Dataset Quantization

Contemporary deep learning, characterized by the training of cumbersome neural networks on massive datasets, confronts substantial computational hurdles. To alleviate heavy data storage burdens on limited hardware resources, numerous…

Computer Vision and Pattern Recognition · Computer Science 2024-12-24 Muquan Li, Dongyang Zhang, Qiang Dong, Xiurui Xie, Ke Qin

Information-Theoretic Limits of Quantum Learning via Data Compression

Understanding the power of quantum data in machine learning is central to many proposed applications of quantum technologies. While access to quantum data can offer exponential advantages for carefully designed learning tasks and often…

Quantum Physics · Physics 2026-02-24 Armando Angrisani, Brian Coyle, Elham Kashefi

Vector-Quantized Autoregressive Predictive Coding

Autoregressive Predictive Coding (APC), as a self-supervised objective, has enjoyed success in learning representations from large amounts of unlabeled data, and the learned representations are rich for many downstream tasks. However, the…

Audio and Speech Processing · Electrical Eng. & Systems 2020-05-19 Yu-An Chung, Hao Tang, James Glass

PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them

Open-domain Question Answering models which directly leverage question-answer (QA) pairs, such as closed-book QA (CBQA) models and QA-pair retrievers, show promise in terms of speed and memory compared to conventional models which retrieve…

Computation and Language · Computer Science 2021-02-16 Patrick Lewis, Yuxiang Wu, Linqing Liu, Pasquale Minervini, Heinrich Küttler, Aleksandra Piktus, Pontus Stenetorp, Sebastian Riedel

Prediction and Reference Quality Adaptation for Learned Video Compression

Temporal prediction is one of the most important technologies for video compression. Various prediction coding modes are designed in traditional video codecs. Traditional video codecs adaptively decide the optimal coding mode…

Image and Video Processing · Electrical Eng. & Systems 2025-03-25 Xihua Sheng, Li Li, Dong Liu, Houqiang Li

Seq2Seq2Seq: Lossless Data Compression via Discrete Latent Transformers and Reinforcement Learning

Efficient lossless compression is essential for minimizing storage costs and transmission overhead while preserving data integrity. Traditional compression techniques, such as dictionary-based and statistical methods, often struggle to…

Artificial Intelligence · Computer Science 2026-02-13 Mahdi Khodabandeh, Ghazal Shabani, Arash Yousefi Jordehi, Seyed Abolghasem Mirroshandel

Pruning Deep Neural Networks from a Sparsity Perspective

In recent years, deep network pruning has attracted significant attention in order to enable the rapid deployment of AI into small devices with computation and memory constraints. Pruning is often achieved by dropping redundant weights,…

Machine Learning · Computer Science 2023-08-24 Enmao Diao, Ganghua Wang, Jiawei Zhan, Yuhong Yang, Jie Ding, Vahid Tarokh

PAXQA: Generating Cross-lingual Question Answering Examples at Training Scale

Existing question answering (QA) systems owe much of their success to large, high-quality training data. Such annotation efforts are costly, and the difficulty compounds in the cross-lingual setting. Therefore, prior cross-lingual QA work…

Computation and Language · Computer Science 2023-10-18 Bryan Li, Chris Callison-Burch

PQD: Post-training Quantization for Efficient Diffusion Models

Diffusion models (DMs) have demonstrated remarkable achievements in synthesizing images of high fidelity and diversity. However, the extensive computational requirements and slow generative speed of diffusion models have limited their widespread…

Computer Vision and Pattern Recognition · Computer Science 2025-01-03 Jiaojiao Ye, Zhen Wang, Linnan Jiang