Related papers: Robust Model Compression Using Deep Hypotheses

Robustness in Compressed Neural Networks for Object Detection

Model compression techniques can significantly reduce the computational cost associated with data processing by deep neural networks with only a minor decrease in average accuracy. Simultaneously, reducing the model size may have a…

Machine Learning · Computer Science 2021-09-28 Sebastian Cygert, Andrzej Czyżewski

To Compress, or Not to Compress: Characterizing Deep Learning Model Compression for Embedded Inference

The recent advances in deep neural networks (DNNs) make them attractive for embedded systems. However, it can take a long time for DNNs to make an inference on resource-constrained computing devices. Model compression techniques can address…

Machine Learning · Computer Science 2018-10-23 Qing Qin, Jie Ren, Jialong Yu, Ling Gao, Hai Wang, Jie Zheng, Yansong Feng, Jianbin Fang, Zheng Wang

Mixed-Precision Embeddings for Large-Scale Recommendation Models

Embedding techniques have become essential components of large databases in the deep learning era. By encoding discrete entities, such as words, items, or graph nodes, into continuous vector spaces, embeddings facilitate more efficient…

Information Retrieval · Computer Science 2024-10-18 Shiwei Li, Zhuoqi Hu, Xing Tang, Haozhao Wang, Shijie Xu, Weihong Luo, Yuhua Li, Xiuqiang He, Ruixuan Li

Robust Compressed Sensing using Generative Models

The goal of compressed sensing is to estimate a high dimensional vector from an underdetermined system of noisy linear equations. In analogy to classical compressed sensing, here we assume a generative model as a prior, that is, we assume…

Machine Learning · Statistics 2021-06-24 Ajil Jalal, Liu Liu, Alexandros G. Dimakis, Constantine Caramanis

Model Preserving Compression for Neural Networks

After training complex deep learning models, a common task is to compress the model to reduce compute and storage demands. When compressing, it is desirable to preserve the original model's per-example decisions (e.g., to go beyond top-1…

Machine Learning · Computer Science 2022-10-18 Jerry Chee, Megan Renz, Anil Damle, Christopher De Sa

Efficient Compression of Overparameterized Deep Models through Low-Dimensional Learning Dynamics

Overparameterized models have proven to be powerful tools for solving various machine learning tasks. However, overparameterization often leads to a substantial increase in computational and memory costs, which in turn requires extensive…

Machine Learning · Computer Science 2024-03-13 Soo Min Kwon, Zekai Zhang, Dogyoon Song, Laura Balzano, Qing Qu

The Knowledge Within: Methods for Data-Free Model Compression

Recently, an extensive amount of research has been focused on compressing and accelerating Deep Neural Networks (DNN). So far, high compression rate algorithms require part of the training dataset for a low precision calibration, or a…

Machine Learning · Computer Science 2020-04-08 Matan Haroush, Itay Hubara, Elad Hoffer, Daniel Soudry

Model compression as constrained optimization, with application to neural nets. Part I: general framework

Compressing neural nets is an active research problem, given the large size of state-of-the-art nets for tasks such as object recognition, and the computational limits imposed by mobile devices. We give a general formulation of model…

Machine Learning · Computer Science 2017-07-06 Miguel Á. Carreira-Perpiñán

Spectral Pruning: Compressing Deep Neural Networks via Spectral Analysis and its Generalization Error

Compression techniques for deep neural network models are becoming very important for the efficient execution of high-performance deep learning systems on edge-computing devices. The concept of model compression is also important for…

Machine Learning · Statistics 2020-07-14 Taiji Suzuki, Hiroshi Abe, Tomoya Murata, Shingo Horiuchi, Kotaro Ito, Tokuma Wachi, So Hirai, Masatoshi Yukishima, Tomoaki Nishimura

To prune, or not to prune: exploring the efficacy of pruning for model compression

Model pruning seeks to induce sparsity in a deep neural network's various connection matrices, thereby reducing the number of nonzero-valued parameters in the model. Recent reports (Han et al., 2015; Narang et al., 2017) prune deep networks…

Machine Learning · Statistics 2017-11-15 Michael Zhu, Suyog Gupta

Approach to Finding a Robust Deep Learning Model

The rapid development of machine learning (ML) and artificial intelligence (AI) applications requires the training of large numbers of models. This growing demand highlights the importance of training models without human supervision, while…

Machine Learning · Computer Science 2025-05-26 Alexey Boldyrev, Fedor Ratnikov, Andrey Shevelev

Algorithmic Simplification of Neural Networks with Mosaic-of-Motifs

Large-scale deep learning models are well-suited for compression. Across a variety of tasks, methods like pruning, quantization, and knowledge distillation have been used to achieve massive reductions in model parameters with only marginal…

Machine Learning · Computer Science 2026-05-18 Pedram Bakhtiarifard, Tong Chen, Jonathan Wenshøj, Erik B Dam, Raghavendra Selvan

Structured Multi-Hashing for Model Compression

Despite the success of deep neural networks (DNNs), state-of-the-art models are too large to deploy on low-resource devices or common server configurations in which multiple models are held in memory. Model compression methods address this…

Machine Learning · Computer Science 2019-11-27 Elad Eban, Yair Movshovitz-Attias, Hao Wu, Mark Sandler, Andrew Poon, Yerlan Idelbayev, Miguel A. Carreira-Perpinan

Rate Distortion For Model Compression: From Theory To Practice

The enormous size of modern deep neural networks makes it challenging to deploy those models in memory and communication limited scenarios. Thus, compressing a trained model without a significant loss in performance has become an…

Information Theory · Computer Science 2019-01-25 Weihao Gao, Yu-Han Liu, Chong Wang, Sewoong Oh

A Benchmark Study of Neural Network Compression Methods for Hyperspectral Image Classification

Deep neural networks have achieved strong performance in image classification tasks due to their ability to learn complex patterns from high-dimensional data. However, their large computational and memory requirements often limit deployment…

Computer Vision and Pattern Recognition · Computer Science 2026-03-06 Sai Shi

Efficient Model Compression for Bayesian Neural Networks

Model Compression has drawn much attention within the deep learning community recently. Compressing a dense neural network offers many advantages including lower computation cost, deployability to devices of limited storage and memories,…

Machine Learning · Computer Science 2024-11-04 Diptarka Saha, Zihe Liu, Feng Liang

Data-Independent Structured Pruning of Neural Networks via Coresets

Model compression is crucial for deployment of neural networks on devices with limited computational and memory resources. Many different methods show comparable accuracy of the compressed model and similar compression rates. However, the…

Machine Learning · Computer Science 2020-08-21 Ben Mussay, Daniel Feldman, Samson Zhou, Vladimir Braverman, Margarita Osadchy

Input Resolution Downsizing as a Compression Technique for Vision Deep Learning Systems

Model compression is a critical area of research in deep learning, in particular in vision, driven by the need to lighten models' memory or computational footprints. While numerous methods for model compression have been proposed, most focus…

Machine Learning · Computer Science 2025-04-08 Jeremy Morlier, Mathieu Leonardon, Vincent Gripon

Model compression as constrained optimization, with application to neural nets. Part V: combining compressions

Model compression is generally performed by using quantization, low-rank approximation or pruning, for which various algorithms have been researched in recent years. One fundamental question is: what types of compression work better for a…

Machine Learning · Computer Science 2021-07-12 Miguel Á. Carreira-Perpiñán, Yerlan Idelbayev

Forget the Data and Fine-Tuning! Just Fold the Network to Compress

We introduce model folding, a novel data-free model compression technique that merges structurally similar neurons across layers, significantly reducing the model size without the need for fine-tuning or access to training data. Unlike…

Machine Learning · Computer Science 2025-08-13 Dong Wang, Haris Šikić, Lothar Thiele, Olga Saukh