English
Related papers

Related papers: Robust Model Compression Using Deep Hypotheses

200 papers

Model compression techniques allow to significantly reduce the computational cost associated with data processing by deep neural networks with only a minor decrease in average accuracy. Simultaneously, reducing the model size may have a…

Machine Learning · Computer Science 2021-09-28 Sebastian Cygert , Andrzej Czyżewski

The recent advances in deep neural networks (DNNs) make them attractive for embedded systems. However, it can take a long time for DNNs to make an inference on resource-constrained computing devices. Model compression techniques can address…

Machine Learning · Computer Science 2018-10-23 Qing Qin , Jie Ren , Jialong Yu , Ling Gao , Hai Wang , Jie Zheng , Yansong Feng , Jianbin Fang , Zheng Wang

Embedding techniques have become essential components of large databases in the deep learning era. By encoding discrete entities, such as words, items, or graph nodes, into continuous vector spaces, embeddings facilitate more efficient…

Information Retrieval · Computer Science 2024-10-18 Shiwei Li , Zhuoqi Hu , Xing Tang , Haozhao Wang , Shijie Xu , Weihong Luo , Yuhua Li , Xiuqiang He , Ruixuan Li

The goal of compressed sensing is to estimate a high dimensional vector from an underdetermined system of noisy linear equations. In analogy to classical compressed sensing, here we assume a generative model as a prior, that is, we assume…

Machine Learning · Statistics 2021-06-24 Ajil Jalal , Liu Liu , Alexandros G. Dimakis , Constantine Caramanis

After training complex deep learning models, a common task is to compress the model to reduce compute and storage demands. When compressing, it is desirable to preserve the original model's per-example decisions (e.g., to go beyond top-1…

Machine Learning · Computer Science 2022-10-18 Jerry Chee , Megan Renz , Anil Damle , Christopher De Sa

Overparameterized models have proven to be powerful tools for solving various machine learning tasks. However, overparameterization often leads to a substantial increase in computational and memory costs, which in turn requires extensive…

Machine Learning · Computer Science 2024-03-13 Soo Min Kwon , Zekai Zhang , Dogyoon Song , Laura Balzano , Qing Qu

Recently, an extensive amount of research has been focused on compressing and accelerating Deep Neural Networks (DNN). So far, high compression rate algorithms require part of the training dataset for a low precision calibration, or a…

Machine Learning · Computer Science 2020-04-08 Matan Haroush , Itay Hubara , Elad Hoffer , Daniel Soudry

Compressing neural nets is an active research problem, given the large size of state-of-the-art nets for tasks such as object recognition, and the computational limits imposed by mobile devices. We give a general formulation of model…

Machine Learning · Computer Science 2017-07-06 Miguel Á. Carreira-Perpiñán

Compression techniques for deep neural network models are becoming very important for the efficient execution of high-performance deep learning systems on edge-computing devices. The concept of model compression is also important for…

Model pruning seeks to induce sparsity in a deep neural network's various connection matrices, thereby reducing the number of nonzero-valued parameters in the model. Recent reports (Han et al., 2015; Narang et al., 2017) prune deep networks…

Machine Learning · Statistics 2017-11-15 Michael Zhu , Suyog Gupta

The rapid development of machine learning (ML) and artificial intelligence (AI) applications requires the training of large numbers of models. This growing demand highlights the importance of training models without human supervision, while…

Machine Learning · Computer Science 2025-05-26 Alexey Boldyrev , Fedor Ratnikov , Andrey Shevelev

Large-scale deep learning models are well-suited for compression. Across a variety of tasks, methods like pruning, quantization, and knowledge distillation have been used to achieve massive reductions in model parameters with only marginal…

Machine Learning · Computer Science 2026-05-18 Pedram Bakhtiarifard , Tong Chen , Jonathan Wenshøj , Erik B Dam , Raghavendra Selvan

Despite the success of deep neural networks (DNNs), state-of-the-art models are too large to deploy on low-resource devices or common server configurations in which multiple models are held in memory. Model compression methods address this…

The enormous size of modern deep neural networks makes it challenging to deploy those models in memory and communication limited scenarios. Thus, compressing a trained model without a significant loss in performance has become an…

Information Theory · Computer Science 2019-01-25 Weihao Gao , Yu-Han Liu , Chong Wang , Sewoong Oh

Deep neural networks have achieved strong performance in image classification tasks due to their ability to learn complex patterns from high-dimensional data. However, their large computational and memory requirements often limit deployment…

Computer Vision and Pattern Recognition · Computer Science 2026-03-06 Sai Shi

Model Compression has drawn much attention within the deep learning community recently. Compressing a dense neural network offers many advantages including lower computation cost, deployability to devices of limited storage and memories,…

Machine Learning · Computer Science 2024-11-04 Diptarka Saha , Zihe Liu , Feng Liang

Model compression is crucial for deployment of neural networks on devices with limited computational and memory resources. Many different methods show comparable accuracy of the compressed model and similar compression rates. However, the…

Machine Learning · Computer Science 2020-08-21 Ben Mussay , Daniel Feldman , Samson Zhou , Vladimir Braverman , Margarita Osadchy

Model compression is a critical area of research in deep learning, in particular in vision, driven by the need to lighten models memory or computational footprints. While numerous methods for model compression have been proposed, most focus…

Machine Learning · Computer Science 2025-04-08 Jeremy Morlier , Mathieu Leonardon , Vincent Gripon

Model compression is generally performed by using quantization, low-rank approximation or pruning, for which various algorithms have been researched in recent years. One fundamental question is: what types of compression work better for a…

Machine Learning · Computer Science 2021-07-12 Miguel Á. Carreira-Perpiñán , Yerlan Idelbayev

We introduce model folding, a novel data-free model compression technique that merges structurally similar neurons across layers, significantly reducing the model size without the need for fine-tuning or access to training data. Unlike…

Machine Learning · Computer Science 2025-08-13 Dong Wang , Haris Šikić , Lothar Thiele , Olga Saukh
‹ Prev 1 2 3 10 Next ›