Related papers: Compression Implies Generalization

Stronger generalization bounds for deep nets via a compression approach

Deep nets generalize well despite having more parameters than the number of training samples. Recent works try to give an explanation using PAC-Bayes and Margin-based analyses, but do not as yet result in sample complexity bounds better…

Machine Learning · Computer Science 2018-11-28 Sanjeev Arora, Rong Ge, Behnam Neyshabur, Yi Zhang

Understanding Generalization in Deep Learning via Tensor Methods

Deep neural networks generalize well on unseen data though the number of parameters often far exceeds the number of training examples. Recently proposed complexity measures have provided insights to understanding the generalizability in…

Machine Learning · Computer Science 2020-05-12 Jingling Li, Yanchao Sun, Jiahao Su, Taiji Suzuki, Furong Huang

Compression based bound for non-compressed network: unified generalization error analysis of large compressible deep neural network

One of the biggest issues in deep learning theory is the generalization ability of networks with huge model size. The classical learning theory suggests that overparameterized models cause overfitting. However, practically used large deep…

Machine Learning · Computer Science 2020-06-23 Taiji Suzuki, Hiroshi Abe, Tomoaki Nishimura

Non-Vacuous Generalization Bounds at the ImageNet Scale: A PAC-Bayesian Compression Approach

Modern neural networks are highly overparameterized, with capacity to substantially overfit to training data. Nevertheless, these networks often generalize well in practice. It has also been observed that trained networks can often be…

Machine Learning · Statistics 2019-02-26 Wenda Zhou, Victor Veitch, Morgane Austern, Ryan P. Adams, Peter Orbanz

PAC-Bayes Compression Bounds So Tight That They Can Explain Generalization

While there has been progress in developing non-vacuous generalization bounds for deep neural networks, these bounds tend to be uninformative about why deep learning works. In this paper, we develop a compression approach based on…

Machine Learning · Computer Science 2022-11-28 Sanae Lotfi, Marc Finzi, Sanyam Kapoor, Andres Potapczynski, Micah Goldblum, Andrew Gordon Wilson

An Improving Framework of regularization for Network Compression

Deep Neural Networks have achieved remarkable success relying on the developing high computation capability of GPUs and large-scale datasets with increasing network depth and width in image recognition, object detection and many other…

Machine Learning · Computer Science 2020-01-08 E Zhenqian, Gao Weiguo

Compression-aware Training of Deep Networks

In recent years, great progress has been made in a variety of application domains thanks to the development of increasingly deeper neural networks. Unfortunately, the huge number of units of these networks makes them expensive both…

Computer Vision and Pattern Recognition · Computer Science 2018-10-12 Jose M. Alvarez, Mathieu Salzmann

Towards Understanding Generalization via Decomposing Excess Risk Dynamics

Generalization is one of the fundamental issues in machine learning. However, traditional techniques like uniform convergence may be unable to explain generalization under overparameterization. As alternative approaches, techniques based on…

Machine Learning · Computer Science 2022-03-22 Jiaye Teng, Jianhao Ma, Yang Yuan

Model compression as constrained optimization, with application to neural nets. Part I: general framework

Compressing neural nets is an active research problem, given the large size of state-of-the-art nets for tasks such as object recognition, and the computational limits imposed by mobile devices. We give a general formulation of model…

Machine Learning · Computer Science 2017-07-06 Miguel Á. Carreira-Perpiñán

Data-Dependent Coresets for Compressing Neural Networks with Applications to Generalization Bounds

We present an efficient coresets-based neural network compression algorithm that sparsifies the parameters of a trained fully-connected neural network in a manner that provably approximates the network's output. Our approach is based on an…

Machine Learning · Computer Science 2019-05-21 Cenk Baykal, Lucas Liebenwein, Igor Gilitschenski, Dan Feldman, Daniela Rus

Generalizability of Neural Networks Minimizing Empirical Risk Based on Expressive Ability

The primary objective of learning methods is generalization. Classic uniform generalization bounds, which rely on VC-dimension or Rademacher complexity, fail to explain the significant attribute that over-parameterized models in deep…

Machine Learning · Computer Science 2025-03-07 Lijia Yu, Yibo Miao, Yifan Zhu, Xiao-Shan Gao, Lijun Zhang

Deterministic PAC-Bayesian generalization bounds for deep networks via generalizing noise-resilience

The ability of overparameterized deep networks to generalize well has been linked to the fact that stochastic gradient descent (SGD) finds solutions that lie in flat, wide minima in the training loss -- minima where the output of the…

Machine Learning · Computer Science 2019-06-03 Vaishnavh Nagarajan, J. Zico Kolter

An Introduction to Neural Data Compression

Neural compression is the application of neural networks and other machine learning methods to data compression. Recent advances in statistical machine learning have opened up new possibilities for data compression, allowing compression…

Machine Learning · Computer Science 2023-08-22 Yibo Yang, Stephan Mandt, Lucas Theis

Adaptive Estimators Show Information Compression in Deep Neural Networks

To improve how neural networks function it is crucial to understand their learning process. The information bottleneck theory of deep learning proposes that neural networks achieve good generalization by compressing their representations to…

Machine Learning · Computer Science 2023-04-03 Ivan Chelombiev, Conor Houghton, Cian O'Donnell

Generalization and Expressivity for Deep Nets

Along with the rapid development of deep learning in practice, the theoretical explanations for its success become urgent. Generalization and expressivity are two widely used measurements to quantify theoretical behaviors of deep learning.…

Machine Learning · Computer Science 2018-03-26 Shao-Bo Lin

Generalization bounds via distillation

This paper theoretically investigates the following empirical phenomenon: given a high-complexity network with poor generalization bounds, one can distill it into a network with nearly identical predictions but low complexity and vastly…

Machine Learning · Computer Science 2021-04-13 Daniel Hsu, Ziwei Ji, Matus Telgarsky, Lan Wang

Model Preserving Compression for Neural Networks

After training complex deep learning models, a common task is to compress the model to reduce compute and storage demands. When compressing, it is desirable to preserve the original model's per-example decisions (e.g., to go beyond top-1…

Machine Learning · Computer Science 2022-10-18 Jerry Chee, Megan Renz, Anil Damle, Christopher De Sa

Attribution Preservation in Network Compression for Reliable Network Interpretation

Neural networks embedded in safety-sensitive applications such as self-driving cars and wearable health monitors rely on two important techniques: input attribution for hindsight analysis and network compression to reduce its size for…

Machine Learning · Computer Science 2020-10-29 Geondo Park, June Yong Yang, Sung Ju Hwang, Eunho Yang

Neural Network Compression Via Sparse Optimization

The compression of deep neural networks (DNNs) to reduce inference cost becomes increasingly important to meet realistic deployment requirements of various applications. There have been a significant amount of work regarding network…

Machine Learning · Computer Science 2020-11-12 Tianyi Chen, Bo Ji, Yixin Shi, Tianyu Ding, Biyi Fang, Sheng Yi, Xiao Tu

A Theoretical Understanding of Neural Network Compression from Sparse Linear Approximation

The goal of model compression is to reduce the size of a large neural network while retaining a comparable performance. As a result, computation and memory costs in resource-limited applications may be significantly reduced by dropping…

Machine Learning · Statistics 2022-11-10 Wenjing Yang, Ganghua Wang, Jie Ding, Yuhong Yang