Related papers: Layer Folding: Neural Network Depth Reduction usin…

Regularising Deep Networks with Deep Generative Models

We develop a new method for regularising neural networks. We learn a probability distribution over the activations of all layers of the model and then insert imputed values into the network during training. We obtain a posterior for an…

Machine Learning · Computer Science 2019-10-14 Matthew Willetts, Alexander Camuto, Stephen Roberts, Chris Holmes

Making Models Shallow Again: Jointly Learning to Reduce Non-Linearity and Depth for Latency-Efficient Private Inference

Large number of ReLU and MAC operations of Deep neural networks make them ill-suited for latency and compute-efficient private inference. In this paper, we present a model optimization method that allows a model to learn to be shallow. In…

Machine Learning · Computer Science 2023-04-27 Souvik Kundu, Yuke Zhang, Dake Chen, Peter A. Beerel

LayerCollapse: Adaptive compression of neural networks

Handling the ever-increasing scale of contemporary deep learning and transformer-based models poses a significant challenge. Overparameterized Transformer networks outperform prior art in Natural Language processing and Computer Vision.…

Machine Learning · Computer Science 2024-11-05 Soheil Zibakhsh Shabgahi, Mohammad Sohail Shariff, Farinaz Koushanfar

Hidden Classification Layers: Enhancing linear separability between classes in neural networks layers

In the context of classification problems, Deep Learning (DL) approaches represent state of art. Many DL approaches are based on variations of standard multi-layer feed-forward neural networks. These are also referred to as deep networks.…

Machine Learning · Computer Science 2023-11-21 Andrea Apicella, Francesco Isgrò, Roberto Prevete

Greedy Layerwise Learning Can Scale to ImageNet

Shallow supervised 1-hidden layer neural networks have a number of favorable properties that make them easier to interpret, analyze, and optimize than their deep counterparts, but lack their representational power. Here we use 1-hidden…

Machine Learning · Computer Science 2019-04-24 Eugene Belilovsky, Michael Eickenberg, Edouard Oyallon

Training Thinner and Deeper Neural Networks: Jumpstart Regularization

Neural networks are more expressive when they have multiple layers. In turn, conventional training methods are only successful if the depth does not lead to numerical issues such as exploding or vanishing gradients, which occur less…

Machine Learning · Computer Science 2022-06-07 Carles Riera, Camilo Rey, Thiago Serra, Eloi Puertas, Oriol Pujol

SCAN: A Scalable Neural Networks Framework Towards Compact and Efficient Models

Remarkable achievements have been attained by deep neural networks in various applications. However, the increasing depth and width of such models also lead to explosive growth in both storage and computation, which has restricted the…

Machine Learning · Computer Science 2019-06-11 Linfeng Zhang, Zhanhong Tan, Jiebo Song, Jingwei Chen, Chenglong Bao, Kaisheng Ma

Deep Unfolding Network for Image Super-Resolution

Learning-based single image super-resolution (SISR) methods are continuously showing superior effectiveness and efficiency over traditional model-based methods, largely due to the end-to-end training. However, different from model-based…

Image and Video Processing · Electrical Eng. & Systems 2020-03-24 Kai Zhang, Luc Van Gool, Radu Timofte

Nonlinear Advantage: Trained Networks Might Not Be As Complex as You Think

We perform an empirical study of the behaviour of deep networks when fully linearizing some of its feature channels through a sparsity prior on the overall number of nonlinear units in the network. In experiments on image classification and…

Machine Learning · Computer Science 2023-06-02 Christian H. X. Ali Mehmeti-Göpel, Jan Disselhoff

Manifold Regularization for Memory-Efficient Training of Deep Neural Networks

One of the prevailing trends in the machine- and deep-learning community is to gravitate towards the use of increasingly larger models in order to keep pushing the state-of-the-art performance envelope. This tendency makes access to the…

Machine Learning · Computer Science 2023-05-29 Shadi Sartipi, Edgar A. Bernal

Increasing Depth of Neural Networks for Life-long Learning

Purpose: We propose a novel method for continual learning based on the increasing depth of neural networks. This work explores whether extending neural network depth may be beneficial in a life-long learning setting. Methods: We propose a…

Machine Learning · Computer Science 2023-05-09 Jędrzej Kozal, Michał Woźniak

Training Deep Morphological Neural Networks as Universal Approximators

We investigate deep morphological neural networks (DMNNs). We demonstrate that despite their inherent non-linearity, "linear" activations are essential for DMNNs. To preserve their inherent sparsity, we propose architectures that constraint…

Machine Learning · Computer Science 2025-12-24 Konstantinos Fotopoulos, Petros Maragos

On the Nonlinearity of Layer Normalization

Layer normalization (LN) is a ubiquitous technique in deep learning but our theoretical understanding to it remains elusive. This paper investigates a new theoretical direction for LN, regarding to its nonlinearity and representation…

Machine Learning · Computer Science 2024-06-04 Yunhao Ni, Yuxin Guo, Junlong Jia, Lei Huang

Learning Activation Functions to Improve Deep Neural Networks

Artificial neural networks typically have a fixed, non-linear activation function at each neuron. We have designed a novel form of piecewise linear activation function that is learned independently for each neuron using gradient descent.…

Neural and Evolutionary Computing · Computer Science 2015-04-22 Forest Agostinelli, Matthew Hoffman, Peter Sadowski, Pierre Baldi

Effective Training of Convolutional Neural Networks with Low-bitwidth Weights and Activations

This paper tackles the problem of training a deep convolutional neural network of both low-bitwidth weights and activations. Optimizing a low-precision network is very challenging due to the non-differentiability of the quantizer, which may…

Computer Vision and Pattern Recognition · Computer Science 2021-06-07 Bohan Zhuang, Jing Liu, Mingkui Tan, Lingqiao Liu, Ian Reid, Chunhua Shen

Efficient and Flexible Method for Reducing Moderate-size Deep Neural Networks with Condensation

Neural networks have been extensively applied to a variety of tasks, achieving astounding results. Applying neural networks in the scientific field is an important research direction that is gaining increasing attention. In scientific…

Machine Learning · Computer Science 2024-07-02 Tianyi Chen, Zhi-Qin John Xu

Deep Networks with Stochastic Depth

Very deep convolutional networks with hundreds of layers have led to significant reductions in error on competitive benchmarks. Although the unmatched expressiveness of the many layers can be highly desirable at test time, training very…

Machine Learning · Computer Science 2016-08-01 Gao Huang, Yu Sun, Zhuang Liu, Daniel Sedra, Kilian Weinberger

Non-deep Networks

Depth is the hallmark of deep neural networks. But more depth means more sequential computation and higher latency. This begs the question -- is it possible to build high-performing "non-deep" neural networks? We show that it is. To do so,…

Computer Vision and Pattern Recognition · Computer Science 2021-10-18 Ankit Goyal, Alexey Bochkovskiy, Jia Deng, Vladlen Koltun

From Deep to Shallow: Transformations of Deep Rectifier Networks

In this paper, we introduce transformations of deep rectifier networks, enabling the conversion of deep rectifier networks into shallow rectifier networks. We subsequently prove that any rectifier net of any depth can be represented by a…

Machine Learning · Computer Science 2017-03-31 Senjian An, Farid Boussaid, Mohammed Bennamoun, Jiankun Hu

Generalization of an Upper Bound on the Number of Nodes Needed to Achieve Linear Separability

An important issue in neural network research is how to choose the number of nodes and layers such as to solve a classification problem. We provide new intuitions based on earlier results by An et al. (2015) by deriving an upper bound on…

Machine Learning · Statistics 2018-02-13 Marjolein Troost, Katja Seeliger, Marcel van Gerven