Related papers: Batch-normalized Maxout Network in Network

Network In Network

We propose a novel deep network structure called "Network In Network" (NIN) to enhance model discriminability for local patches within the receptive field. The conventional convolutional layer uses linear filters followed by a nonlinear…

Neural and Evolutionary Computing · Computer Science 2014-03-05 Min Lin, Qiang Chen, Shuicheng Yan

Towards Training Without Depth Limits: Batch Normalization Without Gradient Explosion

Normalization layers are one of the key building blocks for deep neural networks. Several theoretical studies have shown that batch normalization improves signal propagation by preventing the representations from becoming collinear…

Machine Learning · Computer Science 2023-10-04 Alexandru Meterez, Amir Joudaki, Francesco Orabona, Alexander Immer, Gunnar Rätsch, Hadi Daneshmand

MaxDropout: Deep Neural Network Regularization Based on Maximum Output Values

Different techniques have emerged in the deep learning scenario, such as Convolutional Neural Networks, Deep Belief Networks, and Long Short-Term Memory Networks, to cite a few. In lockstep, regularization methods, which aim to prevent…

Machine Learning · Computer Science 2020-07-28 Claudio Filipi Goncalves do Santos, Danilo Colombo, Mateus Roder, João Paulo Papa

Min-Max-Plus Neural Networks

We present a new model of neural networks called Min-Max-Plus Neural Networks (MMP-NNs) based on operations in tropical arithmetic. In general, an MMP-NN is composed of three types of alternately stacked layers, namely linear layers,…

Neural and Evolutionary Computing · Computer Science 2021-02-15 Ye Luo, Shiqing Fan

Improving Deep Neural Networks with Probabilistic Maxout Units

We present a probabilistic variant of the recently introduced maxout unit. The success of deep neural networks utilizing maxout can partly be attributed to favorable performance under dropout, when compared to rectified linear units. It…

Machine Learning · Statistics 2014-02-20 Jost Tobias Springenberg, Martin Riedmiller

Maxout Networks

We consider the problem of designing models to leverage a recently introduced approximate model averaging technique called dropout. We define a simple new model called maxout (so named because its output is the max of a set of inputs, and…

Machine Learning · Statistics 2013-09-23 Ian J. Goodfellow, David Warde-Farley, Mehdi Mirza, Aaron Courville, Yoshua Bengio

On Bridging the Gap between Mean Field and Finite Width in Deep Random Neural Networks with Batch Normalization

Mean field theory is widely used in the theoretical studies of neural networks. In this paper, we analyze the role of depth in the concentration of mean-field predictions, specifically for deep multilayer perceptron (MLP) with batch…

Machine Learning · Computer Science 2023-02-22 Amir Joudaki, Hadi Daneshmand, Francis Bach

Manifold Regularized Deep Neural Networks using Adversarial Examples

Learning meaningful representations using deep neural networks involves designing efficient training schemes and well-structured networks. Currently, the method of stochastic gradient descent that has a momentum with dropout is one of the…

Machine Learning · Computer Science 2016-01-15 Taehoon Lee, Minsuk Choi, Sungroh Yoon

Normalizing the Normalizers: Comparing and Extending Network Normalization Schemes

Normalization techniques have only recently begun to be exploited in supervised learning tasks. Batch normalization exploits mini-batch statistics to normalize the activations. This was shown to speed up training and result in better…

Machine Learning · Computer Science 2017-03-08 Mengye Ren, Renjie Liao, Raquel Urtasun, Fabian H. Sinz, Richard S. Zemel

Recombinator Networks: Learning Coarse-to-Fine Feature Aggregation

Deep neural networks with alternating convolutional, max-pooling and decimation layers are widely used in state of the art architectures for computer vision. Max-pooling purposefully discards precise spatial information in order to create…

Computer Vision and Pattern Recognition · Computer Science 2016-04-19 Sina Honari, Jason Yosinski, Pascal Vincent, Christopher Pal

Batch Layer Normalization, A new normalization layer for CNNs and RNN

This study introduces a new normalization layer termed Batch Layer Normalization (BLN) to reduce the problem of internal covariate shift in deep neural network layers. As a combined version of batch and layer normalization, BLN adaptively…

Machine Learning · Computer Science 2023-01-16 Amir Ziaee, Erion Çano

Functional Network: A Novel Framework for Interpretability of Deep Neural Networks

The layered structure of deep neural networks hinders the use of numerous analysis tools and thus the development of its interpretability. Inspired by the success of functional brain networks, we propose a novel framework for…

Machine Learning · Computer Science 2022-05-25 Ben Zhang, Zhetong Dong, Junsong Zhang, Hongwei Lin

Maxmin convolutional neural networks for image classification

Convolutional neural networks (CNN) are widely used in computer vision, especially in image classification. However, the way in which information and invariance properties are encoded in deep CNN architectures is still an open…

Computer Vision and Pattern Recognition · Computer Science 2016-10-26 Michael Blot, Matthieu Cord, Nicolas Thome

Layer Normalization

Training state-of-the-art, deep neural networks is computationally expensive. One way to reduce the training time is to normalize the activities of the neurons. A recently introduced technique called batch normalization uses the…

Machine Learning · Statistics 2016-07-22 Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton

An Optimization and Generalization Analysis for Max-Pooling Networks

Max-Pooling operations are a core component of deep learning architectures. In particular, they are part of most convolutional architectures used in machine vision, since pooling is a natural approach to pattern detection problems. However,…

Machine Learning · Computer Science 2021-03-05 Alon Brutzkus, Amir Globerson

Removing the Feature Correlation Effect of Multiplicative Noise

Multiplicative noise, including dropout, is widely used to regularize deep neural networks (DNNs), and is shown to be effective in a wide range of architectures and tasks. From an information perspective, we consider injecting…

Machine Learning · Computer Science 2018-09-20 Zijun Zhang, Yining Zhang, Zongpeng Li

PowerNorm: Rethinking Batch Normalization in Transformers

The standard normalization method for neural network (NN) models used in Natural Language Processing (NLP) is layer normalization (LN). This is different than batch normalization (BN), which is widely-adopted in Computer Vision. The…

Computation and Language · Computer Science 2021-04-21 Sheng Shen, Zhewei Yao, Amir Gholami, Michael W. Mahoney, Kurt Keutzer

FAMLP: A Frequency-Aware MLP-Like Architecture For Domain Generalization

MLP-like models built entirely upon multi-layer perceptrons have recently been revisited, exhibiting performance comparable to transformers. It is one of the most promising architectures due to the excellent trade-off between network…

Computer Vision and Pattern Recognition · Computer Science 2022-03-25 Kecheng Zheng, Yang Cao, Kai Zhu, Ruijing Zhao, Zheng-Jun Zha

MAXIM: Multi-Axis MLP for Image Processing

Recent progress on Transformers and multi-layer perceptron (MLP) models provides new network architectural designs for computer vision tasks. Although these models proved to be effective in many vision tasks such as image recognition, there…

Image and Video Processing · Electrical Eng. & Systems 2022-04-05 Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, Peyman Milanfar, Alan Bovik, Yinxiao Li

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates…

Machine Learning · Computer Science 2015-03-03 Sergey Ioffe, Christian Szegedy