Related papers: Random Bias Initialization Improves Quantized Trai…

A Sober Look at Neural Network Initializations

Initializing the weights and the biases is a key part of the training process of a neural network. Unlike the subsequent optimization phase, however, the initialization phase has gained only limited attention in the literature. In this…

Machine Learning · Computer Science 2019-09-06 Ingo Steinwart

Rotated Binary Neural Network

Binary Neural Network (BNN) shows its predominance in reducing the complexity of deep neural networks. However, it suffers severe performance degradation. One of the major impediments is the large quantization error between the…

Computer Vision and Pattern Recognition · Computer Science 2020-10-23 Mingbao Lin, Rongrong Ji, Zihan Xu, Baochang Zhang, Yan Wang, Yongjian Wu, Feiyue Huang, Chia-Wen Lin

Training Binary Neural Networks with Real-to-Binary Convolutions

This paper shows how to train binary networks to within a few percent points ($\sim 3-5 \%$) of the full precision counterpart. We first show how to build a strong baseline, which already achieves state-of-the-art accuracy, by combining…

Computer Vision and Pattern Recognition · Computer Science 2020-03-26 Brais Martinez, Jing Yang, Adrian Bulat, Georgios Tzimiropoulos

Training Competitive Binary Neural Networks from Scratch

Convolutional neural networks have achieved astonishing results in different application areas. Various methods that allow us to use these models on mobile and embedded devices have been proposed. Especially binary neural networks are a…

Machine Learning · Computer Science 2018-12-06 Joseph Bethge, Marvin Bornstein, Adrian Loy, Haojin Yang, Christoph Meinel

Regularized Binary Network Training

There is a significant performance gap between Binary Neural Networks (BNNs) and floating point Deep Neural Networks (DNNs). We propose to improve the binary training method, by introducing a new regularization function that encourages…

Machine Learning · Computer Science 2020-04-22 Sajad Darabi, Mouloud Belbahri, Matthieu Courbariaux, Vahid Partovi Nia

Network Binarization via Contrastive Learning

Neural network binarization accelerates deep models by quantizing their weights and activations into 1-bit. However, there is still a huge performance gap between Binary Neural Networks (BNNs) and their full-precision (FP) counterparts. As…

Computer Vision and Pattern Recognition · Computer Science 2022-07-19 Yuzhang Shang, Dan Xu, Ziliang Zong, Liqiang Nie, Yan Yan

Initialization of ReLUs for Dynamical Isometry

Deep learning relies on good initialization schemes and hyperparameter choices prior to training a neural network. Random weight initializations induce random network ensembles, which give rise to the trainability, training speed, and…

Machine Learning · Statistics 2019-10-25 Rebekka Burkholz, Alina Dubatovka

Binary Neural Networks: A Survey

The binary neural network, largely saving the storage and computation, serves as a promising technique for deploying deep models on resource-limited devices. However, the binarization inevitably causes severe information loss, and even…

Neural and Evolutionary Computing · Computer Science 2020-04-08 Haotong Qin, Ruihao Gong, Xianglong Liu, Xiao Bai, Jingkuan Song, Nicu Sebe

Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks

Recent works have cast some light on the mystery of why deep nets fit any data and generalize despite being very overparametrized. This paper analyzes training and generalization for a simple 2-layer ReLU net with random initialization, and…

Machine Learning · Computer Science 2019-05-28 Sanjeev Arora, Simon S. Du, Wei Hu, Zhiyuan Li, Ruosong Wang

A Deep Conditioning Treatment of Neural Networks

We study the role of depth in training randomly initialized overparameterized neural networks. We give a general result showing that depth improves trainability of neural networks by improving the conditioning of certain kernel matrices of…

Machine Learning · Computer Science 2021-02-18 Naman Agarwal, Pranjal Awasthi, Satyen Kale

Is Feature Diversity Necessary in Neural Network Initialization?

Standard practice in training neural networks involves initializing the weights in an independent fashion. The results of recent work suggest that feature "diversity" at initialization plays an important role in training the network.…

Machine Learning · Computer Science 2020-07-06 Yaniv Blumenfeld, Dar Gilboa, Daniel Soudry

BinaryDuo: Reducing Gradient Mismatch in Binary Activation Network by Coupling Binary Activations

Binary Neural Networks (BNNs) have been garnering interest thanks to their compute cost reduction and memory savings. However, BNNs suffer from performance degradation mainly due to the gradient mismatch caused by binarizing activations.…

Machine Learning · Computer Science 2020-02-18 Hyungjun Kim, Kyungsu Kim, Jinseok Kim, Jae-Joon Kim

Bias Correction for Regularized Regression and its Application in Learning with Streaming Data

We propose an approach to reduce the bias of ridge regression and regularization kernel network. When applied to a single data set the new algorithms have comparable learning performance with the original ones. When applied to incremental…

Machine Learning · Statistics 2016-03-17 Qiang Wu

Where Do Large Learning Rates Lead Us?

It is generally accepted that starting neural networks training with large learning rates (LRs) improves generalization. Following a line of research devoted to understanding this effect, we conduct an empirical study in a controlled…

Machine Learning · Computer Science 2024-10-30 Ildus Sadrtdinov, Maxim Kodryan, Eduard Pokonechny, Ekaterina Lobacheva, Dmitry Vetrov

Effects of Initialization Biases on Deep Neural Network Training Dynamics

Untrained large neural networks, just after random initialization, tend to favour a small subset of classes, assigning high predicted probabilities to these few classes and approximately zero probability to all others. This bias, termed…

Machine Learning · Computer Science 2025-11-27 Nicholas Pellegrino, David Szczecina, Paul W. Fieguth

ReBNet: Residual Binarized Neural Network

This paper proposes ReBNet, an end-to-end framework for training reconfigurable binary neural networks on software and developing efficient accelerators for execution on FPGA. Binary neural networks offer an intriguing opportunity for…

Machine Learning · Computer Science 2018-03-29 Mohammad Ghasemzadeh, Mohammad Samragh, Farinaz Koushanfar

Back to Simplicity: How to Train Accurate BNNs from Scratch?

Binary Neural Networks (BNNs) show promising progress in reducing computational and memory costs but suffer from substantial accuracy degradation compared to their real-valued counterparts on large-scale datasets, e.g., ImageNet. Previous…

Machine Learning · Computer Science 2019-06-21 Joseph Bethge, Haojin Yang, Marvin Bornstein, Christoph Meinel

Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity

We develop a general duality between neural networks and compositional kernels, striving towards a better understanding of deep learning. We show that initial representations generated by common random initializations are sufficiently rich…

Machine Learning · Computer Science 2017-05-23 Amit Daniely, Roy Frostig, Yoram Singer

Faster Predictive Coding Networks via Better Initialization

Research aimed at scaling up neuroscience inspired learning algorithms for neural networks is accelerating. Recently, a key research area has been the study of energy-based learning algorithms such as predictive coding, due to their…

Machine Learning · Computer Science 2026-01-30 Luca Pinchetti, Simon Frieder, Thomas Lukasiewicz, Tommaso Salvatori

Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis

A common method in training neural networks is to initialize all the weights to be independent Gaussian vectors. We observe that by instead initializing the weights into independent pairs, where each pair consists of two identical Gaussian…

Machine Learning · Computer Science 2022-06-28 Alexander Munteanu, Simon Omlor, Zhao Song, David P. Woodruff