Related papers: Convolution Aware Initialization

Involution: Inverting the Inherence of Convolution for Visual Recognition

Convolution has been the core ingredient of modern neural networks, triggering the surge of deep learning in vision. In this work, we rethink the inherent principles of standard convolution for vision tasks, specifically spatial-agnostic…

Computer Vision and Pattern Recognition · Computer Science · 2021-04-13 · Duo Li, Jie Hu, Changhu Wang, Xiangtai Li, Qi She, Lei Zhu, Tong Zhang, Qifeng Chen

Initialization and Regularization of Factorized Neural Layers

Factorized layers--operations parameterized by products of two or more matrices--occur in a variety of deep learning contexts, including compressed model training, certain types of knowledge distillation, and multi-head self-attention…

Machine Learning · Statistics · 2022-10-07 · Mikhail Khodak, Neil Tenenholtz, Lester Mackey, Nicolò Fusi

Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks

The selection of initial parameter values for gradient-based optimization of deep neural networks is one of the most impactful hyperparameter choices in deep learning systems, affecting both convergence times and model performance. Yet…

Machine Learning · Computer Science · 2020-01-17 · Wei Hu, Lechao Xiao, Jeffrey Pennington

Data-dependent Initializations of Convolutional Neural Networks

Convolutional Neural Networks spread through computer vision like a wildfire, impacting almost all visual tasks imaginable. Despite this, few researchers dare to train their models from scratch. Most work builds on one of a handful of…

Computer Vision and Pattern Recognition · Computer Science · 2016-09-26 · Philipp Krähenbühl, Carl Doersch, Jeff Donahue, Trevor Darrell

Convolutional Initialization for Data-Efficient Vision Transformers

Training vision transformer networks on small datasets poses challenges. In contrast, convolutional neural networks (CNNs) can achieve state-of-the-art performance by leveraging their architectural inductive bias. In this paper, we…

Computer Vision and Pattern Recognition · Computer Science · 2024-01-24 · Jianqiao Zheng, Xueqian Li, Simon Lucey

Dense-Sparse Deep Convolutional Neural Networks Training for Image Denoising

Recently, deep learning methods such as the convolutional neural networks have gained prominence in the area of image denoising. This is owing to their proven ability to surpass state-of-the-art classical image denoising algorithms such as…

Image and Video Processing · Electrical Eng. & Systems · 2024-09-02 · Basit O. Alawode, Mudassir Masood

Deeply Shared Filter Bases for Parameter-Efficient Convolutional Neural Networks

Modern convolutional neural networks (CNNs) have massive identical convolution blocks, and, hence, recursive sharing of parameters across these blocks has been proposed to reduce the amount of parameters. However, naive sharing of…

Computer Vision and Pattern Recognition · Computer Science · 2021-11-23 · Woochul Kang, Daeyeon Kim

Depth-Aware Initialization for Stable and Efficient Neural Network Training

In the past few years, various initialization schemes have been proposed. These schemes are Glorot initialization, He initialization, initialization using orthogonal matrix, random walk method for initialization. Some of these methods stress on…

Machine Learning · Computer Science · 2025-09-08 · Vijay Pandey

Prior-Informed Neural Network Initialization: A Spectral Approach for Function Parameterizing Architectures

Neural network architectures designed for function parameterization, such as the Bag-of-Functions (BoF) framework, bridge the gap between the expressivity of deep learning and the interpretability of classical signal processing. However,…

Machine Learning · Computer Science · 2026-03-18 · David Orlando Salazar Torres, Diyar Altinses, Andreas Schwung

Deep Networks from the Principle of Rate Reduction

This work attempts to interpret modern deep (convolutional) networks from the principles of rate reduction and (shift) invariant classification. We show that the basic iterative gradient ascent scheme for optimizing the rate reduction of…

Machine Learning · Computer Science · 2020-10-30 · Kwan Ho Ryan Chan, Yaodong Yu, Chong You, Haozhi Qi, John Wright, Yi Ma

Variance-Aware Weight Initialization for Point Convolutional Neural Networks

Appropriate weight initialization has been of key importance to successfully train neural networks. Recently, batch normalization has diminished the role of weight initialization by simply normalizing each layer based on batch statistics.…

Computer Vision and Pattern Recognition · Computer Science · 2022-08-03 · Pedro Hermosilla, Michael Schelling, Tobias Ritschel, Timo Ropinski

The Impact of Reinitialization on Generalization in Convolutional Neural Networks

Recent results suggest that reinitializing a subset of the parameters of a neural network during training can improve generalization, particularly for small training sets. We study the impact of different reinitialization methods in several…

Machine Learning · Computer Science · 2021-09-02 · Ibrahim Alabdulmohsin, Hartmut Maennel, Daniel Keysers

Convolutional Normalization: Improving Deep Convolutional Network Robustness and Training

Normalization techniques have become a basic component in modern convolutional neural networks (ConvNets). In particular, many recent works demonstrate that promoting the orthogonality of the weights helps train deep models and improve…

Computer Vision and Pattern Recognition · Computer Science · 2022-01-05 · Sheng Liu, Xiao Li, Yuexiang Zhai, Chong You, Zhihui Zhu, Carlos Fernandez-Granda, Qing Qu

Orthogonal Convolutional Neural Networks

Deep convolutional neural networks are hindered by training instability and feature redundancy towards further performance improvement. A promising solution is to impose orthogonality on convolutional filters. We develop an efficient…

Computer Vision and Pattern Recognition · Computer Science · 2020-04-09 · Jiayun Wang, Yubei Chen, Rudrasis Chakraborty, Stella X. Yu

Sinusoidal Initialization, Time for a New Start

Initialization plays a critical role in Deep Neural Network training, directly influencing convergence, stability, and generalization. Common approaches such as Glorot and He initializations rely on randomness, which can produce uneven…

Machine Learning · Computer Science · 2025-12-11 · Alberto Fernández-Hernández, Jose I. Mestre, Manuel F. Dolz, Jose Duato, Enrique S. Quintana-Ortí

MSE-Optimal Neural Network Initialization via Layer Fusion

Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks. However, the use of stochastic gradient descent combined with the nonconvexity of the underlying optimization problems renders…

Machine Learning · Computer Science · 2020-01-29 · Ramina Ghods, Andrew S. Lan, Tom Goldstein, Christoph Studer

Understanding the Covariance Structure of Convolutional Filters

Neural network weights are typically initialized at random from univariate distributions, controlling just the variance of individual weights even in highly-structured operations like convolutions. Recent ViT-inspired convolutional networks…

Computer Vision and Pattern Recognition · Computer Science · 2022-10-10 · Asher Trockman, Devin Willmott, J. Zico Kolter

A new initialisation to Control Gradients in Sinusoidal Neural network

Proper initialisation strategy is of primary importance to mitigate gradient explosion or vanishing when training neural networks. Yet, the impact of initialisation parameters still lacks a precise theoretical understanding for several…

Machine Learning · Computer Science · 2026-05-12 · Andrea Combette, Antoine Venaille, Nelly Pustelnik

Scaling-up Diverse Orthogonal Convolutional Networks with a Paraunitary Framework

Enforcing orthogonality in neural networks is an antidote for gradient vanishing/exploding problems, sensitivity by adversarial perturbation, and bounding generalization errors. However, many previous approaches are heuristic, and the…

Machine Learning · Computer Science · 2021-06-18 · Jiahao Su, Wonmin Byeon, Furong Huang

The Power of Linear Combinations: Learning with Random Convolutions

Following the traditional paradigm of convolutional neural networks (CNNs), modern CNNs manage to keep pace with more recent, for example transformer-based, models by not only increasing model depth and width but also the kernel size. This…

Computer Vision and Pattern Recognition · Computer Science · 2023-06-23 · Paul Gavrikov, Janis Keuper