Related papers: Convolution Aware Initialization

200 papers

Convolution has been the core ingredient of modern neural networks, triggering the surge of deep learning in vision. In this work, we rethink the inherent principles of standard convolution for vision tasks, specifically spatial-agnostic…

Computer Vision and Pattern Recognition · Computer Science 2021-04-13 Duo Li, Jie Hu, Changhu Wang, Xiangtai Li, Qi She, Lei Zhu, Tong Zhang, Qifeng Chen

Factorized layers--operations parameterized by products of two or more matrices--occur in a variety of deep learning contexts, including compressed model training, certain types of knowledge distillation, and multi-head self-attention…

Machine Learning · Statistics 2022-10-07 Mikhail Khodak, Neil Tenenholtz, Lester Mackey, Nicolò Fusi

The selection of initial parameter values for gradient-based optimization of deep neural networks is one of the most impactful hyperparameter choices in deep learning systems, affecting both convergence times and model performance. Yet…

Machine Learning · Computer Science 2020-01-17 Wei Hu, Lechao Xiao, Jeffrey Pennington

Convolutional Neural Networks spread through computer vision like a wildfire, impacting almost all visual tasks imaginable. Despite this, few researchers dare to train their models from scratch. Most work builds on one of a handful of…

Computer Vision and Pattern Recognition · Computer Science 2016-09-26 Philipp Krähenbühl, Carl Doersch, Jeff Donahue, Trevor Darrell

Training vision transformer networks on small datasets poses challenges. In contrast, convolutional neural networks (CNNs) can achieve state-of-the-art performance by leveraging their architectural inductive bias. In this paper, we…

Computer Vision and Pattern Recognition · Computer Science 2024-01-24 Jianqiao Zheng, Xueqian Li, Simon Lucey

Recently, deep learning methods such as the convolutional neural networks have gained prominence in the area of image denoising. This is owing to their proven ability to surpass state-of-the-art classical image denoising algorithms such as…

Image and Video Processing · Electrical Eng. & Systems 2024-09-02 Basit O. Alawode, Mudassir Masood

Modern convolutional neural networks (CNNs) have massive identical convolution blocks, and, hence, recursive sharing of parameters across these blocks has been proposed to reduce the amount of parameters. However, naive sharing of…

Computer Vision and Pattern Recognition · Computer Science 2021-11-23 Woochul Kang, Daeyeon Kim

In the past few years, various initialization schemes have been proposed. These schemes include Glorot initialization, He initialization, initialization using an orthogonal matrix, and the random walk method for initialization. Some of these methods stress on…

Machine Learning · Computer Science 2025-09-08 Vijay Pandey

Neural network architectures designed for function parameterization, such as the Bag-of-Functions (BoF) framework, bridge the gap between the expressivity of deep learning and the interpretability of classical signal processing. However,…

Machine Learning · Computer Science 2026-03-18 David Orlando Salazar Torres, Diyar Altinses, Andreas Schwung

This work attempts to interpret modern deep (convolutional) networks from the principles of rate reduction and (shift) invariant classification. We show that the basic iterative gradient ascent scheme for optimizing the rate reduction of…

Machine Learning · Computer Science 2020-10-30 Kwan Ho Ryan Chan, Yaodong Yu, Chong You, Haozhi Qi, John Wright, Yi Ma

Appropriate weight initialization has been of key importance to successfully train neural networks. Recently, batch normalization has diminished the role of weight initialization by simply normalizing each layer based on batch statistics.…

Computer Vision and Pattern Recognition · Computer Science 2022-08-03 Pedro Hermosilla, Michael Schelling, Tobias Ritschel, Timo Ropinski

Recent results suggest that reinitializing a subset of the parameters of a neural network during training can improve generalization, particularly for small training sets. We study the impact of different reinitialization methods in several…

Machine Learning · Computer Science 2021-09-02 Ibrahim Alabdulmohsin, Hartmut Maennel, Daniel Keysers

Normalization techniques have become a basic component in modern convolutional neural networks (ConvNets). In particular, many recent works demonstrate that promoting the orthogonality of the weights helps train deep models and improve…

Computer Vision and Pattern Recognition · Computer Science 2022-01-05 Sheng Liu, Xiao Li, Yuexiang Zhai, Chong You, Zhihui Zhu, Carlos Fernandez-Granda, Qing Qu

Deep convolutional neural networks are hindered by training instability and feature redundancy towards further performance improvement. A promising solution is to impose orthogonality on convolutional filters. We develop an efficient…

Computer Vision and Pattern Recognition · Computer Science 2020-04-09 Jiayun Wang, Yubei Chen, Rudrasis Chakraborty, Stella X. Yu

Initialization plays a critical role in Deep Neural Network training, directly influencing convergence, stability, and generalization. Common approaches such as Glorot and He initializations rely on randomness, which can produce uneven…

Machine Learning · Computer Science 2025-12-11 Alberto Fernández-Hernández, Jose I. Mestre, Manuel F. Dolz, Jose Duato, Enrique S. Quintana-Ortí

Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks. However, the use of stochastic gradient descent combined with the nonconvexity of the underlying optimization problems renders…

Machine Learning · Computer Science 2020-01-29 Ramina Ghods, Andrew S. Lan, Tom Goldstein, Christoph Studer

Neural network weights are typically initialized at random from univariate distributions, controlling just the variance of individual weights even in highly-structured operations like convolutions. Recent ViT-inspired convolutional networks…

Computer Vision and Pattern Recognition · Computer Science 2022-10-10 Asher Trockman, Devin Willmott, J. Zico Kolter

Proper initialisation strategy is of primary importance to mitigate gradient explosion or vanishing when training neural networks. Yet, the impact of initialisation parameters still lacks a precise theoretical understanding for several…

Machine Learning · Computer Science 2026-05-12 Andrea Combette, Antoine Venaille, Nelly Pustelnik

Enforcing orthogonality in neural networks is an antidote for gradient vanishing/exploding problems, sensitivity by adversarial perturbation, and bounding generalization errors. However, many previous approaches are heuristic, and the…

Machine Learning · Computer Science 2021-06-18 Jiahao Su, Wonmin Byeon, Furong Huang

Following the traditional paradigm of convolutional neural networks (CNNs), modern CNNs manage to keep pace with more recent, for example transformer-based, models by not only increasing model depth and width but also the kernel size. This…

Computer Vision and Pattern Recognition · Computer Science 2023-06-23 Paul Gavrikov, Janis Keuper