Related papers: Reparameterization through Spatial Gradient Scalin…

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

We present weight normalization: a reparameterization of the weight vectors in a neural network that decouples the length of those weight vectors from their direction. By reparameterizing the weights in this way we improve the conditioning…

Machine Learning · Computer Science 2016-06-07 Tim Salimans, Diederik P. Kingma

Rethinking Vision Transformer Depth via Structural Reparameterization

The computational overhead of Vision Transformers in practice stems fundamentally from their deep architectures, yet existing acceleration strategies have primarily targeted algorithmic-level optimizations such as token pruning and…

Computer Vision and Pattern Recognition · Computer Science 2025-11-26 Chengwei Zhou, Vipin Chaudhary, Gourav Datta

Reparameterization Gradient for Non-differentiable Models

We present a new algorithm for stochastic variational inference that targets models with non-differentiable densities. One of the key challenges in stochastic variational inference is to come up with a low-variance estimator of the…

Machine Learning · Computer Science 2018-10-26 Wonyeol Lee, Hangyeol Yu, Hongseok Yang

Reparameterizing Convolutions for Incremental Multi-Task Learning without Task Interference

Multi-task networks are commonly utilized to alleviate the need for a large number of highly specialized single-task networks. However, two common challenges in developing multi-task models are often overlooked in the literature. First,…

Computer Vision and Pattern Recognition · Computer Science 2020-07-27 Menelaos Kanakis, David Bruggemann, Suman Saha, Stamatios Georgoulis, Anton Obukhov, Luc Van Gool

Reparameterization trick for discrete variables

Low-variance gradient estimation is crucial for learning directed graphical models parameterized by neural networks, where the reparameterization trick is widely used for those with continuous variables. While this technique gives…

Machine Learning · Statistics 2016-11-07 Seiya Tokui, Issei Sato

The Impact of Reinitialization on Generalization in Convolutional Neural Networks

Recent results suggest that reinitializing a subset of the parameters of a neural network during training can improve generalization, particularly for small training sets. We study the impact of different reinitialization methods in several…

Machine Learning · Computer Science 2021-09-02 Ibrahim Alabdulmohsin, Hartmut Maennel, Daniel Keysers

Regularizing Neural Networks via Stochastic Branch Layers

We introduce a novel stochastic regularization technique for deep neural networks, which decomposes a layer into multiple branches with different parameters and merges stochastically sampled combinations of the outputs from the branches…

Machine Learning · Computer Science 2019-10-04 Wonpyo Park, Paul Hongsuck Seo, Bohyung Han, Minsu Cho

Insights from Gradient Dynamics: Gradient Autoscaled Normalization

Gradient dynamics play a central role in determining the stability and generalization of deep neural networks. In this work, we provide an empirical analysis of how the variance and standard deviation of gradients evolve during training,…

Machine Learning · Computer Science 2025-09-09 Vincent-Daniel Yun

Reparameterizing Mirror Descent as Gradient Descent

Most of the recent successful applications of neural networks have been based on training with gradient descent updates. However, for some small networks, other mirror descent updates learn provably more efficiently when the target is…

Machine Learning · Computer Science 2020-06-24 Ehsan Amid, Manfred K. Warmuth

Implicit Regularization for Group Sparsity

We study the implicit regularization of gradient descent towards structured sparsity via a novel neural reparameterization, which we call a diagonally grouped linear neural network. We show the following intriguing property of our…

Machine Learning · Statistics 2023-01-31 Jiangyuan Li, Thanh V. Nguyen, Chinmay Hegde, Raymond K. W. Wong

Rotate your Networks: Better Weight Consolidation and Less Catastrophic Forgetting

In this paper we propose an approach to avoiding catastrophic forgetting in sequential task learning scenarios. Our technique is based on a network reparameterization that approximately diagonalizes the Fisher Information Matrix of the…

Computer Vision and Pattern Recognition · Computer Science 2018-12-13 Xialei Liu, Marc Masana, Luis Herranz, Joost Van de Weijer, Antonio M. Lopez, Andrew D. Bagdanov

Parameter Efficient Training of Deep Convolutional Neural Networks by Dynamic Sparse Reparameterization

Modern deep neural networks are typically highly overparameterized. Pruning techniques are able to remove a significant fraction of network parameters with little loss in accuracy. Recently, techniques based on dynamic reallocation of…

Machine Learning · Computer Science 2019-05-14 Hesham Mostafa, Xin Wang

Powerpropagation: A sparsity inducing weight reparameterisation

The training of sparse neural networks is becoming an increasingly important tool for reducing the computational footprint of models at training and evaluation, as well as enabling the effective scaling up of models. Whereas much work over the…

Machine Learning · Statistics 2021-10-07 Jonathan Schwarz, Siddhant M. Jayakumar, Razvan Pascanu, Peter E. Latham, Yee Whye Teh

Learning Discrete Weights Using the Local Reparameterization Trick

Recent breakthroughs in computer vision make use of large deep neural networks, utilizing the substantial speedup offered by GPUs. For applications running on limited hardware, however, high precision real-time processing can still be a…

Machine Learning · Computer Science 2018-02-05 Oran Shayer, Dan Levi, Ethan Fetaya

Spatially-Adaptive Gradient Re-parameterization for 3D Large Kernel Optimization

Large kernel convolutions offer a scalable alternative to vision transformers for high-resolution 3D volumetric analysis, yet naively increasing kernel size often leads to optimization instability. Motivated by the spatial bias inherent in…

Computer Vision and Pattern Recognition · Computer Science 2026-02-02 Ho Hin Lee, Quan Liu, Shunxing Bao, Yuankai Huo, Bennett A. Landman

Learning fixed points of recurrent neural networks by reparameterizing the network model

In computational neuroscience, fixed points of recurrent neural networks are commonly used to model neural responses to static or slowly changing stimuli. These applications raise the question of how to train the weights in a recurrent…

Neurons and Cognition · Quantitative Biology 2023-07-28 Vicky Zhu, Robert Rosenbaum

Scalable Weight Reparametrization for Efficient Transfer Learning

This paper proposes a novel transfer learning method, called Scalable Weight Reparametrization (SWR), that is efficient and effective for multiple downstream tasks. Efficient transfer learning involves utilizing a pre-trained…

Machine Learning · Computer Science 2023-02-28 Byeonggeun Kim, Jun-Tae Lee, Seunghan Yang, Simyung Chang

Function-space Parameterization of Neural Networks for Sequential Learning

Sequential learning paradigms pose challenges for gradient-based deep learning due to difficulties incorporating new data and retaining prior knowledge. While Gaussian processes elegantly tackle these problems, they struggle with…

Machine Learning · Statistics 2024-03-19 Aidan Scannell, Riccardo Mereu, Paul Chang, Ella Tamir, Joni Pajarinen, Arno Solin

Doubly Reparameterized Importance Weighted Structure Learning for Scene Graph Generation

As a structured prediction task, scene graph generation, given an input image, aims to explicitly model objects and their relationships by constructing a visually-grounded scene graph. In the current literature, such a task is universally…

Computer Vision and Pattern Recognition · Computer Science 2022-06-24 Daqi Liu, Miroslaw Bober, Josef Kittler

Weight Compander: A Simple Weight Reparameterization for Regularization

Regularization is a set of techniques used to improve the generalization ability of deep neural networks. In this paper, we introduce weight compander (WC), a novel and effective method to improve generalization by reparameterizing…

Machine Learning · Computer Science 2023-06-30 Rinor Cakaj, Jens Mehnert, Bin Yang