Related papers: Learning Features with Parameter-Free Layers

Shift: A Zero FLOP, Zero Parameter Alternative to Spatial Convolutions

Neural networks rely on convolutions to aggregate spatial information. However, spatial convolutions are expensive in terms of model size and computation, both of which grow quadratically with respect to kernel size. In this paper, we…

Computer Vision and Pattern Recognition · Computer Science 2017-12-05 Bichen Wu, Alvin Wan, Xiangyu Yue, Peter Jin, Sicheng Zhao, Noah Golmant, Amir Gholaminejad, Joseph Gonzalez, Kurt Keutzer

Should You Go Deeper? Optimizing Convolutional Neural Network Architectures without Training by Receptive Field Analysis

When optimizing convolutional neural networks (CNN) for a specific image-based task, specialists commonly overshoot the number of convolutional layers in their designs. By implication, these CNNs are unnecessarily resource intensive to…

Machine Learning · Computer Science 2022-06-23 Mats L. Richter, Julius Schöning, Anna Wiedenroth, Ulf Krumnack

Training CNNs with Selective Allocation of Channels

Recent progress in deep convolutional neural networks (CNNs) have enabled a simple paradigm of architecture design: larger models typically achieve better accuracy. Due to this, in modern CNN architectures, it becomes more important to…

Machine Learning · Computer Science 2019-05-14 Jongheon Jeong, Jinwoo Shin

Replacement Learning: Training Vision Tasks with Fewer Learnable Parameters

Traditional end-to-end deep learning models often enhance feature representation and overall performance by increasing the depth and complexity of the network during training. However, this approach inevitably introduces issues of parameter…

Computer Vision and Pattern Recognition · Computer Science 2024-10-03 Yuming Zhang, Peizhe Wang, Shouxin Zhang, Dongzhi Guan, Jiabin Liu, Junhao Su

Principled Architecture-aware Scaling of Hyperparameters

Training a high-quality deep neural network requires choosing suitable hyperparameters, which is a non-trivial and expensive process. Current works try to automatically optimize or design principles of hyperparameters, such that they can…

Machine Learning · Computer Science 2024-02-28 Wuyang Chen, Junru Wu, Zhangyang Wang, Boris Hanin

Feature Products Yield Efficient Networks

We introduce Feature-Product networks (FP-nets) as a novel deep-network architecture based on a new building block inspired by principles of biological vision. For each input feature map, a so-called FP-block learns two different filters,…

Computer Vision and Pattern Recognition · Computer Science 2020-08-19 Philipp Grüning, Thomas Martinetz, Erhardt Barth

The Power of Linear Combinations: Learning with Random Convolutions

Following the traditional paradigm of convolutional neural networks (CNNs), modern CNNs manage to keep pace with more recent, for example transformer-based, models by not only increasing model depth and width but also the kernel size. This…

Computer Vision and Pattern Recognition · Computer Science 2023-06-23 Paul Gavrikov, Janis Keuper

Multi-Path Learnable Wavelet Neural Network for Image Classification

Despite the remarkable success of deep learning in pattern recognition, deep network models face the problem of training a large number of parameters. In this paper, we propose and evaluate a novel multi-path wavelet neural network…

Computer Vision and Pattern Recognition · Computer Science 2019-08-27 D. D. N. De Silva, H. W. M. K. Vithanage, K. S. D. Fernando, I. T. S. Piyatilake

Sharing the Learned Knowledge-base to Estimate Convolutional Filter Parameters for Continual Image Restoration

Continual learning is an emerging topic in the field of deep learning, where a model is expected to learn continuously for new upcoming tasks without forgetting previous experiences. This field has witnessed numerous advancements, but few…

Computer Vision and Pattern Recognition · Computer Science 2026-03-24 Aupendu Kar, Krishnendu Ghosh, Prabir Kumar Biswas

Using UNet and PSPNet to explore the reusability principle of CNN parameters

How to reduce the requirement on training dataset size is a hot topic in deep learning community. One straightforward way is to reuse some pre-trained parameters. Some previous work like Deep transfer learning reuse the model parameters…

Computer Vision and Pattern Recognition · Computer Science 2020-09-21 Wei Wang

Dynamic Filter Networks

In a traditional convolutional layer, the learned filters stay fixed after training. In contrast, we introduce a new framework, the Dynamic Filter Network, where filters are generated dynamically conditioned on an input. We show that this…

Machine Learning · Computer Science 2016-06-07 Bert De Brabandere, Xu Jia, Tinne Tuytelaars, Luc Van Gool

Multi-task neural networks by learned contextual inputs

This paper explores learned-context neural networks. It is a multi-task learning architecture based on a fully shared neural network and an augmented input vector containing trainable task parameters. The architecture is interesting due to…

Machine Learning · Computer Science 2025-08-07 Anders T. Sandnes, Bjarne Grimstad, Odd Kolbjørnsen

Function Space and Critical Points of Linear Convolutional Networks

We study the geometry of linear networks with one-dimensional convolutional layers. The function spaces of these networks can be identified with semi-algebraic families of polynomials admitting sparse factorizations. We analyze the impact…

Machine Learning · Computer Science 2024-01-29 Kathlén Kohn, Guido Montúfar, Vahid Shahverdi, Matthew Trager

Fast Training of Convolutional Networks through FFTs

Convolutional networks are one of the most widely employed architectures in computer vision and machine learning. In order to leverage their ability to learn complex functions, large amounts of data are required for training. Training a…

Computer Vision and Pattern Recognition · Computer Science 2015-06-09 Michael Mathieu, Mikael Henaff, Yann LeCun

Convolutional Neural Networks with Layer Reuse

A convolutional layer in a Convolutional Neural Network (CNN) consists of many filters which apply convolution operation to the input, capture some special patterns and pass the result to the next layer. If the same patterns also occur at…

Computer Vision and Pattern Recognition · Computer Science 2019-02-04 Okan Köpüklü, Maryam Babaee, Stefan Hörmann, Gerhard Rigoll

Learning Implicitly Recurrent CNNs Through Parameter Sharing

We introduce a parameter sharing scheme, in which different layers of a convolutional neural network (CNN) are defined by a learned linear combination of parameter tensors from a global bank of templates. Restricting the number of templates…

Machine Learning · Computer Science 2019-03-15 Pedro Savarese, Michael Maire

Building on Efficient Foundations: Effectively Training LLMs with Structured Feedforward Layers

State-of-the-art results in large language models (LLMs) often rely on scale, which becomes computationally expensive. This has sparked a research agenda to reduce these models' parameter counts and computational costs without significantly…

Computation and Language · Computer Science 2024-11-07 Xiuying Wei, Skander Moalla, Razvan Pascanu, Caglar Gulcehre

Constraint-free Graphical Model with Fast Learning Algorithm

In this paper, we propose a simple, versatile model for learning the structure and parameters of multivariate distributions from a data set. Learning a Markov network from a given data set is not a simple problem, because Markov networks…

Machine Learning · Computer Science 2012-06-19 Kazuya Takabatake, Shotaro Akaho

The Unreasonable Effectiveness of Fully-Connected Layers for Low-Data Regimes

Convolutional neural networks were the standard for solving many computer vision tasks until recently, when Transformers of MLP-based architectures have started to show competitive performance. These architectures typically have a vast…

Computer Vision and Pattern Recognition · Computer Science 2022-10-14 Peter Kocsis, Peter Súkeník, Guillem Brasó, Matthias Nießner, Laura Leal-Taixé, Ismail Elezi

Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation

We present techniques for speeding up the test-time evaluation of large convolutional networks, designed for object recognition tasks. These models deliver impressive accuracy but each image evaluation requires millions of floating point…

Computer Vision and Pattern Recognition · Computer Science 2024-03-15 Remi Denton, Wojciech Zaremba, Joan Bruna, Yann LeCun, Rob Fergus