Related papers: Why Are Convolutional Nets More Sample-Efficient t…

Data-driven emergence of convolutional structure in neural networks

Exploiting data invariances is crucial for efficient learning in both artificial and biological neural circuits. Understanding how neural networks can discover appropriate representations capable of harnessing the underlying symmetries of…

Disordered Systems and Neural Networks · Physics 2022-10-17 Alessandro Ingrosso, Sebastian Goldt

Computational Separation Between Convolutional and Fully-Connected Networks

Convolutional neural networks (CNN) exhibit unmatched performance in a multitude of computer vision tasks. However, the advantage of using convolutional networks over fully-connected networks is not understood from a theoretical…

Machine Learning · Computer Science 2020-10-06 Eran Malach, Shai Shalev-Shwartz

Neural networks trained with SGD learn distributions of increasing complexity

The ability of deep neural networks to generalise well even when they interpolate their training data has been explained using various "simplicity biases". These theories postulate that neural networks avoid overfitting by first learning…

Machine Learning · Statistics 2023-05-29 Maria Refinetti, Alessandro Ingrosso, Sebastian Goldt

Understanding deep learning requires rethinking generalization

Despite their massive size, successful deep artificial neural networks can exhibit a remarkably small difference between training and test performance. Conventional wisdom attributes small generalization error either to properties of the…

Machine Learning · Computer Science 2017-02-28 Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals

Towards Biologically Plausible Convolutional Networks

Convolutional networks are ubiquitous in deep learning. They are particularly useful for images, as they reduce the number of parameters, reduce training time, and increase accuracy. However, as a model of the brain they are seriously…

Machine Learning · Computer Science 2022-01-19 Roman Pogodin, Yash Mehta, Timothy P. Lillicrap, Peter E. Latham

A Gaussian Process perspective on Convolutional Neural Networks

In this paper we cast the well-known convolutional neural network in a Gaussian process perspective. In this way we hope to gain additional insights into the performance of convolutional networks, in particular understand under what…

Machine Learning · Statistics 2019-01-10 Anastasia Borovykh

Discriminative Unsupervised Feature Learning with Exemplar Convolutional Neural Networks

Deep convolutional networks have proven to be very successful in learning task specific features that allow for unprecedented performance on various computer vision tasks. Training of such networks follows mostly the supervised learning…

Machine Learning · Computer Science 2015-06-22 Alexey Dosovitskiy, Philipp Fischer, Jost Tobias Springenberg, Martin Riedmiller, Thomas Brox

On the Importance of Sampling in Training GCNs: Tighter Analysis and Variance Reduction

Graph Convolutional Networks (GCNs) have achieved impressive empirical advancement across a wide variety of semi-supervised node classification tasks. Despite their great success, training GCNs on large graphs suffers from computational and…

Machine Learning · Computer Science 2021-11-02 Weilin Cong, Morteza Ramezani, Mehrdad Mahdavi

Fully Convolutional Networks for Semantic Segmentation

Convolutional networks are powerful visual models that yield hierarchies of features. We show that convolutional networks by themselves, trained end-to-end, pixels-to-pixels, improve on the previous best result in semantic segmentation. Our…

Computer Vision and Pattern Recognition · Computer Science 2016-05-23 Evan Shelhamer, Jonathan Long, Trevor Darrell

Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks

Neural architecture search has attracted wide attention in both academia and industry. To accelerate it, researchers proposed weight-sharing methods which first train a super-network to reuse computation among different operators, from…

Machine Learning · Computer Science 2020-12-16 Xin Chen, Lingxi Xie, Jun Wu, Longhui Wei, Yuhui Xu, Qi Tian

Deep Networks Favor Simple Data

Estimated density is often interpreted as indicating how typical a sample is under a model. Yet deep models trained on one dataset can assign higher density to simpler out-of-distribution (OOD) data than to in-distribution test data. We…

Machine Learning · Computer Science 2026-04-03 Weyl Lu, Chenjie Hao, Yubei Chen

Fully Convolutional Networks for Semantic Segmentation

Convolutional networks are powerful visual models that yield hierarchies of features. We show that convolutional networks by themselves, trained end-to-end, pixels-to-pixels, exceed the state-of-the-art in semantic segmentation. Our key…

Computer Vision and Pattern Recognition · Computer Science 2015-03-10 Jonathan Long, Evan Shelhamer, Trevor Darrell

Generalisation in Neural Networks Does not Require Feature Overlap

That shared features between train and test data are required for generalisation in artificial neural networks has been a common assumption of both proponents and critics of these models. Here, we show that convolutional architectures avoid…

Neural and Evolutionary Computing · Computer Science 2021-07-15 Jeff Mitchell, Jeffrey S. Bowers

Bias of Stochastic Gradient Descent or the Architecture: Disentangling the Effects of Overparameterization of Neural Networks

Neural networks typically generalize well when fitting the data perfectly, even though they are heavily overparameterized. Many factors have been pointed out as the reason for this phenomenon, including an implicit bias of stochastic…

Machine Learning · Computer Science 2025-02-04 Amit Peleg, Matthias Hein

On the Blindspots of Convolutional Networks

Deep convolutional network has been the state-of-the-art approach for a wide variety of tasks over the last few years. Its successes have, in many cases, turned it into the default model in quite a few domains. In this work, we will…

Machine Learning · Statistics 2018-07-10 Elad Hoffer, Shai Fine, Daniel Soudry

Convolutional Neural Networks Can (Meta-)Learn the Same-Different Relation

While convolutional neural networks (CNNs) have come to match and exceed human performance in many settings, the tasks these models optimize for are largely constrained to the level of individual objects, such as classification and…

Computer Vision and Pattern Recognition · Computer Science 2025-09-09 Max Gupta, Sunayana Rane, R. Thomas McCoy, Thomas L. Griffiths

Using and Abusing Equivariance

In this paper we show how Group Equivariant Convolutional Neural Networks use subsampling to learn to break equivariance to their symmetries. We focus on 2D rotations and reflections and investigate the impact of broken equivariance on…

Computer Vision and Pattern Recognition · Computer Science 2023-08-23 Tom Edixhoven, Attila Lengyel, Jan van Gemert

Generalization Guarantee of Training Graph Convolutional Networks with Graph Topology Sampling

Graph convolutional networks (GCNs) have recently achieved great empirical success in learning graph-structured data. To address its scalability issue due to the recursive embedding of neighboring features, graph topology sampling has been…

Machine Learning · Computer Science 2023-12-12 Hongkang Li, Meng Wang, Sijia Liu, Pin-Yu Chen, Jinjun Xiong

On the Learning Dynamics of Two-layer Nonlinear Convolutional Neural Networks

Convolutional neural networks (CNNs) have achieved remarkable performance in various fields, particularly in the domain of computer vision. However, why this architecture works well remains to be a mystery. In this work we move a small step…

Machine Learning · Computer Science 2019-05-27 Bing Yu, Junzhao Zhang, Zhanxing Zhu

Towards Understanding the Generalization Bias of Two Layer Convolutional Linear Classifiers with Gradient Descent

A major challenge in understanding the generalization of deep learning is to explain why (stochastic) gradient descent can exploit the network architecture to find solutions that have good generalization performance when using high capacity…

Machine Learning · Computer Science 2019-02-12 Yifan Wu, Barnabas Poczos, Aarti Singh