Related papers: Stochastic Kernel Regularisation Improves Generali…

Convolutional Deep Kernel Machines

Standard infinite-width limits of neural networks sacrifice the ability for intermediate layers to learn representations from data. Recent work (A theory of representation learning gives a deep generalisation of kernel methods, Yang et al.…

Machine Learning · Statistics 2024-02-27 Edward Milsom, Ben Anson, Laurence Aitchison

Regularising Deep Networks with Deep Generative Models

We develop a new method for regularising neural networks. We learn a probability distribution over the activations of all layers of the model and then insert imputed values into the network during training. We obtain a posterior for an…

Machine Learning · Computer Science 2019-10-14 Matthew Willetts, Alexander Camuto, Stephen Roberts, Chris Holmes

Training Efficient CNNS: Tweaking the Nuts and Bolts of Neural Networks for Lighter, Faster and Robust Models

Deep Learning has revolutionized the fields of computer vision, natural language understanding, speech recognition, information retrieval and more. Many techniques have evolved over the past decade that made models lighter, faster, and…

Machine Learning · Computer Science 2022-05-25 Sabeesh Ethiraj, Bharath Kumar Bolla

Kernel Regression with Infinite-Width Neural Networks on Millions of Examples

Neural kernels have drastically increased performance on diverse and nonstandard data modalities but require significantly more compute, which previously limited their application to smaller datasets. In this work, we address this by…

Machine Learning · Statistics 2023-03-10 Ben Adlam, Jaehoon Lee, Shreyas Padhy, Zachary Nado, Jasper Snoek

Stochastic Optimization of Plain Convolutional Neural Networks with Simple methods

Convolutional neural networks have been achieving the best possible accuracies in many visual pattern classification problems. However, due to the model capacity required to capture such representations, they are often oversensitive to…

Computer Vision and Pattern Recognition · Computer Science 2020-01-27 Yahia Assiri

Towards Understanding Normalization in Neural ODEs

Normalization is an important and vastly investigated technique in deep learning. However, its role for Ordinary Differential Equation based networks (neural ODEs) is still poorly understood. This paper investigates how different…

Machine Learning · Computer Science 2020-04-29 Julia Gusak, Larisa Markeeva, Talgat Daulbaev, Alexandr Katrutsa, Andrzej Cichocki, Ivan Oseledets

Deep neural networks are robust to weight binarization and other non-linear distortions

Recent results show that deep neural networks achieve excellent performance even when, during training, weights are quantized and projected to a binary representation. Here, we show that this is just the tip of the iceberg: these same…

Neural and Evolutionary Computing · Computer Science 2016-06-08 Paul Merolla, Rathinakumar Appuswamy, John Arthur, Steve K. Esser, Dharmendra Modha

Towards Better Orthogonality Regularization with Disentangled Norm in Training Deep CNNs

Orthogonality regularization has been developed to prevent deep CNNs from training instability and feature redundancy. Among existing proposals, kernel orthogonality regularization enforces orthogonality by minimizing the residual between…

Computer Vision and Pattern Recognition · Computer Science 2023-06-19 Changhao Wu, Shenan Zhang, Fangsong Long, Ziliang Yin, Tuo Leng

SinReQ: Generalized Sinusoidal Regularization for Low-Bitwidth Deep Quantized Training

Deep quantization of neural networks (below eight bits) offers significant promise in reducing their compute and storage cost. Albeit alluring, without special techniques for training and optimization, deep quantization results in…

Machine Learning · Computer Science 2019-12-03 Ahmed T. Elthakeb, Prannoy Pilligundla, Hadi Esmaeilzadeh

Convolutional Normalization: Improving Deep Convolutional Network Robustness and Training

Normalization techniques have become a basic component in modern convolutional neural networks (ConvNets). In particular, many recent works demonstrate that promoting the orthogonality of the weights helps train deep models and improve…

Computer Vision and Pattern Recognition · Computer Science 2022-01-05 Sheng Liu, Xiao Li, Yuexiang Zhai, Chong You, Zhihui Zhu, Carlos Fernandez-Granda, Qing Qu

Stochastic Training is Not Necessary for Generalization

It is widely believed that the implicit regularization of SGD is fundamental to the impressive generalization behavior we observe in neural networks. In this work, we demonstrate that non-stochastic full-batch training can achieve…

Machine Learning · Computer Science 2022-04-21 Jonas Geiping, Micah Goldblum, Phillip E. Pope, Michael Moeller, Tom Goldstein

To understand deep learning we need to understand kernel learning

Generalization performance of classifiers in deep learning has recently become a subject of intense study. Deep models, typically over-parametrized, tend to fit the training data exactly. Despite this "overfitting", they perform well on…

Machine Learning · Statistics 2018-06-18 Mikhail Belkin, Siyuan Ma, Soumik Mandal

Stochastic Variational Deep Kernel Learning

Deep kernel learning combines the non-parametric flexibility of kernel methods with the inductive biases of deep learning architectures. We propose a novel deep kernel learning model and stochastic variational inference procedure which…

Machine Learning · Statistics 2016-11-03 Andrew Gordon Wilson, Zhiting Hu, Ruslan Salakhutdinov, Eric P. Xing

Enhanced Convolutional Neural Networks for Improved Image Classification

Image classification is a fundamental task in computer vision with diverse applications, ranging from autonomous systems to medical imaging. The CIFAR-10 dataset is a widely used benchmark to evaluate the performance of classification…

Computer Vision and Pattern Recognition · Computer Science 2025-02-04 Xiaoran Yang, Shuhan Yu, Wenxi Xu

Gradient-Coherent Strong Regularization for Deep Neural Networks

Regularization plays an important role in generalization of deep neural networks, which are often prone to overfitting with their numerous parameters. L1 and L2 regularizers are common regularization tools in machine learning with their…

Machine Learning · Computer Science 2019-10-21 Dae Hoon Park, Chiu Man Ho, Yi Chang, Huaqing Zhang

Structure Learning of Deep Networks via DNA Computing Algorithm

Convolutional Neural Network (CNN) has gained state-of-the-art results in many pattern recognition and computer vision tasks. However, most of the CNN structures are manually designed by experienced researchers. Therefore, automatically…

Neural and Evolutionary Computing · Computer Science 2018-10-26 Guoqiang Zhong, Tao Li, Wenxue Liu, Yang Chen

Deep Clustered Convolutional Kernels

Deep neural networks have recently achieved state of the art performance thanks to new training algorithms for rapid parameter estimation and new regularization methods to reduce overfitting. However, in practice the network architecture…

Machine Learning · Computer Science 2016-03-04 Minyoung Kim, Luca Rigazio

Understanding deep learning requires rethinking generalization

Despite their massive size, successful deep artificial neural networks can exhibit a remarkably small difference between training and test performance. Conventional wisdom attributes small generalization error either to properties of the…

Machine Learning · Computer Science 2017-02-28 Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals

Semantic Perturbations with Normalizing Flows for Improved Generalization

Data augmentation is a widely adopted technique for avoiding overfitting when training deep neural networks. However, this approach requires domain-specific knowledge and is often limited to a fixed set of hard-coded transformations.…

Machine Learning · Statistics 2021-08-19 Oguz Kaan Yuksel, Sebastian U. Stich, Martin Jaggi, Tatjana Chavdarova

Stochastic Function Norm Regularization of Deep Networks

Deep neural networks have had an enormous impact on image analysis. State-of-the-art training methods, based on weight decay and DropOut, result in impressive performance when a very large training set is available. However, they tend to…

Machine Learning · Computer Science 2019-09-02 Amal Rannen Triki, Matthew B. Blaschko