Related papers: Shake-Shake regularization

200 papers

Regularization is crucial to the success of many practical deep learning models, particularly in the common scenario where only a small to moderate number of training samples is available. In addition to weight decay,…

Machine Learning · Computer Science 2018-08-07 Che-Wei Huang , Shrikanth S. Narayanan

Deep neural networks have enormous representational power which leads them to overfit on most datasets. Thus, regularizing them is important in order to reduce overfitting and enhance their generalization capabilities. Recently, channel…

Computer Vision and Pattern Recognition · Computer Science 2021-06-18 Sudhakar Kumawat , Gagan Kanojia , Shanmuganathan Raman

In this work, we investigate a recently proposed regularization technique based on multi-branch architectures, called Shake-Shake regularization, for the task of speech emotion recognition. In addition, we also propose variants to…

Sound · Computer Science 2018-04-19 Che-Wei Huang , Shrikanth Narayanan

We introduce a novel stochastic regularization technique for deep neural networks, which decomposes a layer into multiple branches with different parameters and merges stochastically sampled combinations of the outputs from the branches…

Machine Learning · Computer Science 2019-10-04 Wonpyo Park , Paul Hongsuck Seo , Bohyung Han , Minsu Cho

Overfitting is a crucial problem in deep neural networks, even in the latest network architectures. In this paper, to relieve the overfitting effect of ResNet and its improvements (i.e., Wide ResNet, PyramidNet, and ResNeXt), we propose a…

Computer Vision and Pattern Recognition · Computer Science 2020-04-01 Yoshihiro Yamada , Masakazu Iwamura , Takuya Akiba , Koichi Kise

Regularization is commonly used for alleviating overfitting in machine learning. For convolutional neural networks (CNNs), regularization methods, such as DropBlock and Shake-Shake, have illustrated the improvement in the generalization…

Computer Vision and Pattern Recognition · Computer Science 2021-01-01 Yi Wang , Zhen-Peng Bian , Junhui Hou , Lap-Pui Chau

Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates…

Machine Learning · Computer Science 2015-03-03 Sergey Ioffe , Christian Szegedy

Deepfake detection methods based on convolutional neural networks (CNN) have demonstrated high accuracy. However, these methods often suffer from decreased performance when faced with unknown forgery methods and common…

Computer Vision and Pattern Recognition · Computer Science 2023-07-14 Sitong Liu , Zhichao Lian , Siqi Gu , Liang Xiao

Deep convolutional neural networks are known to be unstable during training at high learning rates unless normalization techniques are employed. Normalizing weights or activations allows the use of higher learning rates, resulting in faster…

Machine Learning · Computer Science 2019-12-02 Brendan Ruff , Taylor Beck , Joscha Bach

We propose sequenced-replacement sampling (SRS) for training deep neural networks. The basic idea is to assign a fixed sequence index to each sample in the dataset. Once a mini-batch is randomly drawn in each training iteration, we refill…

Machine Learning · Computer Science 2018-10-22 Chiu Man Ho , Dae Hoon Park , Wei Yang , Yi Chang

Batch normalization was introduced in 2015 to speed up training of deep convolution networks by normalizing the activations across the current batch to have zero mean and unity variance. The results presented here show an interesting aspect…

Computer Vision and Pattern Recognition · Computer Science 2018-02-22 Mohamed Hajaj , Duncan Gillies

Deep neural networks rely heavily on normalization methods to improve their performance and learning behavior. Although normalization methods spurred the development of increasingly deep and efficient architectures, they also increase the…

Machine Learning · Computer Science 2021-10-06 Alexander Fuchs , Christian Knoll , Franz Pernkopf

The successful training of deep neural networks requires addressing challenges such as overfitting, numerical instabilities leading to divergence, and increasing variance in the residual stream. A common solution is to apply regularization…

Machine Learning · Computer Science 2025-11-20 Jörg K. H. Franke , Urs Spiegelhalter , Marianna Nezhurina , Jenia Jitsev , Frank Hutter , Michael Hefenbrock

A key component of most neural network architectures is the use of normalization layers, such as Batch Normalization. Despite its common use and large utility in optimizing deep architectures, it has been challenging both to generically…

Machine Learning · Computer Science 2020-02-17 Cecilia Summers , Michael J. Dinneen

This paper introduces a novel measure-theoretic theory for machine learning that does not require statistical assumptions. Based on this theory, a new regularization method in deep learning is derived and shown to outperform previous…

Machine Learning · Statistics 2019-03-08 Kenji Kawaguchi , Yoshua Bengio , Vikas Verma , Leslie Pack Kaelbling

The widespread use of Batch Normalization has enabled training deeper neural networks with more stable and faster results. However, Batch Normalization works best with a large batch size during training, and as the state-of-the-art…

Computer Vision and Pattern Recognition · Computer Science 2020-11-24 Martin Kolarik , Radim Burget , Kamil Riha

Recent years have witnessed the success of deep neural networks in dealing with many practical problems. Dropout has played an essential role in many successful deep neural networks by inducing regularization during model training.…

Computer Vision and Pattern Recognition · Computer Science 2019-04-16 Guoliang Kang , Jun Li , Dacheng Tao

Sample efficiency is a crucial problem in deep reinforcement learning. Recent algorithms, such as REDQ and DroQ, found a way to improve the sample efficiency by increasing the update-to-data (UTD) ratio to 20 gradient update steps on the…

Machine Learning · Computer Science 2024-03-26 Aditya Bhatt , Daniel Palenicek , Boris Belousov , Max Argus , Artemij Amiranashvili , Thomas Brox , Jan Peters

Training state-of-the-art, deep neural networks is computationally expensive. One way to reduce the training time is to normalize the activities of the neurons. A recently introduced technique called batch normalization uses the…

Machine Learning · Statistics 2016-07-22 Jimmy Lei Ba , Jamie Ryan Kiros , Geoffrey E. Hinton

Overfitting, underfitting, and achieving stable training are important challenges in machine learning. Current approaches to these issues include mixup, SamplePairing, and BC learning. In our work, we state the hypothesis that mixing many images…

Machine Learning · Computer Science 2020-01-22 Maciej A. Czyzewski