Related papers: Data augmentation instead of explicit regularizati…

Further advantages of data augmentation on convolutional neural networks

Data augmentation is a popular technique largely used to enhance the training of convolutional neural networks. Although many of its benefits are well known by deep learning researchers and practitioners, its implicit regularization…

Computer Vision and Pattern Recognition · Computer Science 2019-06-27 Alex Hernández-García, Peter König

Do deep nets really need weight decay and dropout?

The impressive success of modern deep neural networks on computer vision tasks has been achieved through models of very large capacity compared to the number of available training examples. This overparameterization is often said to be…

Computer Vision and Pattern Recognition · Computer Science 2018-07-13 Alex Hernández-García, Peter König

Data Diversity as Implicit Regularization: How Does Diversity Shape the Weight Space of Deep Neural Networks?

Data augmentation that introduces diversity into the input data has long been used in training deep learning models. It has demonstrated benefits in improving robustness and generalization, practically aligning well with other…

Machine Learning · Computer Science 2025-08-18 Yang Ba, Michelle V. Mancenido, Rong Pan

Implicit regularization of dropout

It is important to understand how dropout, a popular regularization method, aids in achieving a good generalization solution during neural network training. In this work, we present a theoretical derivation of an implicit regularization of…

Machine Learning · Computer Science 2023-04-11 Zhongwang Zhang, Zhi-Qin John Xu

Data Dropout in Arbitrary Basis for Deep Network Regularization

An important problem in training deep networks with high capacity is to ensure that the trained network works well when presented with new inputs outside the training dataset. Dropout is an effective regularization technique to boost the…

Computer Vision and Pattern Recognition · Computer Science 2017-12-06 Mostafa Rahmani, George Atia

Data Augmentation Revisited: Rethinking the Distribution Gap between Clean and Augmented Data

Data augmentation has been widely applied as an effective methodology to improve generalization, in particular when training deep neural networks. Recently, researchers proposed a few intensive data augmentation techniques, which indeed…

Machine Learning · Computer Science 2019-11-22 Zhuoxun He, Lingxi Xie, Xin Chen, Ya Zhang, Yanfeng Wang, Qi Tian

Deep Augmentation: Dropout as Augmentation for Self-Supervised Learning

Despite dropout's ubiquity in machine learning, its effectiveness as a form of data augmentation remains under-explored. We address two key questions: (i) When is dropout effective as an augmentation strategy? (ii) Is dropout uniquely…

Machine Learning · Computer Science 2025-06-02 Rickard Brüel-Gabrielsson, Tongzhou Wang, Manel Baradad, Justin Solomon

Estimating Implicit Regularization in Deep Learning

Deep learning systems are known to exhibit implicit regularization (alt. implicit bias), favoring simple solutions instead of merely minimizing the loss function. In some cases, we can analytically derive the implicit regularization --…

Machine Learning · Statistics 2026-05-08 Joseph H. Rudoler, Kevin Tan, Giles Hooker, Konrad P. Kording

SoftTarget Regularization: An Effective Technique to Reduce Over-Fitting in Neural Networks

Deep neural networks are learning models with a very high capacity and therefore prone to over-fitting. Many regularization techniques such as Dropout, DropConnect, and weight decay all attempt to solve the problem of over-fitting by…

Machine Learning · Computer Science 2016-12-06 Armen Aghajanyan

The Effects of Regularization and Data Augmentation are Class Dependent

Regularization is a fundamental technique to prevent over-fitting and to improve generalization performances by constraining a model's complexity. Current Deep Networks heavily rely on regularizers such as Data-Augmentation (DA) or…

Machine Learning · Computer Science 2022-04-12 Randall Balestriero, Leon Bottou, Yann LeCun

Time Matters in Regularizing Deep Networks: Weight Decay and Data Augmentation Affect Early Learning Dynamics, Matter Little Near Convergence

Regularization is typically understood as improving generalization by altering the landscape of local extrema to which the model eventually converges. Deep neural networks (DNNs), however, challenge this view: We show that removing…

Machine Learning · Computer Science 2019-06-03 Aditya Golatkar, Alessandro Achille, Stefano Soatto

Explicit Dropout: Deterministic Regularization for Transformer Architectures

Dropout is a widely used regularization technique in deep learning, but its effects are typically realized through stochastic masking rather than explicit optimization objectives. We propose a deterministic formulation that expresses…

Machine Learning · Computer Science 2026-04-23 Vidhi Agrawal, Illia Oleksiienko, Alexandros Iosifidis

The Penalty Imposed by Ablated Data Augmentation

There is a set of data augmentation techniques that ablate parts of the input at random. These include input dropout, cutout, and random erasing. We term these techniques ablated data augmentation. Though these techniques seem similar in…

Machine Learning · Computer Science 2020-06-09 Frederick Liu, Amir Najmi, Mukund Sundararajan

Investigating the Relationship Between Dropout Regularization and Model Complexity in Neural Networks

Dropout Regularization, serving to reduce variance, is nearly ubiquitous in Deep Learning models. We explore the relationship between the dropout rate and model complexity by training 2,000 neural networks configured with random…

Machine Learning · Computer Science 2021-08-30 Christopher Sun, Jai Sharma, Milind Maiti

The good, the bad and the ugly sides of data augmentation: An implicit spectral regularization perspective

Data augmentation (DA) is a powerful workhorse for bolstering performance in modern machine learning. Specific augmentations like translations and scaling in computer vision are traditionally believed to improve generalization by generating…

Machine Learning · Computer Science 2024-02-29 Chi-Heng Lin, Chiraag Kaushik, Eva L. Dyer, Vidya Muthukumar

The Effectiveness of Data Augmentation in Image Classification using Deep Learning

In this paper, we explore and compare multiple solutions to the problem of data augmentation in image classification. Previous work has demonstrated the effectiveness of data augmentation through simple techniques, such as cropping,…

Computer Vision and Pattern Recognition · Computer Science 2017-12-14 Luis Perez, Jason Wang

Why Do We Need Weight Decay in Modern Deep Learning?

Weight decay is a broadly used technique for training state-of-the-art deep networks from image classification to large language models. Despite its widespread usage and being extensively studied in the classical literature, its role…

Machine Learning · Computer Science 2024-11-06 Francesco D'Angelo, Maksym Andriushchenko, Aditya Varre, Nicolas Flammarion

MaxDropout: Deep Neural Network Regularization Based on Maximum Output Values

Different techniques have emerged in the deep learning scenario, such as Convolutional Neural Networks, Deep Belief Networks, and Long Short-Term Memory Networks, to cite a few. In lockstep, regularization methods, which aim to prevent…

Machine Learning · Computer Science 2020-07-28 Claudio Filipi Goncalves do Santos, Danilo Colombo, Mateus Roder, João Paulo Papa

Domain Generalization by Rejecting Extreme Augmentations

Data augmentation is one of the most effective techniques for regularizing deep learning models and improving their recognition performance in a variety of tasks and domains. However, this holds for standard in-domain settings, in which the…

Machine Learning · Computer Science 2025-10-09 Masih Aminbeidokhti, Fidel A. Guerrero Peña, Heitor Rapela Medeiros, Thomas Dubail, Eric Granger, Marco Pedersoli

Regularizing Deep Networks with Semantic Data Augmentation

Data augmentation is widely known as a simple yet surprisingly effective technique for regularizing deep networks. Conventional data augmentation schemes, e.g., flipping, translation or rotation, are low-level, data-independent and…

Computer Vision and Pattern Recognition · Computer Science 2021-06-07 Yulin Wang, Gao Huang, Shiji Song, Xuran Pan, Yitong Xia, Cheng Wu