Related papers: Measuring training variability from stochastic opt…

Robust Nonparametric Hypothesis Testing to Understand Variability in Training Neural Networks

Training a deep neural network (DNN) often involves stochastic optimization, which means each run will produce a different model. Several works suggest this variability is negligible when models have the same performance, which in the case…

Machine Learning · Statistics 2023-10-03 Sinjini Banerjee , Reilly Cannon , Tim Marrinan , Tony Chiang , Anand D. Sarwate

A Three-regime Model of Network Pruning

Recent work has highlighted the complex influence training hyperparameters, e.g., the number of training epochs, can have on the prunability of machine learning models. Perhaps surprisingly, a systematic approach to predict precisely how…

Machine Learning · Statistics 2024-03-04 Yefan Zhou , Yaoqing Yang , Arin Chang , Michael W. Mahoney

Importance Sampling for Stochastic Gradient Descent in Deep Neural Networks

Stochastic gradient descent samples uniformly the training set to build an unbiased gradient estimate with a limited number of samples. However, at a given step of the training process, some data are more helpful than others to continue…

Machine Learning · Computer Science 2023-03-30 Thibault Lahire

Doubly-robust and heteroscedasticity-aware sample trimming for causal inference

A popular method for variance reduction in observational causal inference is propensity-based trimming, the practice of removing units with extreme propensities from the sample. This practice has theoretical grounding when the data are…

Methodology · Statistics 2024-01-30 Samir Khan , Johan Ugander

Accurate Neural Network Pruning Requires Rethinking Sparse Optimization

Obtaining versions of deep neural networks that are both highly-accurate and highly-sparse is one of the main challenges in the area of model compression, and several high-performance pruning techniques have been investigated by the…

Machine Learning · Computer Science 2023-09-11 Denis Kuznedelev , Eldar Kurtic , Eugenia Iofinova , Elias Frantar , Alexandra Peste , Dan Alistarh

Not All Samples Are Created Equal: Deep Learning with Importance Sampling

Deep neural network training spends most of the computation on examples that are properly handled, and could be ignored. We propose to mitigate this phenomenon with a principled importance sampling scheme that focuses computation on…

Machine Learning · Computer Science 2019-10-29 Angelos Katharopoulos , François Fleuret

Holistic Deep Learning

This paper presents a novel holistic deep learning framework that simultaneously addresses the challenges of vulnerability to input perturbations, overparametrization, and performance instability from different train-validation splits. The…

Machine Learning · Computer Science 2023-03-22 Dimitris Bertsimas , Kimberly Villalobos Carballo , Léonard Boussioux , Michael Lingzhi Li , Alex Paskov , Ivan Paskov

Investigating the Relationship Between Dropout Regularization and Model Complexity in Neural Networks

Dropout Regularization, serving to reduce variance, is nearly ubiquitous in Deep Learning models. We explore the relationship between the dropout rate and model complexity by training 2,000 neural networks configured with random…

Machine Learning · Computer Science 2021-08-30 Christopher Sun , Jai Sharma , Milind Maiti

Test Sample Accuracy Scales with Training Sample Density in Neural Networks

Intuitively, one would expect accuracy of a trained neural network's prediction on test samples to correlate with how densely the samples are surrounded by seen training samples in representation space. We find that a bound on empirical…

Machine Learning · Computer Science 2022-07-29 Xu Ji , Razvan Pascanu , Devon Hjelm , Balaji Lakshminarayanan , Andrea Vedaldi

Robust Sampling in Deep Learning

Deep learning requires regularization mechanisms to reduce overfitting and improve generalization. We address this problem by a new regularization method based on distributional robust optimization. The key idea is to modify the…

Machine Learning · Computer Science 2020-06-08 Aurora Cobo Aguilera , Antonio Artés-Rodríguez , Fernando Pérez-Cruz , Pablo Martínez Olmos

A robust approach to model-based classification based on trimming and constraints

In a standard classification framework a set of trustworthy learning data are employed to build a decision rule, with the final aim of classifying unlabelled units belonging to the test set. Therefore, unreliable labelled observations,…

Applications · Statistics 2019-11-20 Andrea Cappozzo , Francesca Greselin , Thomas Brendan Murphy

Training Deep Neural Networks by optimizing over nonlocal paths in hyperparameter space

Hyperparameter optimization is both a practical issue and an interesting theoretical problem in training of deep architectures. Despite many recent advances the most commonly used methods almost universally involve training multiple and…

Machine Learning · Computer Science 2019-09-10 Vlad Pushkarov , Jonathan Efroni , Mykola Maksymenko , Maciej Koch-Janusz

Approach to Finding a Robust Deep Learning Model

The rapid development of machine learning (ML) and artificial intelligence (AI) applications requires the training of large numbers of models. This growing demand highlights the importance of training models without human supervision, while…

Machine Learning · Computer Science 2025-05-26 Alexey Boldyrev , Fedor Ratnikov , Andrey Shevelev

A Bayesian Perspective on Training Speed and Model Selection

We take a Bayesian perspective to illustrate a connection between training speed and the marginal likelihood in linear models. This provides two major insights: first, that a measure of a model's training speed can be used to estimate its…

Machine Learning · Computer Science 2020-10-28 Clare Lyle , Lisa Schut , Binxin Ru , Yarin Gal , Mark van der Wilk

Lost in Pruning: The Effects of Pruning Neural Networks beyond Test Accuracy

Neural network pruning is a popular technique used to reduce the inference costs of modern, potentially overparameterized, networks. Starting from a pre-trained network, the process is as follows: remove redundant parameters, retrain, and…

Machine Learning · Computer Science 2021-03-05 Lucas Liebenwein , Cenk Baykal , Brandon Carter , David Gifford , Daniela Rus

Quantity vs. Quality: On Hyperparameter Optimization for Deep Reinforcement Learning

Reinforcement learning algorithms can show strong variation in performance between training runs with different random seeds. In this paper we explore how this affects hyperparameter optimization when the goal is to find hyperparameter…

Machine Learning · Computer Science 2020-07-31 Lars Hertel , Pierre Baldi , Daniel L. Gillen

Uncertainty Measurement of Deep Learning System based on the Convex Hull of Training Sets

Deep Learning (DL) has made remarkable achievements in computer vision and adopted in safety critical domains such as medical imaging or autonomous drive. Thus, it is necessary to understand the uncertainty of the model to effectively…

Computer Vision and Pattern Recognition · Computer Science 2024-05-28 Hyekyoung Hwang , Jitae Shin

A Statistical Analysis for Per-Instance Evaluation of Stochastic Optimizers: Avoiding Unreliable Conclusions

A key trait of stochastic optimizers is that multiple runs of the same optimizer in attempting to solve the same problem can produce different results. As a result, their performance is evaluated over several repeats, or runs, on the…

Machine Learning · Computer Science 2026-05-18 Moslem Noori , Elisabetta Valiante , Thomas Van Vaerenbergh , Masoud Mohseni , Ignacio Rozada

Variational Inference to Measure Model Uncertainty in Deep Neural Networks

We present a novel approach for training deep neural networks in a Bayesian way. Classical, i.e. non-Bayesian, deep learning has two major drawbacks both originating from the fact that network parameters are considered to be deterministic.…

Machine Learning · Statistics 2019-03-11 Konstantin Posch , Jan Steinbrener , Jürgen Pilz

Towards Optimal Neural Networks: the Role of Sample Splitting in Hyperparameter Selection

When artificial neural networks have demonstrated exceptional practical success in a variety of domains, investigations into their theoretical characteristics, such as their approximation power, statistical properties, and generalization…

Machine Learning · Statistics 2023-10-06 Shijin Gong , Xinyu Zhang