Related papers: Online Importance Sampling for Stochastic Gradient…
Importance sampling is widely used to improve the efficiency of deep neural network (DNN) training by reducing the variance of gradient estimators. However, efficiently assessing the variance reduction relative to uniform sampling remains…
We introduce a theoretical and practical framework for efficient importance sampling of mini-batch samples for gradient estimation from single and multiple probability distributions. To handle noisy gradients, our framework dynamically…
Stochastic gradient descent uniformly samples the training set to build an unbiased gradient estimate with a limited number of samples. However, at a given step of the training process, some data are more helpful than others to continue…
Importance sampling has been successfully used to accelerate stochastic optimization in many convex problems. However, the lack of an efficient way to calculate the importance still hinders its application to Deep Learning. In this paper,…
Variational inference approximates the posterior distribution of a probabilistic model with a parameterized density by maximizing a lower bound for the model evidence. Modern solutions fit a flexible approximation with stochastic gradient…
Deep neural network training spends most of the computation on examples that are properly handled, and could be ignored. We propose to mitigate this phenomenon with a principled importance sampling scheme that focuses computation on…
Modern stochastic optimization methods often rely on uniform sampling which is agnostic to the underlying characteristics of the data. This might degrade the convergence by yielding estimates that suffer from a high variance. A possible…
Importance sampling has become an indispensable strategy to speed up optimization algorithms for large-scale applications. Improved adaptive variants - using importance values defined by the complete gradient information which changes…
In modern data analysis, random sampling is an efficient and widely used strategy to overcome the computational difficulties brought by large sample sizes. In previous studies, researchers conducted random sampling according to the…
Stochastic Gradient Descent (SGD) is one of the most widely used techniques for online optimization in machine learning. In this work, we accelerate SGD by adaptively learning how to sample the most useful training examples at each time…
Driven by applications in telecommunication networks, we explore the simulation task of estimating rare event probabilities for tandem queues in their steady state. Existing literature has recognized that importance sampling methods can be…
In this paper, we study simulation-based methods for estimating gradients in stochastic networks. We derive a new method of calculating the weak derivative estimator using an importance sampling transform, and our method has less computational cost…
We introduce data structures for solving robust regression through stochastic gradient descent (SGD) by sampling gradients with probability proportional to their norm, i.e., importance sampling. Although SGD is widely used for large scale…
Modern computer vision algorithms often rely on very large training datasets. However, it is conceivable that a carefully selected subsample of the dataset is sufficient for training. In this paper, we propose a gradient-based importance…
Selecting data points for model training is critical in machine learning. Effective selection methods can reduce the labeling effort, optimize on-device training for embedded systems with limited data storage, and enhance the model…
Minibatching is a very well studied and highly popular technique in supervised learning, used by practitioners due to its ability to accelerate training through better utilization of parallel processing power and reduction of stochastic…
This paper improves upon existing data pruning methods for image classification by introducing a novel pruning metric and pruning procedure based on importance sampling. The proposed pruning metric explicitly accounts for data separability,…
Sampling is an important tool for estimating large, complex sums and integrals over high-dimensional spaces. For instance, importance sampling has been used as an alternative to exact methods for inference in belief networks. Ideally, we…
This research embarks on pioneering the integration of gradient sampling optimization techniques, particularly StochGradAdam, into the pruning process of neural networks. Our main objective is to address the significant challenge of…
Computing the exact likelihood of data in large Bayesian networks consisting of thousands of vertices is often a difficult task. When these models contain many deterministic conditional probability tables and when the observed values are…
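
Several of the abstracts above share one idea: draw training examples non-uniformly (for example, with probability proportional to gradient norm) and reweight them so the gradient estimate stays unbiased. The following is a minimal sketch of that idea only, not the method of any listed paper. It assumes a toy one-parameter least-squares model, and every name in it (`per_example_grads`, `importance_probs`, `is_gradient_estimate`, `uniform_gradient_estimate`) is hypothetical.

```python
import random

def per_example_grads(w, xs, ys):
    # Gradient of 0.5 * (w * x - y)**2 with respect to w, one value per example.
    return [(w * x - y) * x for x, y in zip(xs, ys)]

def importance_probs(grads, eps=1e-12):
    # Sampling probabilities proportional to gradient magnitude.
    # eps (an illustrative choice) keeps every probability strictly positive.
    mags = [abs(g) + eps for g in grads]
    total = sum(mags)
    return [m / total for m in mags]

def is_gradient_estimate(grads, probs, batch_size, rng):
    # Sample indices with replacement from probs, then reweight each
    # gradient by 1 / (N * p_i) so the estimate of the mean gradient
    # remains unbiased.
    n = len(grads)
    idx = rng.choices(range(n), weights=probs, k=batch_size)
    return sum(grads[i] / (n * probs[i]) for i in idx) / batch_size

def uniform_gradient_estimate(grads, batch_size, rng):
    # Baseline: plain uniform mini-batch average.
    idx = rng.choices(range(len(grads)), k=batch_size)
    return sum(grads[i] for i in idx) / batch_size

# Toy data (assumed for illustration): y is roughly 3 * x plus small noise.
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(200)]
ys = [3.0 * x + rng.gauss(0.0, 0.1) for x in xs]

grads = per_example_grads(0.0, xs, ys)
probs = importance_probs(grads)
full_gradient = sum(grads) / len(grads)

print("full gradient:       ", full_gradient)
print("importance estimate: ", is_gradient_estimate(grads, probs, 16, rng))
print("uniform estimate:    ", uniform_gradient_estimate(grads, 16, rng))
```

In this scalar setting, sampling proportionally to the gradient magnitude gives a single-sample variance no larger than that of uniform sampling (by the Cauchy-Schwarz inequality, up to the small `eps` term). Note that this sketch computes every per-example gradient before sampling, which a practical scheme would have to avoid or approximate.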