Related papers: Batch-Expansion Training: An Efficient Optimizatio…

200 papers

Bayesian Optimization aims at optimizing an unknown non-convex/concave function that is costly to evaluate. We are interested in application scenarios where concurrent function evaluations are possible. Under such a setting, BO could choose…

Artificial Intelligence · Computer Science 2012-05-02 Javad Azimi, Ali Jalali, Xiaoli Fern

Extending Bayesian optimization to batch evaluation can enable the designer to make the most use of parallel computing technology. However, most current batch approaches do not scale well with the batch size. That is, their performances…

Machine Learning · Computer Science 2025-04-25 Dawei Zhan, Zhaoxi Zeng, Shuoxiao Wei, Ping Wu

We study the generalization performance of full-batch optimization algorithms for stochastic convex optimization: these are first-order methods that only access the exact gradient of the empirical risk (rather than gradients with…

Optimization and Control · Mathematics 2021-07-02 Idan Amir, Yair Carmon, Tomer Koren, Roi Livni

Recently, a new trend of exploring sparsity for accelerating neural network training has emerged, embracing the paradigm of training on the edge. This paper proposes a novel Memory-Economic Sparse Training (MEST) framework targeting for…

Recently, convergence as well as convergence rate analyses of deep learning optimizers for nonconvex optimization have been widely studied. Meanwhile, numerical evaluations for the optimizers have precisely clarified the relationship…

Optimization and Control · Mathematics 2021-08-27 Hideaki Iiduka

Neural networks dominate the modern machine learning landscape, but their training and success still suffer from sensitivity to empirical choices of hyperparameters such as model architecture, loss function, and optimisation algorithm. In…

Distributed machine learning is critical for training deep learning models on large datasets with numerous parameters. Current research primarily focuses on leveraging additional hardware resources and powerful computing units to accelerate…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-03 Kuan-Wei Lu, Ding-Yong Hong, Pangfeng Liu, Jan-Jan Wu

Increasing the mini-batch size for stochastic gradient descent offers significant opportunities to reduce wall-clock training time, but there are a variety of theoretical and systems challenges that impede the widespread success of this…

Time-series forecasting is crucial for numerous real-world applications including weather prediction and financial market modeling. While temporal-domain methods remain prevalent, frequency-domain approaches can effectively capture…

Machine Learning · Computer Science 2025-08-05 Zhixuan Li, Naipeng Chen, Seonghwa Choi, Sanghoon Lee, Weisi Lin

We propose a novel "tree-averaging" model that utilizes the ensemble of classification and regression trees (CART). Each constituent tree is estimated with a subset of similar data. We treat this grouping of subsets as Bayesian ensemble…

Machine Learning · Statistics 2014-08-20 Leo L. Duan, John P. Clancy, Rhonda D. Szczesniak

Existing batch size selection approaches in distributed machine learning rely on static allocation or simplistic heuristics that fail to adapt to heterogeneous, dynamic computing environments. We present DYNAMIX, a reinforcement learning…

Machine Learning · Computer Science 2025-10-10 Yuanjun Dai, Keqiang He, An Wang

In this paper, we consider the problem of stochastic optimization, where the objective function is in terms of the expectation of a (possibly non-convex) cost function that is parametrized by a random variable. While the convergence speed…

Information Theory · Computer Science 2019-10-23 Naeimeh Omidvar, An Liu, Vincent Lau, Danny H. K. Tsang, Mohammad Reza Pakravan

At the heart of contemporary recommender systems (RSs) are latent factor models that provide quality recommendation experience to users. These models use embedding vectors, which are typically of a uniform and fixed size, to represent users…

Information Retrieval · Computer Science 2026-02-05 Yunke Qu, Tong Chen, Quoc Viet Hung Nguyen, Hongzhi Yin

Adversarial training methods commonly generate independent initial perturbation for adversarial samples from a simple uniform distribution, and obtain the training batch for the classifier without selection. In this work, we propose a…

Machine Learning · Computer Science 2024-06-07 Yinting Wu, Pai Peng, Bo Cai, Le Li

Batch Normalization (BN) is an important preprocessing step to many deep learning applications. Since it is a data-dependent process, for some homogeneous datasets it is a redundant or even a performance-degrading process. In this paper, we…

Machine Learning · Computer Science 2022-12-01 Wael Alsobhi, Tarik Alafif, Alaa Abdel-Hakim, Weiwei Zong

In designing personalized ranking algorithms, it is desirable to encourage a high precision at the top of the ranked list. Existing methods either seek a smooth convex surrogate for a non-smooth ranking metric or directly modify updating…

Machine Learning · Statistics 2018-08-15 Kuan Liu, Prem Natarajan

Every data selection method inherently has a target. In practice, these targets often emerge implicitly through benchmark-driven iteration: researchers develop selection strategies, train models, measure benchmark performance, then refine…

A number of optimization approaches have been proposed for optimizing nonconvex objectives (e.g. deep learning models), such as batch gradient descent, stochastic gradient descent and stochastic variance reduced gradient descent. Theory…

Machine Learning · Computer Science 2019-05-15 Jia Bi, Steve R. Gunn

Data selection is designed to accelerate learning with preserved performance. To achieve this, a fundamental thought is to identify informative data samples with significant contributions to the training. In this work, we propose…

Machine Learning · Computer Science 2025-09-30 Ziheng Cheng, Zhong Li, Jiang Bian

Active learning parallelization is widely used, but typically relies on fixing the batch size throughout experimentation. This fixed approach is inefficient because of a dynamic trade-off between cost and speed -- larger batches are more…

Machine Learning · Computer Science 2024-10-15 Masaki Adachi, Satoshi Hayakawa, Martin Jørgensen, Xingchen Wan, Vu Nguyen, Harald Oberhauser, Michael A. Osborne