Related papers: A System for Massively Parallel Hyperparameter Tun…

PASHA: Efficient HPO and NAS with Progressive Resource Allocation

Hyperparameter optimization (HPO) and neural architecture search (NAS) are methods of choice to obtain the best-in-class machine learning models, but in practice they can be costly to run. When models are trained on large datasets, tuning…

Machine Learning · Computer Science 2023-03-09 Ondrej Bohdal, Lukas Balles, Martin Wistuba, Beyza Ermis, Cédric Archambeau, Giovanni Zappella

Resource-Adaptive Successive Doubling for Hyperparameter Optimization with Large Datasets on High-Performance Computing Systems

On High-Performance Computing (HPC) systems, several hyperparameter configurations can be evaluated in parallel to speed up the Hyperparameter Optimization (HPO) process. State-of-the-art HPO methods follow a bandit-based approach and build…

Machine Learning · Computer Science 2025-11-03 Marcel Aach, Rakesh Sarma, Helmut Neukirchen, Morris Riedel, Andreas Lintermann

Multi-objective Asynchronous Successive Halving

Hyperparameter optimization (HPO) is increasingly used to automatically tune the predictive performance (e.g., accuracy) of machine learning models. However, in a plethora of real-world applications, accuracy is only one of the multiple --…

Machine Learning · Statistics 2021-06-25 Robin Schmucker, Michele Donini, Muhammad Bilal Zafar, David Salinas, Cédric Archambeau

Scaling Studies for Efficient Parameter Search and Parallelism for Large Language Model Pre-training

AI accelerator processing capabilities and memory constraints largely dictate the scale in which machine learning workloads (e.g., training and inference) can be executed within a desirable time frame. Training a state of the art,…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-10-12 Michael Benington, Leo Phan, Chris Pierre Paul, Evan Shoemaker, Priyanka Ranade, Torstein Collett, Grant Hodgson Perez, Christopher Krieger

Sherpa: Robust Hyperparameter Optimization for Machine Learning

Sherpa is a hyperparameter optimization library for machine learning models. It is specifically designed for problems with computationally expensive, iterative function evaluations, such as the hyperparameter tuning of deep neural networks.…

Machine Learning · Computer Science 2020-05-11 Lars Hertel, Julian Collado, Peter Sadowski, Jordan Ott, Pierre Baldi

Automatic Operator-level Parallelism Planning for Distributed Deep Learning -- A Mixed-Integer Programming Approach

As the artificial intelligence community advances into the era of large models with billions of parameters, distributed training and inference have become essential. While various parallelism strategies-data, model, sequence, and…

Machine Learning · Computer Science 2025-03-13 Ruifeng She, Bowen Pang, Kai Li, Zehua Liu, Tao Zhong

Hyper-Tune: Towards Efficient Hyper-parameter Tuning at Scale

The ever-growing demand and complexity of machine learning are putting pressure on hyper-parameter tuning systems: while the evaluation cost of models continues to increase, the scalability of state-of-the-arts starts to become a crucial…

Machine Learning · Computer Science 2022-01-19 Yang Li, Yu Shen, Huaijun Jiang, Wentao Zhang, Jixiang Li, Ji Liu, Ce Zhang, Bin Cui

From Black-Box Tuning to Guided Optimization via Hyperparameters Interaction Analysis

Hyperparameters tuning is a fundamental, yet computationally expensive, step in optimizing machine learning models. Beyond optimization, understanding the relative importance and interaction of hyperparameters is critical to efficient model…

Machine Learning · Computer Science 2025-12-23 Moncef Garouani, Ayah Barhrhouj

A Class of Parallel Doubly Stochastic Algorithms for Large-Scale Learning

We consider learning problems over training sets in which both, the number of training examples and the dimension of the feature vectors, are large. To solve these problems we propose the random parallel stochastic algorithm (RAPSA). We…

Machine Learning · Computer Science 2016-06-17 Aryan Mokhtari, Alec Koppel, Alejandro Ribeiro

Hybrid Algorithm Selection and Hyperparameter Tuning on Distributed Machine Learning Resources: A Hierarchical Agent-based Approach

Algorithm selection and hyperparameter tuning are critical steps in both academic and applied machine learning. On the other hand, these steps are becoming ever increasingly delicate due to the extensive rise in the number, diversity, and…

Machine Learning · Computer Science 2023-09-15 Ahmad Esmaeili, Julia T. Rayz, Eric T. Matson

SHADHO: Massively Scalable Hardware-Aware Distributed Hyperparameter Optimization

Computer vision is experiencing an AI renaissance, in which machine learning models are expediting important breakthroughs in academic research and commercial applications. Effectively training these models, however, is not trivial due in…

Machine Learning · Computer Science 2018-01-23 Jeff Kinnison, Nathaniel Kremer-Herman, Douglas Thain, Walter Scheirer

Adaptive Hyperparameter Optimization for Continual Learning Scenarios

Hyperparameter selection in continual learning scenarios is a challenging and underexplored aspect, especially in practical non-stationary environments. Traditional approaches, such as grid searches with held-out validation data from all…

Machine Learning · Computer Science 2024-06-21 Rudy Semola, Julio Hurtado, Vincenzo Lomonaco, Davide Bacciu

Parallel training of linear models without compromising convergence

In this paper we analyze, evaluate, and improve the performance of training generalized linear models on modern CPUs. We start with a state-of-the-art asynchronous parallel training algorithm, identify system-level performance bottlenecks,…

Machine Learning · Computer Science 2018-12-20 Nikolas Ioannou, Celestine Dünner, Kornilios Kourtis, Thomas Parnell

Autotune: A Derivative-free Optimization Framework for Hyperparameter Tuning

Machine learning applications often require hyperparameter tuning. The hyperparameters usually drive both the efficiency of the model training process and the resulting model quality. For hyperparameter tuning, machine learning algorithms…

Machine Learning · Computer Science 2018-08-06 Patrick Koch, Oleg Golovidov, Steven Gardner, Brett Wujek, Joshua Griffin, Yan Xu

Hyperparameter optimization of data-driven AI models on HPC systems

In the European Center of Excellence in Exascale computing "Research on AI- and Simulation-Based Engineering at Exascale" (CoE RAISE), researchers develop novel, scalable AI technologies towards Exascale. This work exercises High…

Data Analysis, Statistics and Probability · Physics 2023-03-01 Eric Wulff, Maria Girone, Joosep Pata

On Hyperparameter Optimization of Machine Learning Algorithms: Theory and Practice

Machine learning algorithms have been used widely in various applications and areas. To fit a machine learning model into different problems, its hyper-parameters must be tuned. Selecting the best hyper-parameter configuration for machine…

Machine Learning · Computer Science 2022-10-06 Li Yang, Abdallah Shami

Distributed Training Large-Scale Deep Architectures

Scale of data and scale of computation infrastructures together enable the current deep learning renaissance. However, training large-scale deep architectures demands both algorithmic improvement and careful system configuration. In this…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-09-21 Shang-Xuan Zou, Chun-Yen Chen, Jui-Lin Wu, Chun-Nan Chou, Chia-Chin Tsao, Kuan-Chieh Tung, Ting-Wei Lin, Cheng-Lung Sung, Edward Y. Chang

Model-Parallel Model Selection for Deep Learning Systems

As deep learning becomes more expensive, both in terms of time and compute, inefficiencies in machine learning (ML) training prevent practical usage of state-of-the-art models for most users. The newest model architectures are simply too…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-07-15 Kabir Nagrecha

An Adaptive Sampling-based Progressive Hedging Algorithm for Stochastic Programming

The progressive hedging algorithm (PHA) is a cornerstone among algorithms for large-scale stochastic programming problems. However, its traditional implementation is hindered by some limitations, including the requirement to solve all…

Optimization and Control · Mathematics 2025-03-13 Di Zhang, Yihang Zhang, Suvrajeet Sen

Hierarchical Collaborative Hyper-parameter Tuning

Hyper-parameter Tuning is among the most critical stages in building machine learning solutions. This paper demonstrates how multi-agent systems can be utilized to develop a distributed technique for determining near-optimal values for any…

Machine Learning · Computer Science 2022-05-12 Ahmad Esmaeili, Zahra Ghorrati, Eric Matson