Related papers: Scaling Laws for Hyperparameter Optimization

Automatic Setting of DNN Hyper-Parameters by Mixing Bayesian Optimization and Tuning Rules

Deep learning techniques play an increasingly important role in industrial and research environments due to their outstanding results. However, the large number of hyper-parameters to be set may lead to errors if they are set manually. The…

Machine Learning · Computer Science 2020-06-04 Michele Fraccaroli, Evelina Lamma, Fabrizio Riguzzi

Predictable Scale: Part I, Step Law -- Optimal Hyperparameter Scaling Law in Large Language Model Pretraining

The impressive capabilities of Large Language Models (LLMs) across diverse tasks are now well established, yet their effective deployment necessitates careful hyperparameter optimization. Although existing methods have explored the…

Machine Learning · Computer Science 2025-08-20 Houyi Li, Wenzhen Zheng, Qiufeng Wang, Hanshan Zhang, Zili Wang, Shijie Xuyang, Yuantao Fan, Zhenyu Ding, Haoying Wang, Ning Ding, Shuigeng Zhou, Xiangyu Zhang, Daxin Jiang

Scaling Laws Under the Microscope: Predicting Transformer Performance from Small Scale Experiments

Neural scaling laws define a predictable relationship between a model's parameter count and its performance after training in the form of a power law. However, most research to date has not explicitly investigated whether scaling laws can…

Computation and Language · Computer Science 2022-10-19 Maor Ivgi, Yair Carmon, Jonathan Berant

Hyper-Parameter Optimization: A Review of Algorithms and Applications

Since deep neural networks were developed, they have made huge contributions to everyday lives. Machine learning provides more rational advice than humans are capable of in almost every aspect of daily life. However, despite this…

Machine Learning · Computer Science 2020-03-13 Tong Yu, Hong Zhu

On Hyperparameter Optimization of Machine Learning Algorithms: Theory and Practice

Machine learning algorithms have been used widely in various applications and areas. To fit a machine learning model into different problems, its hyper-parameters must be tuned. Selecting the best hyper-parameter configuration for machine…

Machine Learning · Computer Science 2022-10-06 Li Yang, Abdallah Shami

Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges

Most machine learning algorithms are configured by one or several hyperparameters that must be carefully chosen and often considerably impact performance. To avoid a time consuming and unreproducible manual trial-and-error process to find…

Machine Learning · Statistics 2021-11-29 Bernd Bischl, Martin Binder, Michel Lang, Tobias Pielok, Jakob Richter, Stefan Coors, Janek Thomas, Theresa Ullmann, Marc Becker, Anne-Laure Boulesteix, Difan Deng, Marius Lindauer

Navigating Scaling Laws: Compute Optimality in Adaptive Model Training

In recent years, the state-of-the-art in deep learning has been dominated by very large models that have been pre-trained on vast amounts of data. The paradigm is very simple: investing more computational resources (optimally) leads to…

Machine Learning · Computer Science 2024-05-24 Sotiris Anagnostidis, Gregor Bachmann, Imanol Schlag, Thomas Hofmann

Optimization Hyper-parameter Laws for Large Language Models

Large Language Models have driven significant AI advancements, yet their training is resource-intensive and highly sensitive to hyper-parameter selection. While scaling laws provide valuable guidance on model size and data requirements,…

Machine Learning · Computer Science 2026-05-21 Xingyu Xie, Kuangyu Ding, Shuicheng Yan, Kim-Chuan Toh, Tianwen Wei

Deriving Hyperparameter Scaling Laws via Modern Optimization Theory

Hyperparameter transfer has become an important component of modern large-scale training recipes. Existing methods, such as muP, primarily focus on transfer between model sizes, with transfer across batch sizes and training horizons often…

Machine Learning · Computer Science 2026-03-18 Egor Shulgin, Dimitri von Rütte, Tianyue H. Zhang, Niccolò Ajroldi, Bernhard Schölkopf, Antonio Orvieto

Hyperparameter Tuning for Deep Reinforcement Learning Applications

Reinforcement learning (RL) applications, where an agent can simply learn optimal behaviors by interacting with the environment, are quickly gaining tremendous success in a wide variety of applications from controlling simple pendulums to…

Machine Learning · Computer Science 2022-01-28 Mariam Kiran, Melis Ozyildirim

DC and SA: Robust and Efficient Hyperparameter Optimization of Multi-subnetwork Deep Learning Models

We present two novel hyperparameter optimization strategies for optimization of deep learning models with a modular architecture constructed of multiple subnetworks. As complex networks with multiple subnetworks become more frequently…

Machine Learning · Computer Science 2022-02-25 Alex H. Treacher, Albert Montillo

Using Large Language Models for Hyperparameter Optimization

This paper explores the use of foundational large language models (LLMs) in hyperparameter optimization (HPO). Hyperparameters are critical in determining the effectiveness of machine learning models, yet their optimization often relies on…

Machine Learning · Computer Science 2024-11-12 Michael R. Zhang, Nishkrit Desai, Juhan Bae, Jonathan Lorraine, Jimmy Ba

On the Importance of Hyperparameter Optimization for Model-based Reinforcement Learning

Model-based Reinforcement Learning (MBRL) is a promising framework for learning control in a data-efficient manner. MBRL algorithms can be fairly complex due to the separate dynamics modeling and the subsequent planning algorithm, and as a…

Machine Learning · Computer Science 2021-03-01 Baohe Zhang, Raghu Rajan, Luis Pineda, Nathan Lambert, André Biedenkapp, Kurtland Chua, Frank Hutter, Roberto Calandra

Scaling Laws for Differentially Private Language Models

Scaling laws have emerged as important components of large language model (LLM) training as they can predict performance gains through scale, and provide guidance on important hyper-parameter choices that would otherwise be expensive. LLMs…

Machine Learning · Computer Science 2025-02-03 Ryan McKenna, Yangsibo Huang, Amer Sinha, Borja Balle, Zachary Charles, Christopher A. Choquette-Choo, Badih Ghazi, George Kaissis, Ravi Kumar, Ruibo Liu, Da Yu, Chiyuan Zhang

Training Deep Neural Networks by optimizing over nonlocal paths in hyperparameter space

Hyperparameter optimization is both a practical issue and an interesting theoretical problem in training of deep architectures. Despite many recent advances the most commonly used methods almost universally involve training multiple and…

Machine Learning · Computer Science 2019-09-10 Vlad Pushkarov, Jonathan Efroni, Mykola Maksymenko, Maciej Koch-Janusz

Hyperparameters in Reinforcement Learning and How To Tune Them

In order to improve reproducibility, deep reinforcement learning (RL) has been adopting better scientific practices such as standardized evaluation metrics and reporting. However, the process of hyperparameter optimization still varies…

Machine Learning · Computer Science 2023-06-05 Theresa Eimer, Marius Lindauer, Roberta Raileanu

Hyperparameter optimization in deep multi-target prediction

As a result of the ever increasing complexity of configuring and fine-tuning machine learning models, the field of automated machine learning (AutoML) has emerged over the past decade. However, software implementations like Auto-WEKA and…

Machine Learning · Computer Science 2022-11-09 Dimitrios Iliadis, Marcel Wever, Bernard De Baets, Willem Waegeman

How to Upscale Neural Networks with Scaling Law? A Survey and Practical Guidelines

Neural scaling laws have revolutionized the design and optimization of large-scale AI models by revealing predictable relationships between model size, dataset volume, and computational resources. Early research established power-law…

Computation and Language · Computer Science 2025-05-28 Ayan Sengupta, Yash Goel, Tanmoy Chakraborty

Scaling Laws for Neural Material Models

Predicting material properties is crucial for designing better batteries, semiconductors, and medical devices. Deep learning helps scientists quickly find promising materials by predicting their energy, forces, and stresses. Companies scale…

Machine Learning · Computer Science 2025-09-29 Akshay Trikha, Kyle Chu, Advait Gosai, Parker Szachta, Eric Weiner

Revisiting Neural Scaling Laws in Language and Vision

The remarkable progress in deep learning in recent years is largely driven by improvements in scale, where bigger models are trained on larger datasets for longer schedules. To predict the benefit of scale empirically, we argue for a more…

Machine Learning · Computer Science 2022-11-02 Ibrahim Alabdulmohsin, Behnam Neyshabur, Xiaohua Zhai