Related papers: Hyperparameter optimization with REINFORCE and Tra…

Efficient Model Performance Estimation via Feature Histories

An important step in the task of neural network design, such as hyper-parameter optimization (HPO) or neural architecture search (NAS), is the evaluation of a candidate model's performance. Given fixed computational resources, one can…

Machine Learning · Computer Science 2021-03-09 Shengcao Cao, Xiaofang Wang, Kris Kitani

Improving Hyperparameter Optimization by Planning Ahead

Hyperparameter optimization (HPO) is generally treated as a bi-level optimization problem that involves fitting a (probabilistic) surrogate model to a set of observed hyperparameter responses, e.g. validation loss, and consequently…

Machine Learning · Computer Science 2021-10-18 Hadi S. Jomaa, Jonas Falkner, Lars Schmidt-Thieme

Efficient-Husformer: Efficient Multimodal Transformer Hyperparameter Optimization for Stress and Cognitive Loads

Transformer-based models have gained considerable attention in the field of physiological signal analysis. They leverage long-range dependencies and complex patterns in temporal signals, allowing them to achieve performance superior to…

Machine Learning · Computer Science 2025-12-01 Merey Orazaly, Fariza Temirkhanova, Jurn-Gyu Park

Hyperparameter Optimization in Neural Networks via Structured Sparse Recovery

In this paper, we study two important problems in the automated design of neural networks -- Hyper-parameter Optimization (HPO), and Neural Architecture Search (NAS) -- through the lens of sparse recovery methods. In the first part of this…

Machine Learning · Computer Science 2020-07-09 Minsu Cho, Mohammadreza Soltani, Chinmay Hegde

GRPOformer: Advancing Hyperparameter Optimization via Group Relative Policy Optimization

Hyperparameter optimization (HPO) plays a critical role in improving model performance. Transformer-based HPO methods have shown great potential; however, existing approaches rely heavily on large-scale historical optimization trajectories…

Machine Learning · Computer Science 2025-09-23 Haoxin Guo, Jiawen Pan, Weixin Zhai

LiteTransformerSearch: Training-free Neural Architecture Search for Efficient Language Models

The Transformer architecture is ubiquitously used as the building block of large-scale autoregressive language models. However, finding architectures with the optimal trade-off between task performance (perplexity) and hardware constraints…

Machine Learning · Computer Science 2022-10-19 Mojan Javaheripi, Gustavo H. de Rosa, Subhabrata Mukherjee, Shital Shah, Tomasz L. Religa, Caio C. T. Mendes, Sebastien Bubeck, Farinaz Koushanfar, Debadeepta Dey

Fairer and More Accurate Tabular Models Through NAS

Making models algorithmically fairer in tabular data has long been studied, with techniques typically oriented towards fixes which usually take a neural model with an undesirable outcome and make changes to how the data are ingested, what…

Machine Learning · Computer Science 2023-10-19 Richeek Das, Samuel Dooley

L$^{2}$NAS: Learning to Optimize Neural Architectures via Continuous-Action Reinforcement Learning

Neural architecture search (NAS) has achieved remarkable results in deep neural network design. Differentiable architecture search converts the search over discrete architectures into a hyperparameter optimization problem which can be…

Machine Learning · Computer Science 2021-09-28 Keith G. Mills, Fred X. Han, Mohammad Salameh, Seyed Saeed Changiz Rezaei, Linglong Kong, Wei Lu, Shuo Lian, Shangling Jui, Di Niu

NAS-HPO-Bench-II: A Benchmark Dataset on Joint Optimization of Convolutional Neural Network Architecture and Training Hyperparameters

The benchmark datasets for neural architecture search (NAS) have been developed to alleviate the computationally expensive evaluation process and ensure a fair comparison. Recent NAS benchmarks only focus on architecture optimization,…

Machine Learning · Computer Science 2021-10-22 Yoichi Hirose, Nozomu Yoshinari, Shinichi Shirakawa

Towards Accurate and Compact Architectures via Neural Architecture Transformer

Designing effective architectures is one of the key factors behind the success of deep neural networks. Existing deep architectures are either manually designed or automatically searched by some Neural Architecture Search (NAS) methods.…

Computer Vision and Pattern Recognition · Computer Science 2021-02-23 Yong Guo, Yin Zheng, Mingkui Tan, Qi Chen, Zhipeng Li, Jian Chen, Peilin Zhao, Junzhou Huang

SimQ-NAS: Simultaneous Quantization Policy and Neural Architecture Search

Recent one-shot Neural Architecture Search algorithms rely on training a hardware-agnostic super-network tailored to a specific task and then extracting efficient sub-networks for different hardware platforms. Popular approaches separate…

Machine Learning · Computer Science 2023-12-22 Sharath Nittur Sridhar, Maciej Szankin, Fang Chen, Sairam Sundaresan, Anthony Sarah

HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers

High-resolution representations (HR) are essential for dense prediction tasks such as segmentation, detection, and pose estimation. Learning HR representations is typically ignored in previous Neural Architecture Search (NAS) methods that…

Computer Vision and Pattern Recognition · Computer Science 2021-06-15 Mingyu Ding, Xiaochen Lian, Linjie Yang, Peng Wang, Xiaojie Jin, Zhiwu Lu, Ping Luo

Efficient Transformer-based Hyper-parameter Optimization for Resource-constrained IoT Environments

The hyper-parameter optimization (HPO) process is imperative for finding the best-performing Convolutional Neural Networks (CNNs). The automation process of HPO is characterized by its sizable computational footprint and its lack of…

Machine Learning · Computer Science 2024-05-03 Ibrahim Shaer, Soodeh Nikan, Abdallah Shami

Learning to reinforcement learn for Neural Architecture Search

Reinforcement learning (RL) is a goal-oriented learning solution that has proven to be successful for Neural Architecture Search (NAS) on the CIFAR and ImageNet datasets. However, a limitation of this approach is its high computational…

Neural and Evolutionary Computing · Computer Science 2019-12-04 J. Gomez Robles, J. Vanschoren

Single Cell Training on Architecture Search for Image Denoising

Neural Architecture Search (NAS) for automatically finding the optimal network architecture has shown some success with competitive performances in various computer vision tasks. However, NAS in general requires a tremendous amount of…

Computer Vision and Pattern Recognition · Computer Science 2022-12-14 Bokyeung Lee, Kyungdeuk Ko, Jonghwan Hong, Hanseok Ko

Improving the sample-efficiency of neural architecture search with reinforcement learning

Designing complex architectures has been an essential cogwheel in the revolution deep learning has brought about in the past decade. When solving difficult problems in a data-driven manner, a well-tried approach is to take an architecture…

Machine Learning · Computer Science 2021-10-14 Attila Nagy, Ábel Boros

Efficient Re-parameterization Operations Search for Easy-to-Deploy Network Based on Directional Evolutionary Strategy

Structural re-parameterization (Rep) methods have achieved significant performance improvements on traditional convolutional networks. Most current Rep methods rely on prior knowledge to select the re-parameterization operations. However, the…

Artificial Intelligence · Computer Science 2022-07-05 Xinyi Yu, Xiaowei Wang, Jintao Rong, Mingyang Zhang, Linlin Ou

From Regression to Inference: Meta-Learning Predictors for Neural Architecture Search

Prediction-based approaches are widely used in neural architecture search (NAS), where a predictor estimates the performance of candidate architectures to guide selection. However, existing predictors are typically trained via supervised…

Machine Learning · Computer Science 2026-05-12 Liping Deng, MingQing Xiao

Neural Architecture Search on Efficient Transformers and Beyond

Recently, numerous efficient Transformers have been proposed to reduce the quadratic computational complexity of standard Transformers caused by the Softmax attention. However, most of them simply swap Softmax with an efficient attention…

Computation and Language · Computer Science 2022-07-29 Zexiang Liu, Dong Li, Kaiyue Lu, Zhen Qin, Weixuan Sun, Jiacheng Xu, Yiran Zhong

TransBO: Hyperparameter Optimization via Two-Phase Transfer Learning

With the extensive applications of machine learning models, automatic hyperparameter optimization (HPO) has become increasingly important. Motivated by the tuning behaviors of human experts, it is intuitive to leverage auxiliary knowledge…

Machine Learning · Computer Science 2022-06-07 Yang Li, Yu Shen, Huaijun Jiang, Wentao Zhang, Zhi Yang, Ce Zhang, Bin Cui