Related papers: Hyperparameter optimization with REINFORCE and Tra…

200 papers

An important step in the task of neural network design, such as hyper-parameter optimization (HPO) or neural architecture search (NAS), is the evaluation of a candidate model's performance. Given fixed computational resources, one can…

Machine Learning · Computer Science 2021-03-09 Shengcao Cao, Xiaofang Wang, Kris Kitani

Hyperparameter optimization (HPO) is generally treated as a bi-level optimization problem that involves fitting a (probabilistic) surrogate model to a set of observed hyperparameter responses, e.g. validation loss, and consequently…

Machine Learning · Computer Science 2021-10-18 Hadi S. Jomaa, Jonas Falkner, Lars Schmidt-Thieme

Transformer-based models have gained considerable attention in the field of physiological signal analysis. They leverage long-range dependencies and complex patterns in temporal signals, allowing them to achieve performance superior to…

Machine Learning · Computer Science 2025-12-01 Merey Orazaly, Fariza Temirkhanova, Jurn-Gyu Park

In this paper, we study two important problems in the automated design of neural networks -- Hyper-parameter Optimization (HPO), and Neural Architecture Search (NAS) -- through the lens of sparse recovery methods. In the first part of this…

Machine Learning · Computer Science 2020-07-09 Minsu Cho, Mohammadreza Soltani, Chinmay Hegde

Hyperparameter optimization (HPO) plays a critical role in improving model performance. Transformer-based HPO methods have shown great potential; however, existing approaches rely heavily on large-scale historical optimization trajectories…

Machine Learning · Computer Science 2025-09-23 Haoxin Guo, Jiawen Pan, Weixin Zhai

The Transformer architecture is ubiquitously used as the building block of large-scale autoregressive language models. However, finding architectures with the optimal trade-off between task performance (perplexity) and hardware constraints…

Making models algorithmically fairer in tabular data has been long studied, with techniques typically oriented towards fixes which usually take a neural model with an undesirable outcome and make changes to how the data are ingested, what…

Machine Learning · Computer Science 2023-10-19 Richeek Das, Samuel Dooley

Neural architecture search (NAS) has achieved remarkable results in deep neural network design. Differentiable architecture search converts the search over discrete architectures into a hyperparameter optimization problem which can be…

The benchmark datasets for neural architecture search (NAS) have been developed to alleviate the computationally expensive evaluation process and ensure a fair comparison. Recent NAS benchmarks only focus on architecture optimization,…

Machine Learning · Computer Science 2021-10-22 Yoichi Hirose, Nozomu Yoshinari, Shinichi Shirakawa

Designing effective architectures is one of the key factors behind the success of deep neural networks. Existing deep architectures are either manually designed or automatically searched by some Neural Architecture Search (NAS) methods.…

Computer Vision and Pattern Recognition · Computer Science 2021-02-23 Yong Guo, Yin Zheng, Mingkui Tan, Qi Chen, Zhipeng Li, Jian Chen, Peilin Zhao, Junzhou Huang

Recent one-shot Neural Architecture Search algorithms rely on training a hardware-agnostic super-network tailored to a specific task and then extracting efficient sub-networks for different hardware platforms. Popular approaches separate…

Machine Learning · Computer Science 2023-12-22 Sharath Nittur Sridhar, Maciej Szankin, Fang Chen, Sairam Sundaresan, Anthony Sarah

High-resolution representations (HR) are essential for dense prediction tasks such as segmentation, detection, and pose estimation. Learning HR representations is typically ignored in previous Neural Architecture Search (NAS) methods that…

Computer Vision and Pattern Recognition · Computer Science 2021-06-15 Mingyu Ding, Xiaochen Lian, Linjie Yang, Peng Wang, Xiaojie Jin, Zhiwu Lu, Ping Luo

The hyper-parameter optimization (HPO) process is imperative for finding the best-performing Convolutional Neural Networks (CNNs). The automation process of HPO is characterized by its sizable computational footprint and its lack of…

Machine Learning · Computer Science 2024-05-03 Ibrahim Shaer, Soodeh Nikan, Abdallah Shami

Reinforcement learning (RL) is a goal-oriented learning solution that has proven to be successful for Neural Architecture Search (NAS) on the CIFAR and ImageNet datasets. However, a limitation of this approach is its high computational…

Neural and Evolutionary Computing · Computer Science 2019-12-04 J. Gomez Robles, J. Vanschoren

Neural Architecture Search (NAS) for automatically finding the optimal network architecture has shown some success with competitive performances in various computer vision tasks. However, NAS in general requires a tremendous amount of…

Computer Vision and Pattern Recognition · Computer Science 2022-12-14 Bokyeung Lee, Kyungdeuk Ko, Jonghwan Hong, Hanseok Ko

Designing complex architectures has been an essential cogwheel in the revolution deep learning has brought about in the past decade. When solving difficult problems in a data-driven manner, a well-tried approach is to take an architecture…

Machine Learning · Computer Science 2021-10-14 Attila Nagy, Ábel Boros

Structural re-parameterization (Rep) methods have achieved significant performance improvements on traditional convolutional networks. Most current Rep methods rely on prior knowledge to select the reparameterization operations. However, the…

Artificial Intelligence · Computer Science 2022-07-05 Xinyi Yu, Xiaowei Wang, Jintao Rong, Mingyang Zhang, Linlin Ou

Prediction-based approaches are widely used in neural architecture search (NAS), where a predictor estimates the performance of candidate architectures to guide selection. However, existing predictors are typically trained via supervised…

Machine Learning · Computer Science 2026-05-12 Liping Deng, MingQing Xiao

Recently, numerous efficient Transformers have been proposed to reduce the quadratic computational complexity of standard Transformers caused by the Softmax attention. However, most of them simply swap Softmax with an efficient attention…

Computation and Language · Computer Science 2022-07-29 Zexiang Liu, Dong Li, Kaiyue Lu, Zhen Qin, Weixuan Sun, Jiacheng Xu, Yiran Zhong

With the extensive applications of machine learning models, automatic hyperparameter optimization (HPO) has become increasingly important. Motivated by the tuning behaviors of human experts, it is intuitive to leverage auxiliary knowledge…

Machine Learning · Computer Science 2022-06-07 Yang Li, Yu Shen, Huaijun Jiang, Wentao Zhang, Zhi Yang, Ce Zhang, Bin Cui