Related papers: SHADHO: Massively Scalable Hardware-Aware Distribu…

A System for Massively Parallel Hyperparameter Tuning

Modern learning models are characterized by large hyperparameter spaces and long training times. These properties, coupled with the rise of parallel computing and the growing demand to productionize machine learning workloads, motivate the…

Machine Learning · Computer Science · 2020-03-17 · Liam Li, Kevin Jamieson, Afshin Rostamizadeh, Ekaterina Gonina, Moritz Hardt, Benjamin Recht, Ameet Talwalkar

Hyper-Parameter Optimization: A Review of Algorithms and Applications

Since deep neural networks were developed, they have made huge contributions to everyday lives. Machine learning provides more rational advice than humans are capable of in almost every aspect of daily life. However, despite this…

Machine Learning · Computer Science · 2020-03-13 · Tong Yu, Hong Zhu

Scaling Studies for Efficient Parameter Search and Parallelism for Large Language Model Pre-training

AI accelerator processing capabilities and memory constraints largely dictate the scale in which machine learning workloads (e.g., training and inference) can be executed within a desirable time frame. Training a state of the art,…

Distributed, Parallel, and Cluster Computing · Computer Science · 2023-10-12 · Michael Benington, Leo Phan, Chris Pierre Paul, Evan Shoemaker, Priyanka Ranade, Torstein Collett, Grant Hodgson Perez, Christopher Krieger

Supervised Online Hashing via Similarity Distribution Learning

Online hashing has attracted extensive research attention when facing streaming data. Most online hashing methods, learning binary codes based on pairwise similarities of training instances, fail to capture the semantic relationship, and…

Computer Vision and Pattern Recognition · Computer Science · 2019-06-03 · Mingbao Lin, Rongrong Ji, Shen Chen, Feng Zheng, Xiaoshuai Sun, Baochang Zhang, Liujuan Cao, Guodong Guo, Feiyue Huang

A Hardware-Aware Framework for Accelerating Neural Architecture Search Across Modalities

Recent advances in Neural Architecture Search (NAS) such as one-shot NAS offer the ability to extract specialized hardware-aware sub-network configurations from a task-specific super-network. While considerable effort has been employed…

Machine Learning · Computer Science · 2022-05-24 · Daniel Cummings, Anthony Sarah, Sharath Nittur Sridhar, Maciej Szankin, Juan Pablo Munoz, Sairam Sundaresan

DHO$_2$: Accelerating Distributed Hybrid Order Optimization via Model Parallelism and ADMM

Scaling deep neural network (DNN) training to more devices can reduce time-to-solution. However, it is impractical for users with limited computing resources. FOSI, as a hybrid order optimizer, converges faster than conventional optimizers…

Machine Learning · Computer Science · 2025-08-05 · Shunxian Gu, Chaoqun You, Bangbang Ren, Lailong Luo, Junxu Xia, Deke Guo

DC and SA: Robust and Efficient Hyperparameter Optimization of Multi-subnetwork Deep Learning Models

We present two novel hyperparameter optimization strategies for optimization of deep learning models with a modular architecture constructed of multiple subnetworks. As complex networks with multiple subnetworks become more frequently…

Machine Learning · Computer Science · 2022-02-25 · Alex H. Treacher, Albert Montillo

Workload-Aware Hardware Accelerator Mining for Distributed Deep Learning Training

In this paper, we present a novel technique to search for hardware architectures of accelerators optimized for end-to-end training of deep neural networks (DNNs). Our approach addresses both single-device and distributed pipeline and tensor…

Hardware Architecture · Computer Science · 2024-04-24 · Muhammad Adnan, Amar Phanishayee, Janardhan Kulkarni, Prashant J. Nair, Divya Mahajan

Hippo: Taming Hyper-parameter Optimization of Deep Learning with Stage Trees

Hyper-parameter optimization is crucial for pushing the accuracy of a deep learning model to its limits. A hyper-parameter optimization job, referred to as a study, involves numerous trials of training a model using different training…

Machine Learning · Computer Science · 2020-06-23 · Ahnjae Shin, Do Yoon Kim, Joo Seong Jeong, Byung-Gon Chun

Optimizing Distributed Training Approaches for Scaling Neural Networks

This paper presents a comparative analysis of distributed training strategies for large-scale neural networks, focusing on data parallelism, model parallelism, and hybrid approaches. We evaluate these strategies on image classification…

Distributed, Parallel, and Cluster Computing · Computer Science · 2025-04-01 · Vishnu Vardhan Baligodugula, Fathi Amsaad

A Hardware-Aware System for Accelerating Deep Neural Network Optimization

Recent advances in Neural Architecture Search (NAS) which extract specialized hardware-aware configurations (a.k.a. "sub-networks") from a hardware-agnostic "super-network" have become increasingly popular. While considerable effort has…

Artificial Intelligence · Computer Science · 2022-03-01 · Anthony Sarah, Daniel Cummings, Sharath Nittur Sridhar, Sairam Sundaresan, Maciej Szankin, Tristan Webb, J. Pablo Munoz

Multi-Objective Neural Architecture Search by Learning Search Space Partitions

Deploying deep learning models requires taking into consideration neural network metrics such as model size, inference latency, and #FLOPs, aside from inference accuracy. This results in deep learning model designers leveraging…

Machine Learning · Computer Science · 2024-08-20 · Yiyang Zhao, Linnan Wang, Tian Guo

Neural Networks Designing Neural Networks: Multi-Objective Hyper-Parameter Optimization

Artificial neural networks have gone through a recent rise in popularity, achieving state-of-the-art results in various fields, including image classification, speech recognition, and automated control. Both the performance and…

Neural and Evolutionary Computing · Computer Science · 2016-11-08 · Sean C. Smithson, Guang Yang, Warren J. Gross, Brett H. Meyer

SHEARer: Highly-Efficient Hyperdimensional Computing by Software-Hardware Enabled Multifold Approximation

Hyperdimensional computing (HD) is an emerging paradigm for machine learning based on the evidence that the brain computes on high-dimensional, distributed, representations of data. The main operation of HD is encoding, which transfers the…

Machine Learning · Computer Science · 2020-07-22 · Behnam Khaleghi, Sahand Salamat, Anthony Thomas, Fatemeh Asgarinejad, Yeseong Kim, Tajana Rosing

Far-HO: A Bilevel Programming Package for Hyperparameter Optimization and Meta-Learning

In (Franceschi et al., 2018) we proposed a unified mathematical framework, grounded on bilevel programming, that encompasses gradient-based hyperparameter optimization and meta-learning. We formulated an approximate version of the problem…

Mathematical Software · Computer Science · 2018-06-15 · Luca Franceschi, Riccardo Grazzi, Massimiliano Pontil, Saverio Salzo, Paolo Frasconi

HASS: Hardware-Aware Sparsity Search for Dataflow DNN Accelerator

Deep Neural Networks (DNNs) excel in learning hierarchical representations from raw data, such as images, audio, and text. To compute these DNN models with high performance and energy efficiency, these models are usually deployed onto…

Hardware Architecture · Computer Science · 2024-06-06 · Zhewen Yu, Sudarshan Sreeram, Krish Agrawal, Junyi Wu, Alexander Montgomerie-Corcoran, Cheng Zhang, Jianyi Cheng, Christos-Savvas Bouganis, Yiren Zhao

SSDH: Semi-supervised Deep Hashing for Large Scale Image Retrieval

Hashing methods have been widely used for efficient similarity retrieval on large scale image database. Traditional hashing methods learn hash functions to generate binary codes from hand-crafted features, which achieve limited accuracy…

Computer Vision and Pattern Recognition · Computer Science · 2017-11-08 · Jian Zhang, Yuxin Peng

HAP: SPMD DNN Training on Heterogeneous GPU Clusters with Automated Program Synthesis

Single-Program-Multiple-Data (SPMD) parallelism has recently been adopted to train large deep neural networks (DNNs). Few studies have explored its applicability on heterogeneous clusters, to fully exploit available resources for large…

Distributed, Parallel, and Cluster Computing · Computer Science · 2024-01-12 · Shiwei Zhang, Lansong Diao, Chuan Wu, Zongyan Cao, Siyu Wang, Wei Lin

A Novel Approach to Distributed Multi-Class SVM

With data sizes constantly expanding, and with classical machine learning algorithms that analyze such data requiring larger and larger amounts of computation time and storage space, the need to distribute computation and memory…

Machine Learning · Computer Science · 2015-12-08 · Aruna Govada, Shree Ranjani, Aditi Viswanathan, S. K. Sahay

Unsupervised Semantic Deep Hashing

In recent years, deep hashing methods have been proved to be efficient since it employs convolutional neural network to learn features and hashing codes simultaneously. However, these methods are mostly supervised. In real-world…

Computer Vision and Pattern Recognition · Computer Science · 2018-03-20 · Sheng Jin