Related papers: Mind Mappings: Enabling Efficient Algorithm-Accele…

Demystifying Map Space Exploration for NPUs

Map Space Exploration is the problem of finding optimized mappings of a Deep Neural Network (DNN) model on an accelerator. It is known to be extremely computationally expensive, and there has been active research looking at both heuristics…

Machine Learning · Computer Science 2022-10-10 Sheng-Chun Kao, Angshuman Parashar, Po-An Tsai, Tushar Krishna

The Turbo-Charged Mapper: Fast and Optimal Mapping for Energy-efficient and Low-latency Accelerator Design

The energy and latency of an accelerator running a deep neural network (DNN) depend on how the computation and data movement are scheduled in the accelerator (i.e., mapping), and picking an optimal mapping is essential to achieve…

Hardware Architecture · Computer Science 2026-05-05 Michael Gilbert, Tanner Andrulis, Vivienne Sze, Joel S. Emer

Mapping Space Exploration for Multi-Chiplet Accelerators Targeting LLM Inference Serving Workloads

Large Language Models (LLMs) impose massive computational demands, driving the need for scalable multi-chiplet accelerators. However, existing mapping space exploration efforts for such accelerators primarily focus on traditional…

Hardware Architecture · Computer Science 2026-04-02 Boyu Li, Zongwei Zhu, Yi Xiong, Qianyue Cao, Jiawei Geng, Xiaonan Zhang, Xi Li

Fast and Fusiest: An Optimal Fusion-Aware Mapper for Accelerator Design

A low-latency and energy-efficient tensor algebra accelerator design must optimize how data movement and operations are scheduled (i.e., mapped) in the accelerator architecture. A key mapping optimization is fusion, meaning holding data…

Hardware Architecture · Computer Science 2026-05-05 Tanner Andrulis, Michael Gilbert, Vivienne Sze, Joel S. Emer

Fast-OverlaPIM: A Fast Overlap-driven Mapping Framework for Processing In-Memory Neural Network Acceleration

Processing in-memory (PIM) is promising to accelerate neural networks (NNs) because it minimizes data movement and provides large computational parallelism. Similar to machine learning accelerators, application mapping, which determines the…

Hardware Architecture · Computer Science 2024-07-02 Xuan Wang, Minxuan Zhou, Tajana Rosing

Pack my weights and run! Minimizing overheads for in-memory computing accelerators

In-memory computing hardware accelerators allow more than 10x improvements in peak efficiency and performance for matrix-vector multiplications (MVM) compared to conventional digital designs. For this, they have gained great interest for…

Hardware Architecture · Computer Science 2024-09-19 Pouya Houshmand, Marian Verhelst

A Semi-Decoupled Approach to Fast and Optimal Hardware-Software Co-Design of Neural Accelerators

In view of the performance limitations of fully-decoupled designs for neural architectures and accelerators, hardware-software co-design has been emerging to fully reap the benefits of flexible design spaces and optimize neural network…

Hardware Architecture · Computer Science 2022-03-29 Bingqian Lu, Zheyu Yan, Yiyu Shi, Shaolei Ren

Accelerating Path Planning for Autonomous Driving with Hardware-Assisted Memoization

Path planning for autonomous driving with dynamic obstacles poses a challenge because it needs to perform a higher-dimensional search (with time-dimension) while still meeting real-time constraints. This paper proposes an algorithm-hardware…

Robotics · Computer Science 2022-05-30 Mulong Luo, G. Edward Suh

Efficient Neural Architecture Search via Proximal Iterations

Neural architecture search (NAS) recently attracts much research attention because of its ability to identify better architectures than handcrafted ones. However, many NAS methods, which optimize the search process in a discrete search…

Machine Learning · Computer Science 2019-11-22 Quanming Yao, Ju Xu, Wei-Wei Tu, Zhanxing Zhu

Illuminating search spaces by mapping elites

Many fields use search algorithms, which automatically explore a search space to find high-performing solutions: chemists search through the space of molecules to discover new drugs; engineers search for stronger, cheaper, safer designs,…

Artificial Intelligence · Computer Science 2015-04-21 Jean-Baptiste Mouret, Jeff Clune

AutoSpace: Neural Architecture Search with Less Human Interference

Current neural architecture search (NAS) algorithms still require expert knowledge and effort to design a search space for network construction. In this paper, we consider automating the search space design to minimize human interference,…

Computer Vision and Pattern Recognition · Computer Science 2021-03-23 Daquan Zhou, Xiaojie Jin, Xiaochen Lian, Linjie Yang, Yujing Xue, Qibin Hou, Jiashi Feng

Approximating MAP using Local Search

MAP is the problem of finding a most probable instantiation of a set of variables in a Bayesian network, given evidence. Unlike computing marginals, posteriors, and MPE (a special case of MAP), the time and space complexity of MAP is not…

Artificial Intelligence · Computer Science 2013-01-14 James D. Park, Adnan Darwiche

A Full-Stack Search Technique for Domain Optimized Deep Learning Accelerators

The rapidly-changing deep learning landscape presents a unique opportunity for building inference accelerators optimized for specific datacenter-scale workloads. We propose Full-stack Accelerator Search Technique (FAST), a hardware…

Machine Learning · Computer Science 2022-02-02 Dan Zhang, Safeen Huda, Ebrahim Songhori, Kartik Prabhu, Quoc Le, Anna Goldie, Azalia Mirhoseini

Workload-Aware Hardware Accelerator Mining for Distributed Deep Learning Training

In this paper, we present a novel technique to search for hardware architectures of accelerators optimized for end-to-end training of deep neural networks (DNNs). Our approach addresses both single-device and distributed pipeline and tensor…

Hardware Architecture · Computer Science 2024-04-24 Muhammad Adnan, Amar Phanishayee, Janardhan Kulkarni, Prashant J. Nair, Divya Mahajan

State of Compact Architecture Search For Deep Neural Networks

The design of compact deep neural networks is a crucial task to enable widespread adoption of deep neural networks in the real-world, particularly for edge and mobile scenarios. Due to the time-consuming and challenging nature of manually…

Neural and Evolutionary Computing · Computer Science 2019-10-16 Mohammad Javad Shafiee, Andrew Hryniowski, Francis Li, Zhong Qiu Lin, Alexander Wong

LOCAL: Low-Complex Mapping Algorithm for Spatial DNN Accelerators

Deep neural networks are a promising solution for applications that solve problems based on learning data sets. DNN accelerators solve the processing bottleneck as a domain-specific processor. Like other hardware solutions, there must be…

Hardware Architecture · Computer Science 2022-11-08 Midia Reshadi, David Gregg

Learning Space Partitions for Path Planning

Path planning, the problem of efficiently discovering high-reward trajectories, often requires optimizing a high-dimensional and multimodal reward function. Popular approaches like CEM and CMA-ES greedily focus on promising regions of the…

Artificial Intelligence · Computer Science 2022-01-25 Kevin Yang, Tianjun Zhang, Chris Cummins, Brandon Cui, Benoit Steiner, Linnan Wang, Joseph E. Gonzalez, Dan Klein, Yuandong Tian

DeepMapping: Learned Data Mapping for Lossless Compression and Efficient Lookup

Storing tabular data to balance storage and query efficiency is a long-standing research question in the database community. In this work, we argue and show that a novel DeepMapping abstraction, which relies on the impressive memorization…

Databases · Computer Science 2024-09-27 Lixi Zhou, K. Selçuk Candan, Jia Zou

GOMA: Geometrically Optimal Mapping via Analytical Modeling for Spatial Accelerators

General matrix multiplication (GEMM) on spatial accelerators is highly sensitive to mapping choices in both execution efficiency and energy consumption. However, the mapping space exhibits combinatorial explosion, which makes it extremely…

Hardware Architecture · Computer Science 2026-03-24 Wulve Yang, Hailong Zou, Rui Zhou, Jionghao Zhang, Qiang Li, Gang Li, Yi Zhan, Shushan Qiao

Continual Learning Approach for Improving the Data and Computation Mapping in Near-Memory Processing System

The resurgence of near-memory processing (NMP) with the advent of big data has shifted the computation paradigm from processor-centric to memory-centric computing. To meet the bandwidth and capacity demands of memory-centric computing, 3D…

Hardware Architecture · Computer Science 2021-04-29 Pritam Majumder, Jiayi Huang, Sungkeun Kim, Abdullah Muzahid, Dylan Siegers, Chia-Che Tsai, Eun Jung Kim