Related papers: Heterogeneous Dataflow Accelerators for Multi-DNN …

HYDRA: Hybrid Data Multiplexing and Run-time Layer Configurable DNN Accelerator

Deep neural networks (DNNs) offer plenty of challenges in executing efficient computation at edge nodes, primarily due to the huge hardware resource demands. The article proposes HYDRA, hybrid data multiplexing, and runtime layer…

Hardware Architecture · Computer Science 2026-03-31 Sonu Kumar, Komal Gupta, Gopal Raut, Mukul Lokhande, Santosh Kumar Vishvakarma

DORA: Dataflow-Instruction Orchestration Architecture for DNN Acceleration

As deep neural networks develop significantly more diverse and complex, achieving high performance and efficiency on complicated DNN models faces pressing challenges. Modern DNN workloads are increasingly diverse in operation types, tensor…

Hardware Architecture · Computer Science 2026-05-25 Xingzhen Chen, Zhuoping Yang, Jinming Zhuang, Shixin Ji, Sarah Schultz, Zheng Dong, Weisong Shi, Peipei Zhou

HADAS: Hardware-Aware Dynamic Neural Architecture Search for Edge Performance Scaling

Dynamic neural networks (DyNNs) have become viable techniques to enable intelligence on resource-constrained edge devices while maintaining computational efficiency. In many cases, the implementation of DyNNs can be sub-optimal due to its…

Machine Learning · Computer Science 2022-12-08 Halima Bouzidi, Mohanad Odema, Hamza Ouarnoughi, Mohammad Abdullah Al Faruque, Smail Niar

A Hierarchical WDM-based Scalable Data Center Network Architecture

Massive data centers are at the heart of the Internet. The rapid growth of Internet traffic and the abundance of rich data-driven applications have raised the need for enormous network bandwidth. Towards meeting this growing traffic demand,…

Networking and Internet Architecture · Computer Science 2019-01-29 Maotong Xu, Jelena Diakonikolas, Eytan Modiano, Suresh Subramaniam

InTAR: Inter-Task Auto-Reconfigurable Accelerator Design for High Data Volume Variation in DNNs

The rise of deep neural networks (DNNs) has driven an increased demand for computing power and memory. Modern DNNs exhibit high data volume variation (HDV) across tasks, which poses challenges for FPGA acceleration: conventional…

Hardware Architecture · Computer Science 2025-04-08 Zifan He, Anderson Truong, Yingqi Cao, Jason Cong

HASS: Hardware-Aware Sparsity Search for Dataflow DNN Accelerator

Deep Neural Networks (DNNs) excel in learning hierarchical representations from raw data, such as images, audio, and text. To compute these DNN models with high performance and energy efficiency, these models are usually deployed onto…

Hardware Architecture · Computer Science 2024-06-06 Zhewen Yu, Sudarshan Sreeram, Krish Agrawal, Junyi Wu, Alexander Montgomerie-Corcoran, Cheng Zhang, Jianyi Cheng, Christos-Savvas Bouganis, Yiren Zhao

Enabling Flexibility for Sparse Tensor Acceleration via Heterogeneity

Recently, numerous sparse hardware accelerators for Deep Neural Networks (DNNs), Graph Neural Networks (GNNs), and scientific computing applications have been proposed. A common characteristic among all of these accelerators is that they…

Hardware Architecture · Computer Science 2022-01-25 Eric Qin, Raveesh Garg, Abhimanyu Bambhaniya, Michael Pellauer, Angshuman Parashar, Sivasankaran Rajamanickam, Cong Hao, Tushar Krishna

Dynamic DNNs Meet Runtime Resource Management on Mobile and Embedded Platforms

Deep neural network (DNN) inference is increasingly being executed on mobile and embedded platforms due to low latency and better privacy. However, efficient deployment on these platforms is challenging due to the intensive computation and…

Hardware Architecture · Computer Science 2022-06-08 Lei Xun, Bashir M. Al-Hashimi, Jonathon Hare, Geoff V. Merrett

HIDA: A Hierarchical Dataflow Compiler for High-Level Synthesis

Dataflow architectures are growing in popularity due to their potential to mitigate the challenges posed by the memory wall inherent to the Von Neumann architecture. At the same time, high-level synthesis (HLS) has demonstrated its efficacy…

Hardware Architecture · Computer Science 2023-11-08 Hanchen Ye, Hyegang Jun, Deming Chen

Data Allocation in a Heterogeneous Disk Array - HDA with Multiple RAID Levels for Database Applications

We consider the allocation of Virtual Arrays (VAs) in a Heterogeneous Disk Array (HDA). Each VA holds groups of related objects and datasets such as files, relational tables, which has similar performance and availability characteristics.…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-10-19 Alexander Thomasian, Jun Xu

HyPar: Towards Hybrid Parallelism for Deep Learning Accelerator Array

With the rise of artificial intelligence in recent years, Deep Neural Networks (DNNs) have been widely used in many domains. To achieve high performance and energy efficiency, hardware acceleration (especially inference) of DNNs is…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-01-17 Linghao Song, Jiachen Mao, Youwei Zhuo, Xuehai Qian, Hai Li, Yiran Chen

Resource Heterogeneity-Aware and Utilization-Enhanced Scheduling for Deep Learning Clusters

Scheduling deep learning (DL) models to train on powerful clusters with accelerators like GPUs and TPUs, presently falls short, either lacking fine-grained heterogeneity awareness or leaving resources substantially under-utilized. To fill…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-03-17 Abeda Sultana, Nabin Pakka, Fei Xu, Xu Yuan, Li Chen, Nian-Feng Tzeng

Optimizing Multi-DNN Inference on Mobile Devices through Heterogeneous Processor Co-Execution

Deep Neural Networks (DNNs) are increasingly deployed across diverse industries, driving demand for mobile device support. However, existing mobile inference frameworks often rely on a single processor per model, limiting hardware…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-03-28 Yunquan Gao, Zhiguo Zhang, Praveen Kumar Donta, Chinmaya Kumar Dehury, Xiujun Wang, Dusit Niyato, Qiyang Zhang

Stream: Design Space Exploration of Layer-Fused DNNs on Heterogeneous Dataflow Accelerators

As the landscape of deep neural networks evolves, heterogeneous dataflow accelerators, in the form of multi-core architectures or chiplet-based designs, promise more flexibility and higher inference performance through scalability. So far,…

Hardware Architecture · Computer Science 2025-10-08 Arne Symons, Linyan Mei, Steven Colleman, Pouya Houshmand, Sebastian Karl, Marian Verhelst

Exploration of Systolic-Vector Architecture with Resource Scheduling for Dynamic ML Workloads

As artificial intelligence (AI) and machine learning (ML) technologies disrupt a wide range of industries, cloud datacenters face ever-increasing demand in inference workloads. However, conventional CPU-based servers cannot handle excessive…

Hardware Architecture · Computer Science 2022-06-08 Jung-Hoon Kim, Sungyeob Yoo, Seungjae Moon, Joo-Young Kim

Real-time Hyper-Dimensional Reconfiguration at the Edge using Hardware Accelerators

In this paper we present Hyper-Dimensional Reconfigurable Analytics at the Tactical Edge (HyDRATE) using low-SWaP embedded hardware that can perform real-time reconfiguration at the edge leveraging non-MAC (free of floating-point…

Computer Vision and Pattern Recognition · Computer Science 2022-06-13 Indhumathi Kandaswamy, Saurabh Farkya, Zachary Daniels, Gooitzen van der Wal, Aswin Raghavan, Yuzheng Zhang, Jun Hu, Michael Lomnitz, Michael Isnardi, David Zhang, Michael Piacentino

DRACO: Co-Optimizing Hardware Utilization, and Performance of DNNs on Systolic Accelerator

The number of processing elements (PEs) in a fixed-sized systolic accelerator is well matched for large and compute-bound DNNs; whereas, memory-bound DNNs suffer from PE underutilization and fail to achieve peak performance and energy…

Signal Processing · Electrical Eng. & Systems 2020-06-29 Nandan Kumar Jha, Shreyas Ravishankar, Sparsh Mittal, Arvind Kaushik, Dipan Mandal, Mahesh Chandra

MARS: Exploiting Multi-Level Parallelism for DNN Workloads on Adaptive Multi-Accelerator Systems

Along with the fast evolution of deep neural networks, the hardware system is also developing rapidly. As a promising solution achieving high scalability and low manufacturing cost, multi-accelerator systems widely exist in data centers,…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-07-25 Guan Shen, Jieru Zhao, Zeke Wang, Zhe Lin, Wenchao Ding, Chentao Wu, Quan Chen, Minyi Guo

Dynamic Resource Partitioning for Multi-Tenant Systolic Array Based DNN Accelerator

Deep neural networks (DNN) have become significant applications in both cloud-server and edge devices. Meanwhile, the growing number of DNNs on those platforms raises the need to execute multiple DNNs on the same device. This paper proposes…

Hardware Architecture · Computer Science 2023-02-22 Midia Reshadi, David Gregg

Deep Adaptive Rate Allocation in Volatile Heterogeneous Wireless Networks

Modern multi-access 5G+ networks provide mobile terminals with additional capacity, improving network stability and performance. However, in highly mobile environments such as vehicular networks, supporting multi-access connectivity remains…

Information Theory · Computer Science 2026-03-24 Gregorio Maglione, Veselin Rakocevic, Markus Amend, Touraj Soleymani