Related papers: Tuning Algorithms and Generators for Efficient Edg…

Scaling Up Deep Neural Network Optimization for Edge Inference

Deep neural networks (DNNs) have been increasingly deployed on and integrated with edge devices, such as mobile phones, drones, robots and wearables. To run DNN inference directly on edge devices (a.k.a. edge inference) with a satisfactory…

Machine Learning · Computer Science 2020-09-18 Bingqian Lu, Jianyi Yang, Shaolei Ren

Latency optimized Deep Neural Networks (DNNs): An Artificial Intelligence approach at the Edge using Multiprocessor System on Chip (MPSoC)

Almost in every heavily computation-dependent application, from 6G communication systems to autonomous driving platforms, a large portion of computing should be near to the client side. Edge computing (AI at Edge) in mobile devices is one…

Hardware Architecture · Computer Science 2024-07-29 Seyed Nima Omidsajedi, Rekha Reddy, Jianming Yi, Jan Herbst, Christoph Lipps, Hans Dieter Schotten

A Precision-Scalable RISC-V DNN Processor with On-Device Learning Capability at the Extreme Edge

Extreme edge platforms, such as in-vehicle smart devices, require efficient deployment of quantized deep neural networks (DNNs) to enable intelligent applications with limited amounts of energy, memory, and computing resources. However,…

Hardware Architecture · Computer Science 2024-03-28 Longwei Huang, Chao Fang, Qiong Li, Jun Lin, Zhongfeng Wang

SECDA: Efficient Hardware/Software Co-Design of FPGA-based DNN Accelerators for Edge Inference

Edge computing devices inherently face tight resource constraints, which is especially apparent when deploying Deep Neural Networks (DNN) with high memory and compute demands. FPGAs are commonly available in edge devices. Since these…

Hardware Architecture · Computer Science 2021-10-04 Jude Haris, Perry Gibson, José Cano, Nicolas Bohm Agostini, David Kaeli

Partitioning and Deployment of Deep Neural Networks on Edge Clusters

Edge inference has become more widespread, as its diverse applications range from retail to wearable technology. Clusters of networked resource-constrained edge devices are becoming common, yet no system exists to split a DNN across these…

Networking and Internet Architecture · Computer Science 2023-04-25 Arjun Parthasarathy, Bhaskar Krishnamachari

Intelligence Beyond the Edge: Inference on Intermittent Embedded Systems

Energy-harvesting technology provides a promising platform for future IoT applications. However, since communication is very expensive in these devices, applications will require inference "beyond the edge" to avoid wasting precious energy…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-02-04 Graham Gobieski, Nathan Beckmann, Brandon Lucia

A Scalable RISC-V Vector Processor Enabling Efficient Multi-Precision DNN Inference

RISC-V processors encounter substantial challenges in deploying multi-precision deep neural networks (DNNs) due to their restricted precision support, constrained throughput, and suboptimal dataflow design. To tackle these challenges, a…

Hardware Architecture · Computer Science 2024-07-16 Chuanning Wang, Chao Fang, Xiao Wu, Zhongfeng Wang, Jun Lin

A 3 TOPS/W RISC-V Parallel Cluster for Inference of Fine-Grain Mixed-Precision Quantized Neural Networks

The emerging trend of deploying complex algorithms, such as Deep Neural Networks (DNNs), increasingly poses strict memory and energy efficiency requirements on Internet-of-Things (IoT) end-nodes. Mixed-precision quantization has been…

Hardware Architecture · Computer Science 2023-07-04 Alessandro Nadalini, Georg Rutishauser, Alessio Burrello, Nazareno Bruschi, Angelo Garofalo, Luca Benini, Francesco Conti, Davide Rossi

MaRVIn: A Cross-Layer Mixed-Precision RISC-V Framework for DNN Inference, from ISA Extension to Hardware Acceleration

The evolution of quantization and mixed-precision techniques has unlocked new possibilities for enhancing the speed and energy efficiency of NNs. Several recent studies indicate that adapting precision levels across different parameters can…

Machine Learning · Computer Science 2025-09-19 Giorgos Armeniakos, Alexis Maras, Sotirios Xydis, Dimitrios Soudris

Accelerating Training of Deep Neural Networks via Sparse Edge Processing

We propose a reconfigurable hardware architecture for deep neural networks (DNNs) capable of online training and inference, which uses algorithmically pre-determined, structured sparsity to significantly lower memory and computational…

Neural and Evolutionary Computing · Computer Science 2017-11-07 Sourya Dey, Yinan Shao, Keith M. Chugg, Peter A. Beerel

BCEdge: SLO-Aware DNN Inference Services with Adaptive Batching on Edge Platforms

As deep neural networks (DNNs) are being applied to a wide range of edge intelligent applications, it is critical for edge inference platforms to have both high-throughput and low-latency at the same time. Such edge platforms with multiple…

Machine Learning · Computer Science 2023-05-03 Ziyang Zhang, Huan Li, Yang Zhao, Changyao Lin, Jie Liu

SoC-Tuner: An Importance-guided Exploration Framework for DNN-targeting SoC Design

Designing a system-on-chip (SoC) for deep neural network (DNN) acceleration requires balancing multiple metrics such as latency, power, and area. However, most existing methods ignore the interactions among different SoC components and rely…

Hardware Architecture · Computer Science 2023-12-20 Shixin Chen, Su Zheng, Chen Bai, Wenqian Zhao, Shuo Yin, Yang Bai, Bei Yu

SPEED: A Scalable RISC-V Vector Processor Enabling Efficient Multi-Precision DNN Inference

Deploying deep neural networks (DNNs) on those resource-constrained edge platforms is hindered by their substantial computation and storage demands. Quantized multi-precision DNNs, denoted as MP-DNNs, offer a promising solution for these…

Hardware Architecture · Computer Science 2024-10-10 Chuanning Wang, Chao Fang, Xiao Wu, Zhongfeng Wang, Jun Lin

Scheduling Inference Workloads on Distributed Edge Clusters with Reinforcement Learning

Many real-time applications (e.g., Augmented/Virtual Reality, cognitive assistance) rely on Deep Neural Networks (DNNs) to process inference tasks. Edge computing is considered a key infrastructure to deploy such applications, as moving…

Machine Learning · Computer Science 2023-02-01 Gabriele Castellano, Juan-José Nieto, Jordi Luque, Ferrán Diego, Carlos Segura, Diego Perino, Flavio Esposito, Fulvio Risso, Aravindh Raman

Towards Intelligent Edge Sensing for ISCC Network: Joint Multi-Tier DNN Partitioning and Beamforming Design

The combination of Integrated Sensing and Communication (ISAC) and Mobile Edge Computing (MEC) enables devices to simultaneously sense the environment and offload data to the base stations (BS) for intelligent processing, thereby reducing…

Signal Processing · Electrical Eng. & Systems 2025-05-01 Peng Liu, Zesong Fei, Xinyi Wang, Xiaoyang Li, Weijie Yuan, Yuanhao Li, Cheng Hu, Dusit Niyato

A Reconfigurable Multiplier Architecture for Error-Resilient Applications in RISC-V Core

Neural Networks (NNs) have been widely adopted due to their outstanding efficacy and adaptability across computer vision and deep learning applications. The optimization of NNs is necessary to enable their deployment on energy constrained…

Hardware Architecture · Computer Science 2026-05-12 Pragun Jaswal, L. Hemanth Krishna, B. Srinivasu

Hardware/Software co-design with ADC-Less In-memory Computing Hardware for Spiking Neural Networks

Spiking Neural Networks (SNNs) are bio-plausible models that hold great potential for realizing energy-efficient implementations of sequential tasks on resource-constrained edge devices. However, commercial edge platforms based on standard…

Neural and Evolutionary Computing · Computer Science 2023-09-26 Marco Paul E. Apolinario, Adarsh Kumar Kosta, Utkarsh Saxena, Kaushik Roy

Edge AI: On-Demand Accelerating Deep Neural Network Inference via Edge Computing

As a key technology of enabling Artificial Intelligence (AI) applications in 5G era, Deep Neural Networks (DNNs) have quickly attracted widespread attention. However, it is challenging to run computation-intensive DNN-based tasks on mobile…

Networking and Internet Architecture · Computer Science 2019-10-14 En Li, Liekang Zeng, Zhi Zhou, Xu Chen

PowerFlow-DNN: Compiler-Directed Fine-Grained Power Orchestration for End-to-End Edge AI Inference

Edge AI systems often operate under stringent energy and volume constraints that demand extreme efficiency under limited battery capacity, with requirements worsening as intelligent capability demands advance. Prior literature suggests that…

Hardware Architecture · Computer Science 2026-03-26 Paul Chen, Jeongeun Kim, Wenbo Zhu, Yuanhan Li, Shunyao Huang, Chenjie Weng, Christopher Torng

An efficient and flexible inference system for serving heterogeneous ensembles of deep neural networks

Ensembles of Deep Neural Networks (DNNs) have achieved qualitative predictions but they are computing and memory intensive. Therefore, the demand is growing to make them answer a heavy workload of requests with available computational…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-08-31 Pierrick Pochelu, Serge G. Petiton, Bruno Conche