English
Related papers

Related papers: A Scalable RISC-V Vector Processor Enabling Effici…

200 papers

Deploying deep neural networks (DNNs) on those resource-constrained edge platforms is hindered by their substantial computation and storage demands. Quantized multi-precision DNNs, denoted as MP-DNNs, offer a promising solution for these…

Hardware Architecture · Computer Science 2024-10-10 Chuanning Wang , Chao Fang , Xiao Wu , Zhongfeng Wang , Jun Lin

Handling vast amounts of data is crucial in today's world. The growth of high-performance computing has created a need for parallelization, particularly in the area of machine learning algorithms such as ANN (Approximate Nearest Neighbors).…

Machine Learning · Computer Science 2024-07-19 Konstantin Rumyantsev , Pavel Yakovlev , Andrey Gorshkov , Andrey P. Sokolov

Extreme edge platforms, such as in-vehicle smart devices, require efficient deployment of quantized deep neural networks (DNNs) to enable intelligent applications with limited amounts of energy, memory, and computing resources. However,…

Hardware Architecture · Computer Science 2024-03-28 Longwei Huang , Chao Fang , Qiong Li , Jun Lin , Zhongfeng Wang

The proliferation of edge devices necessitates efficient computational architectures for lightweight tasks, particularly deep neural network (DNN) inference. Traditional NPUs, though effective for such operations, face challenges in power,…

Hardware Architecture · Computer Science 2024-07-04 Won Hyeok Kim , Hyeong Jin Kim , Tae Hee Han

Low bit-width Quantized Neural Networks (QNNs) enable deployment of complex machine learning models on constrained devices such as microcontrollers (MCUs) by reducing their memory footprint. Fine-grained asymmetric quantization (i.e.,…

Hardware Architecture · Computer Science 2020-10-09 Gianmarco Ottavi , Angelo Garofalo , Giuseppe Tagliavini , Francesco Conti , Luca Benini , Davide Rossi

The emergence of heterogeneity and domain-specific architectures targeting deep learning inference show great potential for enabling the deployment of modern CNNs on resource-constrained embedded platforms. A significant development is the…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-25 Dmitri Lyalikov

Spiking Neural Networks (SNNs) have gained significant attention in edge computing due to their low power consumption and computational efficiency. However, existing implementations either use conventional System on Chip (SoC) architectures…

Hardware Architecture · Computer Science 2026-03-13 Kanishka Gunawardana , Sanka Peeris , Kavishka Rambukwella , Thamish Wanduragala , Saadia Jameel , Roshan Ragel , Isuru Nawinne

The evolution of quantization and mixed-precision techniques has unlocked new possibilities for enhancing the speed and energy efficiency of NNs. Several recent studies indicate that adapting precision levels across different parameters can…

Machine Learning · Computer Science 2025-09-19 Giorgos Armeniakos , Alexis Maras , Sotirios Xydis , Dimitrios Soudris

The customizability of RISC-V makes it an attractive choice for accelerating deep neural networks (DNNs). It can be achieved through instruction set extensions and corresponding custom functional units. Yet, efficiently exploiting these…

Machine Learning · Computer Science 2025-04-29 Muhammad Sabih , Abrarul Karim , Jakob Wittmann , Frank Hannig , Jürgen Teich

The emerging trend of deploying complex algorithms, such as Deep Neural Networks (DNNs), increasingly poses strict memory and energy efficiency requirements on Internet-of-Things (IoT) end-nodes. Mixed-precision quantization has been…

RISC-V CPUs leverage the RVV (RISC-V Vector) extension to accelerate data-parallel workloads. In addition to arithmetic operations, RVV includes powerful permutation instructions that enable flexible element rearrangement within vector…

Hardware Architecture · Computer Science 2025-06-02 Vasileios Titopoulos , George Alexakis , Chrysostomos Nicopoulos , Giorgos Dimitrakopoulos

We present a DNN accelerator that allows inference at arbitrary precision with dedicated processing elements that are configurable at the bit level. Our DNN accelerator has 8 Processing Elements controlled by a RISC-V controller with a…

Hardware Architecture · Computer Science 2023-01-03 Mohammadhossein Askarihemmat , Sean Wagner , Olexa Bilaniuk , Yassine Hariri , Yvon Savaria , Jean-Pierre David

Spiking Neural Networks (SNNs) have the potential to drastically reduce the energy requirements of AI systems. However, mainstream accelerators like GPUs and TPUs are designed for the high arithmetic intensity of standard ANNs so are not…

Neural and Evolutionary Computing · Computer Science 2025-07-15 Zainab Aizaz , James C. Knight , Thomas Nowotny

A surge in artificial intelligence and autonomous technologies have increased the demand toward enhanced edge-processing capabilities. Computational complexity and size of state-of-the-art Deep Neural Networks (DNNs) are rising…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-05-12 Rawan Naous , Lazar Supic , Yoonhwan Kang , Ranko Sredojevic , Anish Singhani , Vladimir Stojanovic

Convolutional Neural Networks (CNNs) are used in a wide range of applications, with full-precision CNNs achieving high accuracy at the expense of portability. Recent progress in quantization techniques has demonstrated that sub-byte…

RISC-V provides a flexible and scalable platform for applications ranging from embedded devices to high-performance computing clusters. Particularly, its RISC-V Vector Extension (RVV) becomes of interest for the acceleration of AI…

Machine Learning · Computer Science 2025-08-20 Federico Nicolas Peccia , Frederik Haxel , Oliver Bringmann

The increasing demand for on-device intelligence in Edge AI and TinyML applications requires the efficient execution of modern Convolutional Neural Networks (CNNs). While lightweight architectures like MobileNetV2 employ Depthwise Separable…

Hardware Architecture · Computer Science 2025-11-27 Muhammed Yildirim , Ozcan Ozturk

Neural Networks (NNs) have been widely adopted due to their outstanding efficacy and adaptability across computer vision and deep learning applications. The optimization of NNs is necessary to enable their deployment on energy constrained…

Hardware Architecture · Computer Science 2026-05-12 Pragun Jaswal , L. Hemanth Krishna , B. Srinivasu

Machine learning based on neural networks has advanced rapidly, but the high energy consumption required for training and inference remains a major challenge. Hyperdimensional Computing (HDC) offers a lightweight, brain-inspired alternative…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-10 Wakuto Matsumi , Riaz-Ul-Haque Mian

This work introduces lightweight extensions to the RISC-V ISA to boost the efficiency of heavily Quantized Neural Network (QNN) inference on microcontroller-class cores. By extending the ISA with nibble (4-bit) and crumb (2-bit) SIMD…

Hardware Architecture · Computer Science 2020-12-01 Angelo Garofalo , Giuseppe Tagliavini , Francesco Conti , Luca Benini , Davide Rossi
‹ Prev 1 2 3 10 Next ›