Related papers: HAPI: Hardware-Aware Progressive Inference

HAPM -- Hardware Aware Pruning Method for CNN hardware accelerators in resource constrained devices

In recent years, algorithms known as Convolutional Neural Networks (CNNs) have become increasingly popular, expanding their application range to several areas. In particular, the image processing field has experienced a remarkable…

Hardware Architecture · Computer Science 2024-08-27 Federico Nicolas Peccia, Luciano Ferreyro, Alejandro Furfaro

Adaptive Inference through Early-Exit Networks: Design, Challenges and Directions

DNNs are becoming less and less over-parametrised due to recent advances in efficient model design, through careful hand-crafted or NAS-based methods. Relying on the fact that not all inputs require the same amount of computation to yield a…

Machine Learning · Computer Science 2021-06-10 Stefanos Laskaridis, Alexandros Kouris, Nicholas D. Lane

Early-exit Convolutional Neural Networks

This paper is aimed at developing a method that reduces the computational cost of convolutional neural networks (CNN) during inference. Conventionally, the input data pass through a fixed neural network architecture. However, easy examples…

Computer Vision and Pattern Recognition · Computer Science 2024-09-10 Edanur Demir, Emre Akbas

Binary Early-Exit Network for Adaptive Inference on Low-Resource Devices

Deep neural networks have significantly improved performance on a range of tasks with the increasing demand for computational resources, leaving deployment on low-resource devices (with limited memory and battery power) infeasible. Binary…

Machine Learning · Computer Science 2022-06-22 Aaqib Saeed

Resource-Constrained Edge AI with Early Exit Prediction

By leveraging the data sample diversity, the early-exit network recently emerges as a prominent neural network architecture to accelerate the deep learning inference process. However, intermediate classifiers of the early exits introduce…

Machine Learning · Computer Science 2022-06-22 Rongkang Dong, Yuyi Mao, Jun Zhang

Design and Prototyping Distributed CNN Inference Acceleration in Edge Computing

For time-critical IoT applications using deep learning, inference acceleration through distributed computing is a promising approach to meet a stringent deadline. In this paper, we implement a working prototype of a new distributed…

Computer Vision and Pattern Recognition · Computer Science 2022-11-29 Zhongtian Dong, Nan Li, Alexandros Iosifidis, Qi Zhang

HENet: A Highly Efficient Convolutional Neural Networks Optimized for Accuracy, Speed and Storage

In order to enhance the real-time performance of convolutional neural networks (CNNs), more and more researchers are focusing on improving the efficiency of CNN. Based on the analysis of some CNN architectures, such as ResNet, DenseNet,…

Computer Vision and Pattern Recognition · Computer Science 2018-03-16 Qiuyu Zhu, Ruixin Zhang

Convolutional Neural Network and Transfer Learning for High Impedance Fault Detection

This letter presents a novel high impedance fault (HIF) detection approach using a convolutional neural network (CNN). Compared to traditional artificial neural networks, a CNN offers translation invariance and it can accurately detect HIFs…

Signal Processing · Electrical Eng. & Systems 2019-04-19 Rui Fan, Tianzhixi Yin

A Survey of Early Exit Deep Neural Networks in NLP

Deep Neural Networks (DNNs) have grown increasingly large in size to achieve state of the art performance across a wide range of tasks. However, their high computational requirements make them less suitable for resource-constrained…

Machine Learning · Computer Science 2025-01-15 Divya Jyoti Bajpai, Manjesh Kumar Hanawal

Early-Exit with Class Exclusion for Efficient Inference of Neural Networks

Deep neural networks (DNNs) have been successfully applied in various fields. In DNNs, a large number of multiply-accumulate (MAC) operations are required to be performed, posing critical challenges in applying them in resource-constrained…

Machine Learning · Computer Science 2024-02-20 Jingcun Wang, Bing Li, Grace Li Zhang

HyPHEN: A Hybrid Packing Method and Optimizations for Homomorphic Encryption-Based Neural Networks

Convolutional neural network (CNN) inference using fully homomorphic encryption (FHE) is a promising private inference (PI) solution due to the capability of FHE that enables offloading the whole computation process to the server while…

Cryptography and Security · Computer Science 2024-01-02 Donghwan Kim, Jaiyoung Park, Jongmin Kim, Sangpyo Kim, Jung Ho Ahn

Optimizing CNN Using HPC Tools

This paper optimizes the Convolutional Neural Network (CNN) algorithm using high-performance computing (HPC) technologies. It uses multi-core processors, GPUs, and parallel computing frameworks like OpenMPI and CUDA to speed up CNN model…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-03-11 Shahrin Rahman

Early-Exit meets Model-Distributed Inference at Edge Networks

Distributed inference techniques can be broadly classified into data-distributed and model-distributed schemes. In data-distributed inference (DDI), each worker carries the entire deep neural network (DNN) model but processes only a subset…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-08-13 Marco Colocrese, Erdem Koyuncu, Hulya Seferoglu

SoWaF: Shuffling of Weights and Feature Maps: A Novel Hardware Intrinsic Attack (HIA) on Convolutional Neural Network (CNN)

Security of inference phase deployment of Convolutional neural network (CNN) into resource constrained embedded systems (e.g. low end FPGAs) is a growing research area. Using secure practices, third party FPGA designers can be provided with…

Cryptography and Security · Computer Science 2021-07-14 Tolulope A. Odetola, Syed Rafay Hasan

Rethinking Calibration for Early-Exit Neural Networks

Early-exit neural networks (EENNs) accelerate inference by allowing intermediate classifiers to stop computation once predictions are confident enough. Most methods rely on confidence thresholds for exiting, and consequently, improving…

Machine Learning · Computer Science 2026-05-28 Piotr Kubaty, Filip Szatkowski, Grzegorz Choczyński, Eric Nalisnick, Bartosz Wójcik

Adaptive Deep Neural Network Inference Optimization with EENet

Well-trained deep neural networks (DNNs) treat all test samples equally during prediction. Adaptive DNN inference with early exiting leverages the observation that some test examples can be easier to predict than others. This paper presents…

Machine Learning · Computer Science 2023-12-04 Fatih Ilhan, Ka-Ho Chow, Sihao Hu, Tiansheng Huang, Selim Tekin, Wenqi Wei, Yanzhao Wu, Myungjin Lee, Ramana Kompella, Hugo Latapie, Gaowen Liu, Ling Liu

H2PIPE: High throughput CNN Inference on FPGAs with High-Bandwidth Memory

Convolutional Neural Networks (CNNs) combine large amounts of parallelizable computation with frequent memory access. Field Programmable Gate Arrays (FPGAs) can achieve low latency and high throughput CNN inference by implementing dataflow…

Hardware Architecture · Computer Science 2024-08-20 Mario Doumet, Marius Stan, Mathew Hall, Vaughn Betz

HAT: Hardware-Aware Transformers for Efficient Natural Language Processing

Transformers are ubiquitous in Natural Language Processing (NLP) tasks, but they are difficult to be deployed on hardware due to the intensive computation. To enable low-latency inference on resource-constrained hardware platforms, we…

Computation and Language · Computer Science 2024-04-05 Hanrui Wang, Zhanghao Wu, Zhijian Liu, Han Cai, Ligeng Zhu, Chuang Gan, Song Han

HPIPE: Heterogeneous Layer-Pipelined and Sparse-Aware CNN Inference for FPGAs

We present both a novel Convolutional Neural Network (CNN) accelerator architecture and a network compiler for FPGAs that outperforms all prior work. Instead of having generic processing elements that together process one layer at a time,…

Hardware Architecture · Computer Science 2020-07-22 Mathew Hall, Vaughn Betz

Hardware-Algorithm Co-Optimization of Early-Exit Neural Networks for Multi-Core Edge Accelerators

Deployment of dynamic neural networks on edge accelerators requires careful consideration of hardware constraints beyond conventional complexity metrics such as Multiply-Accumulate operations. In Early-Exiting Neural Networks (EENN), exit…

Computational Complexity · Computer Science 2026-04-01 Alaa Zniber, Arne Symons, Ouassim Karrakchou, Marian Verhelst, Mounir Ghogho