Related papers: Split CNN Inference on Networked Microcontrollers

200 papers

Tiny deep learning on microcontroller units (MCUs) is challenging due to the limited memory size. We find that the memory bottleneck is due to the imbalanced memory distribution in convolutional neural network (CNN) designs: the first…

Computer Vision and Pattern Recognition · Computer Science 2024-04-04 Ji Lin, Wei-Ming Chen, Han Cai, Chuang Gan, Song Han

IoT devices based on microcontroller units (MCU) provide ultra-low power consumption and ubiquitous computation for near-sensor deep learning models (DNN). However, the memory of MCU is usually 2-3 orders of magnitude smaller than mobile…

Hardware Architecture · Computer Science 2024-06-12 Size Zheng, Renze Chen, Meng Li, Zihao Ye, Luis Ceze, Yun Liang

Tiny Machine Learning (TinyML) is a novel research field aiming at integrating Machine Learning (ML) within embedded devices with limited memory, computation, and energy. Recently, a new branch of TinyML has emerged, focusing on integrating…

Designing deep learning models for highly-constrained hardware would allow imbuing many edge devices with intelligence. Microcontrollers (MCUs) are an attractive platform for building smart devices due to their low cost, wide availability,…

Machine Learning · Computer Science 2020-03-04 Edgar Liberis, Nicholas D. Lane

AI spans from large language models to tiny models running on microcontrollers (MCUs). Extremely memory-efficient model architectures are decisive to fit within an MCU's tiny memory budget e.g., 128kB of RAM. However, inference latency must…

Machine Learning · Computer Science 2025-10-20 Zhaolan Huang, Emmanuel Baccelli

Video and image streaming on edge devices requires low latency. To address this, Neural Networks (NNs) are widely used, and prior work mainly focuses on accelerating them with single hardware units such as Graphics Processing Units (GPUs),…

Hardware Architecture · Computer Science 2026-05-04 Ali Emre Oztas, Mahir Demir, James Garside, Mikel Luján

The rapid growth of microcontroller-based IoT devices has opened up numerous applications, from smart manufacturing to personalized healthcare. Despite the widespread adoption of energy-efficient microcontroller units (MCUs) in the Tiny…

Machine Learning · Computer Science 2024-09-26 Giorgos Armeniakos, Georgios Mentzos, Dimitrios Soudris

The deployment of Quantized Neural Networks (QNNs) on resource-constrained edge devices, such as microcontrollers (MCUs), introduces fundamental challenges in balancing model performance, computational complexity, and memory constraints.…

Machine Learning · Computer Science 2026-01-08 Hamza A. Abushahla, Dara Varam, Ariel Justine N. Panopio, Mohamed I. AlHajri

For convolutional neural networks (CNNs) that have a large volume of input data, memory management becomes a major concern. Memory cost reduction can be an effective way to deal with these problems that can be realized through different…

Computer Vision and Pattern Recognition · Computer Science 2020-02-11 Emad MalekHosseini, Mohsen Hajabdollahi, Nader Karimi, Shadrokh Samavi, Shahram Shirani

Running neural networks (NNs) on microcontroller units (MCUs) is becoming increasingly important, but is very difficult due to the tiny SRAM size of MCU. Prior work proposes many algorithm-level techniques to reduce NN memory footprints,…

Hardware Architecture · Computer Science 2021-09-02 Hongyu Miao, Felix Xiaozhu Lin

The vast majority of processors in the world are actually microcontroller units (MCUs), which find widespread use performing simple control tasks in applications ranging from automobiles to medical devices and office equipment. The Internet…

Machine Learning · Computer Science 2019-05-30 Igor Fedorov, Ryan P. Adams, Matthew Mattina, Paul N. Whatmough

Deep Learning approaches based on Convolutional Neural Networks (CNNs) are extensively utilized and very successful in a wide range of application areas, including image classification and speech recognition. For the execution of trained…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-07-26 Xiaotian Guo, Andy D. Pimentel, Todor Stefanov

The field of Tiny Machine Learning (TinyML) has gained significant attention due to its potential to enable intelligent applications on resource-constrained devices. This review provides an in-depth analysis of the advancements in efficient…

Machine Learning · Statistics 2023-11-21 Minh Tri Lê, Pierre Wolinski, Julyan Arbel

Quantized CNN inference on ultra-low-power MCUs incurs unnecessary computations in neurons that produce saturated output values. These values are too extreme and are eventually clamped to the boundaries allowed by the neuron. Often times,…

Systems and Control · Electrical Eng. & Systems 2026-02-27 Shiming Li, Luca Mottola, Yuan Yao, Stefanos Kaxiras

In this paper, we propose different alternatives for convolutional neural networks (CNNs) segmentation, addressing inference processes on computing architectures composed by multiple Edge TPUs. Specifically, we compare the inference…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-03-04 Jorge Villarrubia, Luis Costero, Francisco D. Igual, Katzalin Olcoz

In this paper, we introduce a memory-efficient CNN (convolutional neural network), which enables resource-constrained low-end embedded and IoT devices to perform on-device vision tasks, such as image classification and object detection,…

Computer Vision and Pattern Recognition · Computer Science 2024-10-15 Jaewook Lee, Yoel Park, Seulki Lee

The popularity of Convolutional Neural Network (CNN) models and the ubiquity of CPUs imply that better performance of CNN model inference on CPUs can deliver significant gain to a large number of users. To improve the performance of CNN…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-07-09 Yizhi Liu, Yao Wang, Ruofei Yu, Mu Li, Vin Sharma, Yida Wang

Though CNNs are highly parallel workloads, in the absence of efficient on-chip memory reuse techniques, an accelerator for them quickly becomes memory bound. In this paper, we propose a CNN accelerator design for inference that is able to…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-08-26 Kingshuk Majumder, Shubham Nema, Uday Bondhugula

Executing machine learning workloads locally on resource constrained microcontrollers (MCUs) promises to drastically expand the application space of IoT. However, so-called TinyML presents severe technical challenges, as deep neural network…

Machine learning on tiny IoT devices based on microcontroller units (MCU) is appealing but challenging: the memory of microcontrollers is 2-3 orders of magnitude smaller even than mobile phones. We propose MCUNet, a framework that jointly…

Computer Vision and Pattern Recognition · Computer Science 2020-11-20 Ji Lin, Wei-Ming Chen, Yujun Lin, John Cohn, Chuang Gan, Song Han