Related papers: On Accelerating Edge AI: Optimizing Resource-Const…

3U-EdgeAI: Ultra-Low Memory Training, Ultra-Low Bitwidth Quantization, and Ultra-Low Latency Acceleration

Deep neural network (DNN)-based AI applications on the edge require both low-cost computing platforms and high-quality services. However, the limited memory, computing resources, and power budget of the edge devices constrain the…

Machine Learning · Computer Science 2021-05-14 Yao Chen, Cole Hawkins, Kaiqi Zhang, Zheng Zhang, Cong Hao

Optimizing LLMs for Resource-Constrained Environments: A Survey of Model Compression Techniques

Large Language Models (LLMs) have revolutionized many areas of artificial intelligence (AI), but their substantial resource requirements limit their deployment on mobile and edge devices. This survey paper provides a comprehensive overview…

Machine Learning · Computer Science 2025-09-03 Sanjay Surendranath Girija, Shashank Kapoor, Lakshit Arora, Dipen Pradhan, Aman Raj, Ankit Shetgaonkar

Search-time Efficient Device Constraints-Aware Neural Architecture Search

Edge computing aims to enable edge devices, such as IoT devices, to process data locally instead of relying on the cloud. However, deep learning techniques like computer vision and natural language processing can be computationally…

Computer Vision and Pattern Recognition · Computer Science 2023-07-11 Oshin Dutta, Tanu Kanvar, Sumeet Agarwal

Optimizing edge AI models on HPC systems with the edge in the loop

Artificial intelligence and machine learning models deployed on edge devices, e.g., for quality control in Additive Manufacturing (AM), are frequently small in size. Such models usually have to deliver highly accurate results within a short…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-26 Marcel Aach, Cyril Blanc, Andreas Lintermann, Kurt De Grave

Learning Performance Optimization for Edge AI System with Time and Energy Constraints

Edge AI, which brings artificial intelligence to the edge of the network for real-time processing and decision-making, has emerged as a transformative technology across various applications. However, the deployment of Edge AI systems faces…

Signal Processing · Electrical Eng. & Systems 2025-11-11 Zhiyuan Zhai, Wei Ni, Xin Wang

Enhancing Neural Architecture Search with Multiple Hardware Constraints for Deep Learning Model Deployment on Tiny IoT Devices

The rapid proliferation of computing domains relying on Internet of Things (IoT) devices has created a pressing need for efficient and accurate deep-learning (DL) models that can run on low-power devices. However, traditional DL models tend…

Machine Learning · Computer Science 2023-10-12 Alessio Burrello, Matteo Risso, Beatrice Alessandra Motetti, Enrico Macii, Luca Benini, Daniele Jahier Pagliari

Principled Approximation Methods for Efficient and Scalable Deep Learning

Recent progress in deep learning has been driven by increasingly larger models. However, their computational and energy demands have grown proportionally, creating significant barriers to their deployment and to a wider adoption of deep…

Machine Learning · Computer Science 2025-09-16 Pedro Savarese

Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications

Deep neural networks (DNNs) have achieved unprecedented success in the field of artificial intelligence (AI), including computer vision, natural language processing and speech recognition. However, their superior performance comes at the…

Machine Learning · Computer Science 2022-04-26 Han Cai, Ji Lin, Yujun Lin, Zhijian Liu, Haotian Tang, Hanrui Wang, Ligeng Zhu, Song Han

Model-driven Cluster Resource Management for AI Workloads in Edge Clouds

Since emerging edge applications such as Internet of Things (IoT) analytics and augmented reality have tight latency constraints, hardware AI accelerators have been recently proposed to speed up deep neural network (DNN) inference run by…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-01-20 Qianlin Liang, Walid A. Hanafy, Ahmed Ali-Eldin, Prashant Shenoy

Accelerating DNN Training in Wireless Federated Edge Learning Systems

Training task in classical machine learning models, such as deep neural networks, is generally implemented at a remote cloud center for centralized learning, which is typically time-consuming and resource-hungry. It also incurs serious…

Machine Learning · Computer Science 2020-10-27 Jinke Ren, Guanding Yu, Guangyao Ding

RNC: Efficient RRAM-aware NAS and Compilation for DNNs on Resource-Constrained Edge Devices

Computing-in-memory (CIM) is an emerging computing paradigm, offering noteworthy potential for accelerating neural networks with high parallelism, low latency, and energy efficiency compared to conventional von Neumann architectures.…

Neural and Evolutionary Computing · Computer Science 2024-09-30 Kam Chi Loong, Shihao Han, Sishuo Liu, Ning Lin, Zhongrui Wang

Complexity-Driven CNN Compression for Resource-constrained Edge AI

Recent advances in Artificial Intelligence (AI) on the Internet of Things (IoT)-enabled network edge have realized edge intelligence in several applications such as smart agriculture, smart hospitals, and smart factories by enabling…

Machine Learning · Computer Science 2024-01-18 Muhammad Zawish, Steven Davy, Lizy Abraham

Resource-Efficient Generative AI Model Deployment in Mobile Edge Networks

The surging development of Artificial Intelligence-Generated Content (AIGC) marks a transformative era of content creation and production. Edge servers promise attractive benefits, e.g., reduced service delay and backhaul traffic load,…

Machine Learning · Computer Science 2024-09-10 Yuxin Liang, Peng Yang, Yuanyuan He, Feng Lyu

Investigation of Energy-efficient AI Model Architectures and Compression Techniques for "Green" Fetal Brain Segmentation

Artificial intelligence has contributed to advancements across various industries. However, the rapid growth of artificial intelligence technologies also raises concerns about their environmental impact, due to associated carbon footprints…

Image and Video Processing · Electrical Eng. & Systems 2024-05-28 Szymon Mazurek, Monika Pytlarz, Sylwia Malec, Alessandro Crimi

Lightweight Neural Architecture Search for Temporal Convolutional Networks at the Edge

Neural Architecture Search (NAS) is quickly becoming the go-to approach to optimize the structure of Deep Learning (DL) models for complex tasks such as Image Classification or Object Detection. However, many other relevant applications of…

Machine Learning · Computer Science 2023-01-26 Matteo Risso, Alessio Burrello, Francesco Conti, Lorenzo Lamberti, Yukai Chen, Luca Benini, Enrico Macii, Massimo Poncino, Daniele Jahier Pagliari

What Happens on the Edge, Stays on the Edge: Toward Compressive Deep Learning

Machine learning at the edge offers great benefits such as increased privacy and security, low latency, and more autonomy. However, a major challenge is that many devices, in particular edge devices, have very limited memory, weak…

Machine Learning · Computer Science 2019-09-05 Yang Li, Thomas Strohmer

Deployment-Aligned Low-Precision Neural Architecture Search for Spaceborne Edge AI

Designing deep networks that meet strict latency and accuracy constraints on edge accelerators increasingly relies on hardware-aware optimization, including neural architecture search (NAS) guided by device-level metrics. Yet most…

Computer Vision and Pattern Recognition · Computer Science 2026-04-28 Parampuneet Kaur Thind, Vaibhav Katturu, Giacomo Zema, Roberto Del Prete

U-Boost NAS: Utilization-Boosted Differentiable Neural Architecture Search

Optimizing resource utilization in target platforms is key to achieving high performance during DNN inference. While optimizations have been proposed for inference latency, memory footprint, and energy consumption, prior hardware-aware…

Machine Learning · Computer Science 2022-03-24 Ahmet Caner Yüzügüler, Nikolaos Dimitriadis, Pascal Frossard

Accelerate Intermittent Deep Inference

Emerging research in edge devices and micro-controller units (MCU) enables on-device computation of Deep Learning Training and Inferencing tasks. More recently, contemporary trends focus on making the Deep Neural Net (DNN) Models runnable…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-01-30 Ziliang Zhang

Efficient Training Under Limited Resources

Training time budget and size of the dataset are among the factors affecting the performance of a Deep Neural Network (DNN). This paper shows that Neural Architecture Search (NAS), Hyper Parameters Optimization (HPO), and Data Augmentation…

Machine Learning · Computer Science 2023-01-24 Mahdi Zolnouri, Dounia Lakhmiri, Christophe Tribes, Eyyüb Sari, Sébastien Le Digabel