Related papers: Optimizing Deep Learning Models For Raspberry Pi

Comprehensive Study on Performance Evaluation and Optimization of Model Compression: Bridging Traditional Deep Learning and Large Language Models

Deep learning models have achieved tremendous success in most of the industries in recent years. The evolution of these models has also led to an increase in the model size and energy requirement, making it difficult to deploy in production…

Machine Learning · Computer Science 2024-07-24 Aayush Saxena, Arit Kumar Bishwas, Ayush Ashok Mishra, Ryan Armstrong

Large Language Model Pruning

We surely enjoy the larger the better models for their superior performance in the last couple of years when both the hardware and software support the birth of such extremely huge models. The applied fields include text mining and others.…

Computation and Language · Computer Science 2024-06-04 Hanjuan Huang, Hao-Jia Song, Hsing-Kuo Pao

Adaptive Pruning of Neural Language Models for Mobile Devices

Neural language models (NLMs) exist in an accuracy-efficiency tradeoff space where better perplexity typically comes at the cost of greater computation complexity. In a software keyboard application on mobile devices, this translates into…

Computation and Language · Computer Science 2018-09-30 Raphael Tang, Jimmy Lin

To prune, or not to prune: exploring the efficacy of pruning for model compression

Model pruning seeks to induce sparsity in a deep neural network's various connection matrices, thereby reducing the number of nonzero-valued parameters in the model. Recent reports (Han et al., 2015; Narang et al., 2017) prune deep networks…

Machine Learning · Statistics 2017-11-15 Michael Zhu, Suyog Gupta

Alternate Model Growth and Pruning for Efficient Training of Recommendation Systems

Deep learning recommendation systems at scale have provided remarkable gains through increasing model capacity (i.e. wider and deeper neural networks), but it comes at significant training cost and infrastructure cost. Model pruning is an…

Information Retrieval · Computer Science 2021-05-05 Xiaocong Du, Bhargav Bhushanam, Jiecao Yu, Dhruv Choudhary, Tianxiang Gao, Sherman Wong, Louis Feng, Jongsoo Park, Yu Cao, Arun Kejariwal

Principled Approximation Methods for Efficient and Scalable Deep Learning

Recent progress in deep learning has been driven by increasingly larger models. However, their computational and energy demands have grown proportionally, creating significant barriers to their deployment and to a wider adoption of deep…

Machine Learning · Computer Science 2025-09-16 Pedro Savarese

Optimizing Dense Feed-Forward Neural Networks

Deep learning models have been widely used during the last decade due to their outstanding learning and abstraction capacities. However, one of the main challenges any scientist has to face using deep learning models is to establish the…

Machine Learning · Computer Science 2025-04-22 Luis Balderas, Miguel Lastra, José M. Benítez

Deep Learning Models on CPUs: A Methodology for Efficient Training

GPUs have been favored for training deep learning models due to their highly parallelized architecture. As a result, most studies on training optimization focus on GPUs. There is often a trade-off, however, between cost and efficiency when…

Machine Learning · Computer Science 2023-06-21 Quchen Fu, Ramesh Chukka, Keith Achorn, Thomas Atta-fosu, Deepak R. Canchi, Zhongwei Teng, Jules White, Douglas C. Schmidt

A Survey of Methods for Low-Power Deep Learning and Computer Vision

Deep neural networks (DNNs) are successful in many computer vision tasks. However, the most accurate DNNs require millions of parameters and operations, making them energy, computation and memory intensive. This impedes the deployment of…

Computer Vision and Pattern Recognition · Computer Science 2020-03-26 Abhinav Goel, Caleb Tung, Yung-Hsiang Lu, George K. Thiruvathukal

Dimensionality Reduced Training by Pruning and Freezing Parts of a Deep Neural Network, a Survey

State-of-the-art deep learning models have a parameter count that reaches into the billions. Training, storing and transferring such models is energy and time consuming, thus costly. A big part of these costs is caused by training the…

Machine Learning · Computer Science 2023-05-26 Paul Wimmer, Jens Mehnert, Alexandru Paul Condurache

EvoPruneDeepTL: An Evolutionary Pruning Model for Transfer Learning based Deep Neural Networks

In recent years, Deep Learning models have shown a great performance in complex optimization problems. They generally require large training datasets, which is a limitation in most practical cases. Transfer learning allows importing the…

Neural and Evolutionary Computing · Computer Science 2024-02-06 Javier Poyatos, Daniel Molina, Aritz D. Martinez, Javier Del Ser, Francisco Herrera

Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks

The growing energy and performance costs of deep learning have driven the community to reduce the size of neural networks by selectively pruning components. Similarly to their biological counterparts, sparse networks generalize just as…

Machine Learning · Computer Science 2021-02-02 Torsten Hoefler, Dan Alistarh, Tal Ben-Nun, Nikoli Dryden, Alexandra Peste

Budget-Aware Pruning for Multi-Domain Learning

Deep learning has achieved state-of-the-art performance on several computer vision tasks and domains. Nevertheless, it still has a high computational cost and demands a significant amount of parameters. Such requirements hinder the use in…

Computer Vision and Pattern Recognition · Computer Science 2023-09-19 Samuel Felipe dos Santos, Rodrigo Berriel, Thiago Oliveira-Santos, Nicu Sebe, Jurandy Almeida

Differentiable Network Pruning for Microcontrollers

Embedded and personal IoT devices are powered by microcontroller units (MCUs), whose extreme resource scarcity is a major obstacle for applications relying on on-device deep learning inference. Orders of magnitude less storage, memory and…

Machine Learning · Computer Science 2022-12-09 Edgar Liberis, Nicholas D. Lane

Can pruning make Large Language Models more efficient?

Transformer models have revolutionized natural language processing with their unparalleled ability to grasp complex contextual relationships. However, the vast number of parameters in these models has raised concerns regarding computational…

Machine Learning · Computer Science 2023-10-10 Sia Gholami, Marwan Omar

Accelerating Deep Learning with Dynamic Data Pruning

Deep learning's success has been attributed to the training of large, overparameterized models on massive amounts of data. As this trend continues, model training has become prohibitively costly, requiring access to powerful computing…

Machine Learning · Computer Science 2021-11-25 Ravi S Raju, Kyle Daruwalla, Mikko Lipasti

Integrating Fairness and Model Pruning Through Bi-level Optimization

Deep neural networks have achieved exceptional results across a range of applications. As the demand for efficient and sparse deep learning models escalates, the significance of model compression, particularly pruning, is increasingly…

Machine Learning · Computer Science 2025-04-01 Yucong Dai, Gen Li, Feng Luo, Xiaolong Ma, Yongkai Wu

A Hardware-Friendly Algorithm for Scalable Training and Deployment of Dimensionality Reduction Models on FPGA

With ever-increasing application of machine learning models in various domains such as image classification, speech recognition and synthesis, and health care, designing efficient hardware for these models has gained a lot of popularity.…

Machine Learning · Computer Science 2018-01-22 Mahdi Nazemi, Amir Erfan Eshratifar, Massoud Pedram

Automatic Attention Pruning: Improving and Automating Model Pruning using Attentions

Pruning is a promising approach to compress deep learning models in order to deploy them on resource-constrained edge devices. However, many existing pruning solutions are based on unstructured pruning, which yields models that cannot…

Machine Learning · Computer Science 2023-03-16 Kaiqi Zhao, Animesh Jain, Ming Zhao

DRPruning: Efficient Large Language Model Pruning through Distributionally Robust Optimization

Large language models (LLMs) deliver impressive results but face challenges from increasing model sizes and computational costs. Structured pruning reduces model size and speeds up inference but often causes uneven degradation across…

Computation and Language · Computer Science 2025-05-28 Hexuan Deng, Wenxiang Jiao, Xuebo Liu, Jing Li, Min Zhang, Zhaopeng Tu