Related papers: Hyper-Compression: Model Compression via Hyperfunc…

200 papers

Overparameterized models have proven to be powerful tools for solving various machine learning tasks. However, overparameterization often leads to a substantial increase in computational and memory costs, which in turn requires extensive…

Machine Learning · Computer Science 2024-03-13 Soo Min Kwon, Zekai Zhang, Dogyoon Song, Laura Balzano, Qing Qu

Recurrent neural networks have proved to be an effective method for statistical language modeling. However, in practice their memory and run-time complexity are usually too large to be implemented in real-time offline mobile applications.…

Computation and Language · Computer Science 2019-04-09 Artem M. Grachev, Dmitry I. Ignatov, Andrey V. Savchenko

The outstanding performance of large foundational models across diverse tasks, from computer vision to speech and natural language processing, has significantly increased their demand. However, storing and transmitting these models poses…

Due to the substantial scale of Large Language Models (LLMs), the direct application of conventional compression methodologies proves impractical. The computational demands associated with even minimal gradient updates present challenges,…

Machine Learning · Computer Science 2023-12-13 Arnav Chavan, Nahush Lele, Deepak Gupta

With time, machine learning models have increased in their scope, functionality and size. Consequently, the increased functionality and size of such models require high-end hardware to both train and provide inference after the fact. This…

Machine Learning · Computer Science 2021-09-07 Arhum Ishtiaq, Sara Mahmood, Maheen Anees, Neha Mumtaz

We introduce model folding, a novel data-free model compression technique that merges structurally similar neurons across layers, significantly reducing the model size without the need for fine-tuning or access to training data. Unlike…

Machine Learning · Computer Science 2025-08-13 Dong Wang, Haris Šikić, Lothar Thiele, Olga Saukh

While the numerous parameters in Large Language Models (LLMs) contribute to their superior performance, this massive scale makes them inefficient and memory-hungry. Thus, they are hard to deploy on commodity hardware, such as one single…

Computation and Language · Computer Science 2023-10-11 Zhaozhuo Xu, Zirui Liu, Beidi Chen, Yuxin Tang, Jue Wang, Kaixiong Zhou, Xia Hu, Anshumali Shrivastava

The excellent performance of deep neural networks is usually accompanied by a large number of parameters and computations, which have limited their usage on the resource-limited edge devices. To address this issue, abundant methods such as…

Computer Vision and Pattern Recognition · Computer Science 2023-05-23 Muzhou Yu, Linfeng Zhang, Kaisheng Ma

Deep neural networks have delivered remarkable performance and have been widely used in various visual tasks. However, their huge size causes significant inconvenience for transmission and storage. Many previous studies have explored model…

Computer Vision and Pattern Recognition · Computer Science 2023-08-09 Yumeng Shi, Shihao Bai, Xiuying Wei, Ruihao Gong, Jianlei Yang

Large Language Models are growing in size, and we expect them to continue to do so, as larger models train quicker. However, this increase in size will severely impact inference costs. Therefore model compression is important, to retain the…

Machine Learning · Computer Science 2024-04-10 Georgy Tyukin

Deep learning models have achieved tremendous success in most of the industries in recent years. The evolution of these models has also led to an increase in the model size and energy requirement, making it difficult to deploy in production…

Machine Learning · Computer Science 2024-07-24 Aayush Saxena, Arit Kumar Bishwas, Ayush Ashok Mishra, Ryan Armstrong

Deep learning models incorporating linear SSMs have gained attention for capturing long-range dependencies in sequential data. However, their large parameter sizes pose challenges for deployment on resource-constrained devices. In this…

Machine Learning · Computer Science 2025-07-31 Hiroki Sakamoto, Kazuhiro Sato

The development of deep learning algorithms has extensively empowered humanity's task automatization capacity. However, the huge improvement in the performance of these models is highly correlated with their increasing level of complexity,…

Computer Vision and Pattern Recognition · Computer Science 2024-01-19 Eduarda Caldeira, Pedro C. Neto, Marco Huber, Naser Damer, Ana F. Sequeira

Large language models (LLMs) exhibit excellent performance in various tasks. However, the memory requirements of LLMs present a great challenge when deploying on memory-limited devices, even for quantized LLMs. This paper introduces a…

Computation and Language · Computer Science 2025-02-24 Weilan Wang, Yu Mao, Dongdong Tang, Hongchao Du, Nan Guan, Chun Jason Xue

In recent years, large language models (LLMs) have driven advances in natural language processing. Still, their growing scale has increased the computational burden, necessitating a balance between efficiency and performance. Low-rank…

Computation and Language · Computer Science 2025-02-25 Yixin Ji, Yang Xiang, Juntao Li, Qingrong Xia, Zi Ye, Xinyu Duan, Zhefeng Wang, Kehai Chen, Min Zhang

We introduce compression laws for large language models (LLMs). While recent scaling laws have sought to understand how LLMs scale with respect to model size, pre-training data, and computational resources, we focus on understanding how…

Computation and Language · Computer Science 2025-04-08 Ayan Sengupta, Siddhant Chaudhary, Tanmoy Chakraborty

Since hardware resources are limited, the objective of training deep learning models is typically to maximize accuracy subject to the time and memory constraints of training and inference. We study the impact of model size in this setting,…

Computation and Language · Computer Science 2020-06-24 Zhuohan Li, Eric Wallace, Sheng Shen, Kevin Lin, Kurt Keutzer, Dan Klein, Joseph E. Gonzalez

Large language models have steadily increased in size to achieve improved performance; however, this growth has also led to greater inference time and computational demands. Consequently, there is rising interest in model size reduction…

While overparameterization in machine learning models offers great benefits in terms of optimization and generalization, it also leads to increased computational requirements as model sizes grow. In this work, we show that by leveraging the…

Machine Learning · Computer Science 2026-02-16 Can Yaras, Peng Wang, Laura Balzano, Qing Qu

Deep neural networks have achieved strong performance in image classification tasks due to their ability to learn complex patterns from high-dimensional data. However, their large computational and memory requirements often limit deployment…

Computer Vision and Pattern Recognition · Computer Science 2026-03-06 Sai Shi