English
Related papers

Related papers: A Comprehensive Survey of Compression Algorithms f…

200 papers

Transformer based large language models have achieved tremendous success. However, the significant memory and computational costs incurred during the inference process make it challenging to deploy large models on resource-constrained…

Computation and Language · Computer Science 2024-02-16 Wenxiao Wang , Wei Chen , Yicong Luo , Yongliu Long , Zhengkai Lin , Liye Zhang , Binbin Lin , Deng Cai , Xiaofei He

Large Language Models (LLMs) have transformed natural language processing tasks successfully. Yet, their large size and high computational needs pose challenges for practical use, especially in resource-limited settings. Model compression…

Computation and Language · Computer Science 2024-07-31 Xunyu Zhu , Jian Li , Yong Liu , Can Ma , Weiping Wang

Compressing large language models (LLMs), often consisting of billions of parameters, provides faster inference, smaller memory footprints, and enables local deployment. Two standard compression techniques are pruning and quantization, with…

Computation and Language · Computer Science 2023-12-05 Satya Sai Srinath Namburi , Makesh Sreedhar , Srinath Srinivasan , Frederic Sala

Language models have proven successful across a wide range of software engineering tasks, but their significant computational costs often hinder their practical adoption. To address this challenge, researchers have begun applying various…

Software Engineering · Computer Science 2024-12-19 Giordano d'Aloisio , Luca Traini , Federica Sarro , Antinisca Di Marco

Deep learning models have achieved tremendous success in most of the industries in recent years. The evolution of these models has also led to an increase in the model size and energy requirement, making it difficult to deploy in production…

Machine Learning · Computer Science 2024-07-24 Aayush Saxena , Arit Kumar Bishwas , Ayush Ashok Mishra , Ryan Armstrong

The development of deep learning algorithms has extensively empowered humanity's task automatization capacity. However, the huge improvement in the performance of these models is highly correlated with their increasing level of complexity,…

Computer Vision and Pattern Recognition · Computer Science 2024-01-19 Eduarda Caldeira , Pedro C. Neto , Marco Huber , Naser Damer , Ana F. Sequeira

With time, machine learning models have increased in their scope, functionality and size. Consequently, the increased functionality and size of such models requires high-end hardware to both train and provide inference after the fact. This…

Machine Learning · Computer Science 2021-09-07 Arhum Ishtiaq , Sara Mahmood , Maheen Anees , Neha Mumtaz

Transformer plays a vital role in the realms of natural language processing (NLP) and computer vision (CV), specially for constructing large language models (LLM) and large vision models (LVM). Model compression methods reduce the memory…

Machine Learning · Computer Science 2024-04-09 Yehui Tang , Yunhe Wang , Jianyuan Guo , Zhijun Tu , Kai Han , Hailin Hu , Dacheng Tao

Leveraging large language models (LLMs) for complex natural language tasks typically requires long-form prompts to convey detailed requirements and information, which results in increased memory usage and inference costs. To mitigate these…

Computation and Language · Computer Science 2024-10-18 Zongqian Li , Yinhong Liu , Yixuan Su , Nigel Collier

In this paper, we combine two-step knowledge distillation, structured pruning, truncation, and vocabulary trimming for extremely compressing multilingual encoder-only language models for low-resource languages. Our novel approach…

Computation and Language · Computer Science 2025-11-07 Daniil Gurgurov , Michal Gregor , Josef van Genabith , Simon Ostermann

Small Language Models (SLMs) have become increasingly important due to their efficiency and performance to perform various language tasks with minimal computational resources, making them ideal for various settings including on-device,…

Large language models have recently achieved state of the art performance across a wide variety of natural language tasks. Meanwhile, the size of these models and their latency have significantly increased, which makes their usage costly,…

Computation and Language · Computer Science 2021-03-30 Ziheng Wang , Jeremy Wohlwend , Tao Lei

Recurrent neural networks have proved to be an effective method for statistical language modeling. However, in practice their memory and run-time complexity are usually too large to be implemented in real-time offline mobile applications.…

Computation and Language · Computer Science 2019-04-09 Artem M. Grachev , Dmitry I. Ignatov , Andrey V. Savchenko

Due to the substantial scale of Large Language Models (LLMs), the direct application of conventional compression methodologies proves impractical. The computational demands associated with even minimal gradient updates present challenges,…

Machine Learning · Computer Science 2023-12-13 Arnav Chavan , Nahush Lele , Deepak Gupta

The internal structure and operation mechanism of large-scale language models are analyzed theoretically, especially how Transformer and its derivative architectures can restrict computing efficiency while capturing long-term dependencies.…

Machine Learning · Computer Science 2024-05-21 Taiyuan Mei , Yun Zi , Xiaohan Cheng , Zijun Gao , Qi Wang , Haowei Yang

Large language models (LLMs) have been applied in various applications due to their astonishing capabilities. With advancements in technologies such as chain-of-thought (CoT) prompting and in-context learning (ICL), the prompts fed to LLMs…

Computation and Language · Computer Science 2023-12-07 Huiqiang Jiang , Qianhui Wu , Chin-Yew Lin , Yuqing Yang , Lili Qiu

Large language models are ubiquitous in natural language processing because they can adapt to new tasks without retraining. However, their sheer scale and complexity present unique challenges and opportunities, prompting researchers and…

Computation and Language · Computer Science 2024-08-07 Leo Donisch , Sigurd Schacht , Carsten Lanquillon

Large Language Models are growing in size, and we expect them to continue to do so, as larger models train quicker. However, this increase in size will severely impact inference costs. Therefore model compression is important, to retain the…

Machine Learning · Computer Science 2024-04-10 Georgy Tyukin

Large Language Models (LLMs) have revolutionized many areas of artificial intelligence (AI), but their substantial resource requirements limit their deployment on mobile and edge devices. This survey paper provides a comprehensive overview…

Machine Learning · Computer Science 2025-09-03 Sanjay Surendranath Girija , Shashank Kapoor , Lakshit Arora , Dipen Pradhan , Aman Raj , Ankit Shetgaonkar

Large Language Models (LLMs) demonstrate exceptional reasoning abilities, enabling strong generalization across diverse tasks such as commonsense reasoning and instruction following. However, as LLMs scale, inference costs become…

Computation and Language · Computer Science 2025-02-06 Rhea Sanjay Sukthanker , Benedikt Staffler , Frank Hutter , Aaron Klein
‹ Prev 1 2 3 10 Next ›