Related papers: Revisiting Data Compression with Language Modeling

A Survey on Model Compression for Large Language Models

Large Language Models (LLMs) have transformed natural language processing tasks successfully. Yet, their large size and high computational needs pose challenges for practical use, especially in resource-limited settings. Model compression…

Computation and Language · Computer Science 2024-07-31 Xunyu Zhu, Jian Li, Yong Liu, Can Ma, Weiping Wang

Lossless data compression by large models

Modern data compression methods are slowly reaching their limits after 80 years of research, millions of papers, and a wide range of applications. Yet, the extravagant 6G communication speed requirement raises a major open question for…

Information Theory · Computer Science 2025-05-01 Ziguang Li, Chao Huang, Xuliang Wang, Haibo Hu, Cole Wyeth, Dongbo Bu, Quan Yu, Wen Gao, Xingwu Liu, Ming Li

When Compression Meets Model Compression: Memory-Efficient Double Compression for Large Language Models

Large language models (LLMs) exhibit excellent performance in various tasks. However, the memory requirements of LLMs present a great challenge when deploying on memory-limited devices, even for quantized LLMs. This paper introduces a…

Computation and Language · Computer Science 2025-02-24 Weilan Wang, Yu Mao, Dongdong Tang, Hongchao Du, Nan Guan, Chun Jason Xue

Ranking LLMs by compression

We conceptualize the process of understanding as information compression, and propose a method for ranking large language models (LLMs) based on lossless data compression. We demonstrate the equivalence of compression length under…

Artificial Intelligence · Computer Science 2024-06-21 Peijia Guo, Ziguang Li, Haibo Hu, Chao Huang, Ming Li, Rui Zhang

Large Language Models for Lossless Image Compression: Next-Pixel Prediction in Language Space is All You Need

We have recently witnessed that "Intelligence" and "Compression" are the two sides of the same coin, where the language large model (LLM) with unprecedented intelligence is a general-purpose lossless compressor for various data…

Computer Vision and Pattern Recognition · Computer Science 2024-11-25 Kecheng Chen, Pingping Zhang, Hui Liu, Jie Liu, Yibing Liu, Jiaxin Huang, Shiqi Wang, Hong Yan, Haoliang Li

Semantic Compression With Large Language Models

The rise of large language models (LLMs) is revolutionizing information retrieval, question answering, summarization, and code generation tasks. However, in addition to confidently presenting factually inaccurate information at times (known…

Artificial Intelligence · Computer Science 2023-04-26 Henry Gilbert, Michael Sandborn, Douglas C. Schmidt, Jesse Spencer-Smith, Jules White

On the Compressibility of Quantized Large Language Models

Deploying Large Language Models (LLMs) on edge or mobile devices offers significant benefits, such as enhanced data privacy and real-time processing capabilities. However, it also faces critical challenges due to the substantial memory…

Machine Learning · Computer Science 2024-05-07 Yu Mao, Weilan Wang, Hongchao Du, Nan Guan, Chun Jason Xue

Lossless Compression of Large Language Model-Generated Text via Next-Token Prediction

As large language models (LLMs) continue to be deployed and utilized across domains, the volume of LLM-generated data is growing rapidly. This trend highlights the increasing importance of effective and lossless compression for such data in…

Machine Learning · Computer Science 2025-05-13 Yu Mao, Holger Pirk, Chun Jason Xue

Adaptive Feature-based Low-Rank Compression of Large Language Models via Bayesian Optimization

In recent years, large language models (LLMs) have driven advances in natural language processing. Still, their growing scale has increased the computational burden, necessitating a balance between efficiency and performance. Low-rank…

Computation and Language · Computer Science 2025-02-25 Yixin Ji, Yang Xiang, Juntao Li, Qingrong Xia, Zi Ye, Xinyu Duan, Zhefeng Wang, Kehai Chen, Min Zhang

Radio: Rate-Distortion Optimization for Large Language Model Compression

In recent years, the compression of large language models (LLMs) has emerged as a key problem in facilitating LLM deployment on resource-limited devices, reducing compute costs, and mitigating the environmental footprint due to large-scale…

Machine Learning · Computer Science 2025-05-07 Sean I. Young

LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment

Although large language models (LLMs) have demonstrated their strong intelligence ability, the high demand for computation and storage hinders their practical application. To this end, many model compression techniques are proposed to…

Computation and Language · Computer Science 2024-11-01 Ge Yang, Changyi He, Jinyang Guo, Jianyu Wu, Yifu Ding, Aishan Liu, Haotong Qin, Pengliang Ji, Xianglong Liu

LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models

Large language models (LLMs) have been applied in various applications due to their astonishing capabilities. With advancements in technologies such as chain-of-thought (CoT) prompting and in-context learning (ICL), the prompts fed to LLMs…

Computation and Language · Computer Science 2023-12-07 Huiqiang Jiang, Qianhui Wu, Chin-Yew Lin, Yuqing Yang, Lili Qiu

When Dimensionality Hurts: The Role of LLM Embedding Compression for Noisy Regression Tasks

Large language models (LLMs) have shown remarkable success in language modelling due to scaling laws found in model size and the hidden dimension of the model's text representation. Yet, we demonstrate that compressed representations of…

Computation and Language · Computer Science 2025-02-05 Felix Drinkall, Janet B. Pierrehumbert, Stefan Zohren

Scaling Down, Serving Fast: Compressing and Deploying Efficient LLMs for Recommendation Systems

Large language models (LLMs) have demonstrated remarkable performance across a wide range of industrial applications, from search and recommendation systems to generative tasks. Although scaling laws indicate that larger models generally…

Information Retrieval · Computer Science 2025-10-28 Kayhan Behdin, Ata Fatahibaarzi, Qingquan Song, Yun Dai, Aman Gupta, Zhipeng Wang, Shao Tang, Hejian Sang, Gregory Dexter, Sirou Zhu, Siyu Zhu, Tejas Dharamsi, Vignesh Kothapalli, Zhoutong Fu, Yihan Cao, Pin-Lun Hsu, Fedor Borisyuk, Natesh Pillai, Luke Simon, Rahul Mazumder

DaMoC: Efficiently Selecting the Optimal Large Language Model for Fine-tuning Domain Tasks Based on Data and Model Compression

Large language models (LLMs) excel in general tasks but struggle with domain-specific ones, requiring fine-tuning with specific data. With many open-source LLMs available, selecting the best model for fine-tuning downstream tasks is…

Computation and Language · Computer Science 2025-09-05 Wei Huang, Huang Wei, Yinggui Wang

Rethinking Compression: Reduced Order Modelling of Latent Features in Large Language Models

Due to the substantial scale of Large Language Models (LLMs), the direct application of conventional compression methodologies proves impractical. The computational demands associated with even minimal gradient updates present challenges,…

Machine Learning · Computer Science 2023-12-13 Arnav Chavan, Nahush Lele, Deepak Gupta

Compression Laws for Large Language Models

We introduce compression laws for large language models (LLMs). While recent scaling laws have sought to understand how LLMs scale with respect to model size, pre-training data, and computational resources, we focus on understanding how…

Computation and Language · Computer Science 2025-04-08 Ayan Sengupta, Siddhant Chaudhary, Tanmoy Chakraborty

Sentence-Anchored Gist Compression for Long-Context LLMs

This work investigates context compression for Large Language Models (LLMs) using learned compression tokens to reduce the memory and computational demands of processing long sequences. We demonstrate that pre-trained LLMs can be fine-tuned…

Computation and Language · Computer Science 2025-11-12 Dmitrii Tarasov, Elizaveta Goncharova, Kuznetsov Andrey

The Cost of Compression: Investigating the Impact of Compression on Parametric Knowledge in Language Models

Compressing large language models (LLMs), often consisting of billions of parameters, provides faster inference, smaller memory footprints, and enables local deployment. Two standard compression techniques are pruning and quantization, with…

Computation and Language · Computer Science 2023-12-05 Satya Sai Srinath Namburi, Makesh Sreedhar, Srinath Srinivasan, Frederic Sala

A Survey of Small Language Models

Small Language Models (SLMs) have become increasingly important due to their efficiency and performance to perform various language tasks with minimal computational resources, making them ideal for various settings including on-device,…

Computation and Language · Computer Science 2024-10-29 Chien Van Nguyen, Xuan Shen, Ryan Aponte, Yu Xia, Samyadeep Basu, Zhengmian Hu, Jian Chen, Mihir Parmar, Sasidhar Kunapuli, Joe Barrow, Junda Wu, Ashish Singh, Yu Wang, Jiuxiang Gu, Franck Dernoncourt, Nesreen K. Ahmed, Nedim Lipka, Ruiyi Zhang, Xiang Chen, Tong Yu, Sungchul Kim, Hanieh Deilamsalehy, Namyong Park, Mike Rimer, Zhehao Zhang, Huanrui Yang, Ryan A. Rossi, Thien Huu Nguyen