Related papers: A Model-Driven Lossless Compression Algorithm Resi…

200 papers

It is well-known in the field of lossless data compression that probabilistic next-symbol prediction can be used to compress sequences of symbols. Deep neural networks are able to capture rich dependencies in data, offering a powerful means…

Information Theory · Computer Science · 2026-03-10 · Aviv Adler, Jennifer Tang

As large language models (LLMs) continue to be deployed and utilized across domains, the volume of LLM-generated data is growing rapidly. This trend highlights the increasing importance of effective and lossless compression for such data in…

Machine Learning · Computer Science · 2025-05-13 · Yu Mao, Holger Pirk, Chun Jason Xue

In-context learning has established itself as an important learning paradigm for Large Language Models (LLMs). In this paper, we demonstrate that LLMs can learn encoding keys in-context and perform analysis directly on encoded…

Computation and Language · Computer Science · 2026-04-16 · Andresa Rodrigues de Campos, David Lee, Imry Kissos, Piyush Paritosh

Probabilistic next-token prediction trained using cross-entropy loss is the basis of most large language models. Given a sequence of previous values, next-token prediction assigns a probability to each possible next value in the vocabulary.…

Machine Learning · Statistics · 2025-05-19 · Jacob Trauger, Ambuj Tewari

With the rise in edge-computing devices, there has been an increasing demand to deploy energy and resource-efficient models. A large body of research has been devoted to developing methods that can reduce the size of the model considerably…

Computer Vision and Pattern Recognition · Computer Science · 2021-06-16 · Vinu Joseph, Shoaib Ahmed Siddiqui, Aditya Bhaskara, Ganesh Gopalakrishnan, Saurav Muralidharan, Michael Garland, Sheraz Ahmed, Andreas Dengel

We have recently witnessed that "Intelligence" and "Compression" are the two sides of the same coin, where the large language model (LLM) with unprecedented intelligence is a general-purpose lossless compressor for various data…

Computer Vision and Pattern Recognition · Computer Science · 2024-11-25 · Kecheng Chen, Pingping Zhang, Hui Liu, Jie Liu, Yibing Liu, Jiaxin Huang, Shiqi Wang, Hong Yan, Haoliang Li

Learning, prediction, and compression are intimately connected: a model that accurately predicts the next symbol in a sequence can be coupled with a source coder to compress that sequence near its information-theoretic limit. When tokenized…

Information Theory · Computer Science · 2026-05-05 · Vishnu Teja Kunde, Jean-Francois Chamberland, Krishna R. Narayanan, Jamison Ebert

Language prediction is constrained by informational entropy intrinsic to language, such that there exists a limit to how accurate any language model can become and equivalently a lower bound to language compression. The most efficient…

Computation and Language · Computer Science · 2025-11-14 · Benjamin L. Badger, Matthew Neligeorge

The rapid growth of digital data has heightened the demand for efficient lossless compression methods. However, existing algorithms exhibit trade-offs: some achieve high compression ratios, others excel in encoding or decoding speed, and…

Information Theory · Computer Science · 2025-10-01 · Md. Atiqur Rahman, MM Fazle Rabbi

We provide new estimates of an asymptotic upper bound on the entropy of English using the large language model LLaMA-7B as a predictor for the next token given a window of past tokens. This estimate is significantly smaller than currently…

Recent advancements in deep learning have driven significant progress in lossless image compression. With the emergence of Large Language Models (LLMs), preliminary attempts have been made to leverage the extensive prior knowledge embedded…

Image and Video Processing · Electrical Eng. & Systems · 2025-02-25 · Junhao Du, Chuqin Zhou, Ning Cao, Gang Chen, Yunuo Chen, Zhengxue Cheng, Li Song, Guo Lu, Wenjun Zhang

Due to the substantial scale of Large Language Models (LLMs), the direct application of conventional compression methodologies proves impractical. The computational demands associated with even minimal gradient updates present challenges,…

Machine Learning · Computer Science · 2023-12-13 · Arnav Chavan, Nahush Lele, Deepak Gupta

Text representation plays a critical role in tasks like clustering, retrieval, and other downstream applications. With the emergence of large language models (LLMs), there is increasing interest in harnessing their capabilities for this…

Computation and Language · Computer Science · 2025-12-25 · Yeqin Zhang, Yizheng Zhao, Chen Hu, Binxing Jiao, Daxin Jiang, Ruihang Miao, Cam-Tu Nguyen

How can we compress language models without sacrificing accuracy? The number of compression algorithms for language models is rapidly growing to benefit from remarkable advances of recent language models without side effects due to the…

Computation and Language · Computer Science · 2024-01-30 · Seungcheol Park, Jaehyeon Choi, Sojin Lee, U Kang

We conceptualize the process of understanding as information compression, and propose a method for ranking large language models (LLMs) based on lossless data compression. We demonstrate the equivalence of compression length under…

Artificial Intelligence · Computer Science · 2024-06-21 · Peijia Guo, Ziguang Li, Haibo Hu, Chao Huang, Ming Li, Rui Zhang

Although large language models (LLMs) have demonstrated their strong intelligence ability, the high demand for computation and storage hinders their practical application. To this end, many model compression techniques are proposed to…

Computation and Language · Computer Science · 2024-11-01 · Ge Yang, Changyi He, Jinyang Guo, Jianyu Wu, Yifu Ding, Aishan Liu, Haotong Qin, Pengliang Ji, Xianglong Liu

Despite the increasing prevalence of large language models (LLMs), we still have a limited understanding of how their representational spaces are structured. This limits our ability to interpret how and what they learn or relate them to…

Data compression continues to evolve, with traditional information theory methods being widely used for compressing text, images, and videos. Recently, there has been growing interest in leveraging Generative AI for predictive compression…

Information Theory · Computer Science · 2024-09-24 · Swathi Shree Narashiman, Nitin Chandrachoodan

Machine Learning models should ideally be compact and robust. Compactness provides efficiency and comprehensibility whereas robustness provides resilience. Both topics have been studied in recent years but in isolation. Here we present a…

Machine Learning · Computer Science · 2021-03-16 · Omri Armstrong, Ran Gilad-Bachrach

Learning and compression are driven by the common aim of identifying and exploiting statistical regularities in data, which opens the door for fertile collaboration between these areas. A promising group of compression techniques for…

Machine Learning · Computer Science · 2021-02-02 · Fernando E. Rosas, Pedro A. M. Mediano, Michael Gastpar