Related papers: Transformers from Compressed Representations

200 papers

Recurrent neural networks have a strong inductive bias towards learning temporally compressed representations, as the entire history of a sequence is represented by a single vector. By contrast, Transformers have little inductive bias…

Machine Learning · Computer Science 2022-10-26 Aniket Didolkar, Kshitij Gupta, Anirudh Goyal, Nitesh B. Gundavarapu, Alex Lamb, Nan Rosemary Ke, Yoshua Bengio

Text encoding is one of the most important steps in Natural Language Processing (NLP). It has been done well by the self-attention mechanism in the current state-of-the-art Transformer encoder, which has brought about significant…

Computation and Language · Computer Science 2021-02-12 Zuchao Li, Zhuosheng Zhang, Hai Zhao, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita

Transformers have demonstrated remarkable success across vision, language, and video. Yet, increasing task complexity has led to larger models and more tokens, raising the quadratic cost of self-attention and the overhead of GPU memory…

Computer Vision and Pattern Recognition · Computer Science 2025-08-04 Joonmyung Choi, Sanghyeok Lee, Byungoh Ko, Eunseo Kim, Jihyung Kil, Hyunwoo J. Kim

Transformer-based large language models exhibit groundbreaking capabilities, but their storage and computational costs are prohibitively high, limiting their application in resource-constrained scenarios. An effective approach is to…

Machine Learning · Computer Science 2024-12-18 Jing Zhang, Shuzhen Sun, Peng Zhang, Guangxing Cao, Hui Gao, Xindian Ma, Nan Xu, Yuexian Hou

Text compression shrinks textual data while keeping crucial information, eradicating constraints on storage, bandwidth, and computational efficacy. The integration of lossless compression techniques with transformer-based text decompression…

Computation and Language · Computer Science 2024-12-23 Chowdhury Mofizur Rahman, Mahbub E Sobhani, Anika Tasnim Rodela, Swakkhar Shatabda

In this paper, we contend that a natural objective of representation learning is to compress and transform the distribution of the data, say sets of tokens, towards a low-dimensional Gaussian mixture supported on incoherent subspaces. The…

Machine Learning · Computer Science 2024-09-09 Yaodong Yu, Sam Buchanan, Druv Pai, Tianzhe Chu, Ziyang Wu, Shengbang Tong, Hao Bai, Yuexiang Zhai, Benjamin D. Haeffele, Yi Ma

Inspired by recent work on compression with and for young humans, the success of transform-based approaches to information processing, and the rise of powerful language-based AI, we propose textual transform coding. It shares some of…

Information Theory · Computer Science 2023-05-04 Tsachy Weissman

Fine-tuned transformer models have shown superior performances in many natural language tasks. However, the large model size prohibits deploying high-performance transformer models on resource-constrained devices. This paper proposes a…

Computation and Language · Computer Science 2024-10-01 Zi Yang, Samridhi Choudhary, Siegfried Kunzmann, Zheng Zhang

High-energy, large-scale particle colliders in nuclear and high-energy physics generate data at extraordinary rates, reaching up to 1 terabyte and several petabytes per second, respectively. The development of real-time, high-throughput…

Artificial Intelligence · Computer Science 2024-12-03 Xihaier Luo, Samuel Lurvey, Yi Huang, Yihui Ren, Jin Huang, Byung-Jun Yoon

In contrast to RNNs, which compress their history into a single hidden state, Transformers can attend to all past tokens directly. However, standard Transformers rely solely on the hidden state from the previous layer to represent the…

Machine Learning · Computer Science 2025-05-29 Gleb Gerasimov, Yaroslav Aksenov, Nikita Balagansky, Viacheslav Sinii, Daniil Gavrilov

Efficient lossless compression is essential for minimizing storage costs and transmission overhead while preserving data integrity. Traditional compression techniques, such as dictionary-based and statistical methods, often struggle to…

Artificial Intelligence · Computer Science 2026-02-13 Mahdi Khodabandeh, Ghazal Shabani, Arash Yousefi Jordehi, Seyed Abolghasem Mirroshandel

Compactly representing the visual signals is of fundamental importance in various image/video-centered applications. Although numerous approaches were developed for improving the image and video coding performance by removing the…

Image and Video Processing · Electrical Eng. & Systems 2020-08-14 Rongqun Lin, Linwei Zhu, Shiqi Wang, Sam Kwong

Many real-world datasets are represented as tensors, i.e., multi-dimensional arrays of numerical values. Storing them without compression often requires substantial space, which grows exponentially with the order. While many tensor…

Machine Learning · Computer Science 2023-09-21 Taehyung Kwon, Jihoon Ko, Jinhong Jung, Kijung Shin

Modern visual generative models acquire rich visual knowledge through large-scale training, yet existing visual representations (such as pixels, latents, or tokens) remain external to the model and cannot directly exploit this knowledge for…

Machine Learning · Computer Science 2026-05-25 Zongyu Guo, Jiajun He, Zhaoyang Jia, Xiaoyi Zhang, Jiahao Li, Xiao Li, Bin Li, José Miguel Hernández-Lobato, Yan Lu

Token compression is essential for reducing the computational and memory requirements of transformer models, enabling their deployment in resource-constrained environments. In this work, we propose an efficient and hardware-compatible token…

Computer Vision and Pattern Recognition · Computer Science 2025-04-01 Junzhu Mao, Yang Shen, Jinyang Guo, Yazhou Yao, Xiansheng Hua

Compressed sensing is a signal processing method that acquires data directly in a compressed form. This allows one to make fewer measurements than what was considered necessary to record a signal, enabling faster or more precise measurement…

Statistical Mechanics · Physics 2012-08-20 Florent Krzakala, Marc Mézard, François Sausset, Yifan Sun, Lenka Zdeborová

We propose an end-to-end learned image compression codec wherein the analysis transform is jointly trained with an object classification task. This study affirms that the compressed latent representation can predict human perceptual…

Computer Vision and Pattern Recognition · Computer Science 2024-01-17 Chen-Hsiu Huang, Ja-Ling Wu

Neural networks using numerous text data have been successfully applied to a variety of tasks. While massive text data is usually compressed using techniques such as grammar compression, almost all of the previous machine learning methods…

Machine Learning · Statistics 2020-03-02 Yoichi Sasaki, Kosuke Akimoto, Takanori Maehara

Compressed sensing techniques enable efficient acquisition and recovery of sparse, high-dimensional data signals via low-dimensional projections. In this work, we propose Uncertainty Autoencoders, a learning framework for unsupervised…

Machine Learning · Statistics 2019-04-15 Aditya Grover, Stefano Ermon

Transformers predict over a representation of a sequence. The same data can be written as bytes, characters, or subword tokens, and these representations may be lossless. Yet, under a fixed context window, they need not expose the same…

Machine Learning · Computer Science 2026-05-14 Amirmehdi Jafari Fesharaki, Mohammadamin Rami, Aslan Tchamkerten