Related papers: Contextual Compression Encoding for Large Language…

CCF: A Context Compression Framework for Efficient Long-Sequence Language Modeling

Scaling language models to longer contexts is essential for capturing rich dependencies across extended discourse. However, naïve context extension imposes significant computational and memory burdens, often resulting in inefficiencies…

Computation and Language · Computer Science 2026-02-03 Wenhao Li, Bangcheng Sun, Weihao Ye, Tianyi Zhang, Daohai Yu, Fei Chao, Rongrong Ji

Prompt Compression with Context-Aware Sentence Encoding for Fast and Improved LLM Inference

Large language models (LLMs) have triggered a new stream of research focusing on compressing the context length to reduce the computational cost while ensuring the retention of helpful information for LLMs to answer the given question.…

Computation and Language · Computer Science 2024-12-20 Barys Liskavets, Maxim Ushakov, Shuvendu Roy, Mark Klibanov, Ali Etemad, Shane Luke

Enlarging Context with Low Cost: Efficient Arithmetic Coding with Trimmed Convolution

Arithmetic coding is an essential class of coding techniques. One key issue of arithmetic encoding method is to predict the probability of the current coding symbol from its context, i.e., the preceding encoded symbols, which usually can be…

Computer Vision and Pattern Recognition · Computer Science 2018-07-04 Mu Li, Shuhang Gu, David Zhang, Wangmeng Zuo

Context Adaptive Extended Chain Coding for Semantic Map Compression

Semantic maps are increasingly utilized in areas such as robotics, autonomous systems, and extended reality, motivating the investigation of efficient compression methods that preserve structured semantic information. This paper studies…

Image and Video Processing · Electrical Eng. & Systems 2026-03-30 Runyu Yang, Junqi Liao, Hyomin Choi, Fabien Racapé, Ivan V. Bajić

PACE: Post-Causal Entropy Modeling for Learned LiDAR Point Cloud Compression

LiDAR point cloud compression is vital for autonomous systems to handle massive data from high-resolution sensors. While learned entropy modeling built upon octree structures yields high compression gains, it faces two critical bottlenecks:…

Computer Vision and Pattern Recognition · Computer Science 2026-05-05 Jiahao Zhu, Kang You, Dandan Ding, Zhan Ma

Clustering the Sketch: A Novel Approach to Embedding Table Compression

Embedding tables are used by machine learning systems to work with categorical features. In modern Recommendation Systems, these tables can be very large, necessitating the development of new methods for fitting them in memory, even during…

Machine Learning · Computer Science 2023-10-24 Henry Ling-Hei Tsang, Thomas Dybdahl Ahle

Architectural Fusion Through Contextual Partitioning in Large Language Models: A Novel Approach to Parameterized Knowledge Integration

Contextual Partitioning introduces an innovative approach to enhancing the architectural design of large-scale computational models through the dynamic segmentation of parameters into context-aware regions. This methodology emphasizes the…

Computation and Language · Computer Science 2025-08-11 Offa Kingsleigh, Alfred Abercrombie, David Woolstencroft, Beorhtric Meadowcroft, Marcus Irvin

Efficient and Effective Context-Based Convolutional Entropy Modeling for Image Compression

Precise estimation of the probabilistic structure of natural images plays an essential role in image compression. Despite the recent remarkable success of end-to-end optimized image compression, the latent codes are usually assumed to be…

Image and Video Processing · Electrical Eng. & Systems 2020-06-24 Mu Li, Kede Ma, Jane You, David Zhang, Wangmeng Zuo

Contextual Reinforcement in Multimodal Token Compression for Large Language Models

Effective token compression remains a critical challenge for scaling models to handle increasingly complex and diverse datasets. A novel mechanism based on contextual reinforcement is introduced, dynamically adjusting token importance…

Computation and Language · Computer Science 2025-08-11 Naderdel Piero, Zacharias Cromwell, Nathaniel Wainwright, Matthias Nethercott

SEE: Sememe Entanglement Encoding for Transformer-bases Models Compression

Transformer-based large language models exhibit groundbreaking capabilities, but their storage and computational costs are prohibitively high, limiting their application in resource-constrained scenarios. An effective approach is to…

Machine Learning · Computer Science 2024-12-18 Jing Zhang, Shuzhen Sun, Peng Zhang, Guangxing Cao, Hui Gao, Xindian Ma, Nan Xu, Yuexian Hou

Contextually Structured Token Dependency Encoding for Large Language Models

Token representation strategies within large-scale neural architectures often rely on contextually refined embeddings, yet conventional approaches seldom encode structured relationships explicitly within token interactions. Self-attention…

Computation and Language · Computer Science 2025-03-27 James Blades, Frederick Somerfield, William Langley, Susan Everingham, Maurice Witherington

A Comprehensive Survey of Compression Algorithms for Language Models

How can we compress language models without sacrificing accuracy? The number of compression algorithms for language models is rapidly growing to benefit from remarkable advances of recent language models without side effects due to the…

Computation and Language · Computer Science 2024-01-30 Seungcheol Park, Jaehyeon Choi, Sojin Lee, U Kang

From Context to EDUs: Faithful and Structured Context Compression via Elementary Discourse Unit Decomposition

Managing extensive context remains a critical bottleneck for Large Language Models (LLMs), particularly in applications like long-document question answering and autonomous agents where lengthy inputs incur high computational costs and…

Computation and Language · Computer Science 2026-01-06 Yiqing Zhou, Yu Lei, Shuzheng Si, Qingyan Sun, Wei Wang, Yifei Wu, Hao Wen, Gang Chen, Fanchao Qi, Maosong Sun

AdmTree: Compressing Lengthy Context with Adaptive Semantic Trees

The quadratic complexity of self-attention constrains Large Language Models (LLMs) in processing long contexts, a capability essential for many advanced applications. Context compression aims to alleviate this computational bottleneck while…

Computation and Language · Computer Science 2025-12-05 Yangning Li, Shaoshen Chen, Yinghui Li, Yankai Chen, Hai-Tao Zheng, Hui Wang, Wenhao Jiang, Philip S. Yu

Recurrent Context Compression: Efficiently Expanding the Context Window of LLM

To extend the context length of Transformer-based large language models (LLMs) and improve comprehension capabilities, we often face limitations due to computational resources and bounded memory storage capacity. This work introduces a…

Computation and Language · Computer Science 2024-06-11 Chensen Huang, Guibo Zhu, Xuepeng Wang, Yifei Luo, Guojing Ge, Haoran Chen, Dong Yi, Jinqiao Wang

On the Effectiveness of Context Compression for Repository-Level Tasks: An Empirical Investigation

Repository-level code intelligence tasks require large language models (LLMs) to process long, multi-file contexts. Such inputs introduce three challenges: crucial context can be obscured by noise, truncated due to limited windows, and…

Software Engineering · Computer Science 2026-04-16 Jia Feng, Zhanyue Qin, Cuiyun Gao, Ruiqi Wang, Chaozheng Wang, Yingwei Ma, Xiaoyuan Xie

Context Cascade Compression: Exploring the Upper Limits of Text Compression

Million-level token inputs in long-context tasks pose significant computational and memory challenges for Large Language Models (LLMs). Recently, DeepSeek-OCR conducted research into the feasibility of Contexts Optical Compression and…

Computation and Language · Computer Science 2025-12-04 Fanfan Liu, Haibo Qiu

Causal Contextual Prediction for Learned Image Compression

Over the past several years, we have witnessed impressive progress in the field of learned image compression. Recent learned image codecs are commonly based on autoencoders, that first encode an image into low-dimensional latent…

Computer Vision and Pattern Recognition · Computer Science 2021-11-02 Zongyu Guo, Zhizheng Zhang, Runsen Feng, Zhibo Chen

Sequence Shortening for Context-Aware Machine Translation

Context-aware Machine Translation aims to improve translations of sentences by incorporating surrounding sentences as context. Towards this task, two main architectures have been applied, namely single-encoder (based on concatenation) and…

Computation and Language · Computer Science 2024-02-05 Paweł Mąka, Yusuf Can Semerci, Jan Scholtes, Gerasimos Spanakis

Compressed-Sensing-Guided, Inference-Aware Structured Reduction for Large Language Models

Large language models deliver strong generative performance but at the cost of massive parameter counts, memory use, and decoding latency. Prior work has shown that pruning and structured sparsity can preserve accuracy under substantial…

Computation and Language · Computer Science 2026-04-17 Andrew Kiruluta