Related papers: LongCodeZip: Compress Long Context for Code Langua…

Can Vision-Language Models Handle Long-Context Code? An Empirical Study on Visual Compression

Large Language Models (LLMs) struggle with long-context code due to window limitations. Existing textual code compression methods mitigate this via selective filtering but often disrupt dependency closure, causing semantic fragmentation. To…

Software Engineering · Computer Science 2026-02-03 Jianping Zhong, Guochang Li, Chen Zhi, Junxiao Han, Zhen Qin, Xinkui Zhao, Nan Wang, Shuiguang Deng, Jianwei Yin

CODEPROMPTZIP: Code-specific Prompt Compression for Retrieval-Augmented Generation in Coding Tasks with LMs

Retrieval-Augmented Generation (RAG) enhances coding tasks by incorporating retrieved code examples into prompts. However, lengthy prompts, often exceeding tens of thousands of tokens, introduce challenges related to limited context windows…

Software Engineering · Computer Science 2026-04-13 Pengfei He, Shaowei Wang, Tse-Hsun Chen

FineZip: Pushing the Limits of Large Language Models for Practical Lossless Text Compression

While the language modeling objective has been shown to be deeply connected with compression, it is surprising that modern LLMs are not employed in practical text compression systems. In this paper, we provide an in-depth analysis of neural…

Computation and Language · Computer Science 2024-09-26 Fazal Mittu, Yihuan Bu, Akshat Gupta, Ashok Devireddy, Alp Eren Ozdarendeli, Anant Singh, Gopala Anumanchipalli

CompLLM: Compression for Long Context Q&A

Large Language Models (LLMs) face significant computational challenges when processing long contexts due to the quadratic complexity of self-attention. While soft context compression methods, which map input text to smaller latent…

Computation and Language · Computer Science 2025-09-24 Gabriele Berton, Jayakrishnan Unnikrishnan, Son Tran, Mubarak Shah

Developing Adaptive Context Compression Techniques for Large Language Models (LLMs) in Long-Running Interactions

Large Language Models (LLMs) often experience performance degradation during long-running interactions due to increasing context length, memory saturation, and computational overhead. This paper presents an adaptive context compression…

Computer Vision and Pattern Recognition · Computer Science 2026-04-01 Payal Fofadiya, Sunil Tiwari

On the Effectiveness of Context Compression for Repository-Level Tasks: An Empirical Investigation

Repository-level code intelligence tasks require large language models (LLMs) to process long, multi-file contexts. Such inputs introduce three challenges: crucial context can be obscured by noise, truncated due to limited windows, and…

Software Engineering · Computer Science 2026-04-16 Jia Feng, Zhanyue Qin, Cuiyun Gao, Ruiqi Wang, Chaozheng Wang, Yingwei Ma, Xiaoyuan Xie

FocusLLM: Precise Understanding of Long Context by Dynamic Condensing

Empowering LLMs with the ability to precisely understand long contexts is crucial for many downstream applications. However, handling long contexts with conventional transformer architecture requires substantial training and inference…

Computation and Language · Computer Science 2024-12-24 Zhenyu Li, Yike Zhang, Tengyu Pan, Yutao Sun, Zhichao Duan, Junjie Fang, Rong Han, Zixuan Wang, Jianyong Wang

LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models

Large language models (LLMs) face significant challenges in handling long-context tasks because of their limited effective context window size during pretraining, which restricts their ability to generalize over extended sequences.…

Computation and Language · Computer Science 2024-09-05 Zhiyuan Hu, Yuliang Liu, Jinman Zhao, Suyuchen Wang, Yan Wang, Wei Shen, Qing Gu, Anh Tuan Luu, See-Kiong Ng, Zhiwei Jiang, Bryan Hooi

Less is More: DocString Compression in Code Generation

The widespread use of Large Language Models (LLMs) in software engineering has intensified the need for improved model and resource efficiency. In particular, for neural code generation, LLMs are used to translate function/method signature…

Software Engineering · Computer Science 2025-06-12 Guang Yang, Yu Zhou, Wei Cheng, Xiangyu Zhang, Xiang Chen, Terry Yue Zhuo, Ke Liu, Xin Zhou, David Lo, Taolue Chen

Extending Context Window of Large Language Models via Semantic Compression

Transformer-based Large Language Models (LLMs) often impose limitations on the length of the text input to ensure the generation of fluent and relevant responses. This constraint restricts their applicability in scenarios involving long…

Computation and Language · Computer Science 2023-12-18 Weizhi Fei, Xueyan Niu, Pingyi Zhou, Lu Hou, Bo Bai, Lei Deng, Wei Han

LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression

In long context scenarios, large language models (LLMs) face three main challenges: higher computational cost, performance reduction, and position bias. Research indicates that LLM performance hinges on the density and position of key…

Computation and Language · Computer Science 2024-08-13 Huiqiang Jiang, Qianhui Wu, Xufang Luo, Dongsheng Li, Chin-Yew Lin, Yuqing Yang, Lili Qiu

Dynamic Compressing Prompts for Efficient Inference of Large Language Models

Large Language Models (LLMs) have shown outstanding performance across a variety of tasks, partly due to advanced prompting techniques. However, these techniques often require lengthy prompts, which increase computational costs and can…

Computation and Language · Computer Science 2025-04-16 Jinwu Hu, Wei Zhang, Yufeng Wang, Yu Hu, Bin Xiao, Mingkui Tan, Qing Du

Adapting LLMs for Efficient Context Processing through Soft Prompt Compression

The rapid advancement of Large Language Models (LLMs) has inaugurated a transformative epoch in natural language processing, fostering unprecedented proficiency in text generation, comprehension, and contextual scrutiny. Nevertheless,…

Machine Learning · Computer Science 2024-04-22 Cangqing Wang, Yutian Yang, Ruisi Li, Dan Sun, Ruicong Cai, Yuzhu Zhang, Chengqian Fu, Lillian Floyd

Perception Compressor: A Training-Free Prompt Compression Framework in Long Context Scenarios

Large language models (LLMs) demonstrate exceptional capabilities in various scenarios. However, they suffer from much redundant information and are sensitive to the position of key information in long context scenarios. To address these…

Computation and Language · Computer Science 2025-02-11 Jiwei Tang, Jin Xu, Tingwei Lu, Zhicheng Zhang, Yiming Zhao, Lin Hai, Hai-Tao Zheng

zip2zip: Inference-Time Adaptive Tokenization via Online Compression

Tokenization efficiency plays a critical role in the performance and cost of large language models (LLMs), yet most models rely on static tokenizers optimized on general-purpose corpora. These tokenizers' fixed vocabularies often fail to…

Computation and Language · Computer Science 2025-10-27 Saibo Geng, Nathan Ranchin, Yunzhen Yao, Maxime Peyrard, Chris Wendler, Michael Gastpar, Robert West

PRISM: Efficient Long-Range Reasoning With Short-Context LLMs

Long-range tasks demand reasoning over long inputs. However, existing solutions are limited, e.g., long-context models require large compute budgets, parameter-efficient fine-tuning (PEFT) needs training data, and retrieval-augmented…

Artificial Intelligence · Computer Science 2025-08-26 Dulhan Jayalath, James Bradley Wendt, Nicholas Monath, Sandeep Tata, Beliz Gunel

Fix the Structural Bottleneck: Context Compression via Explicit Information Transmission

Long-context LLM agents often struggle with growing token, memory, and latency costs, making efficient context compression essential for practical deployment. Existing LLM-as-a-compressor methods remain noticeably inferior to using the full…

Computation and Language · Computer Science 2026-05-22 Jiangnan Ye, Hanqi Yan, Zhenyi Shen, Heng Chang, Ye Mao, Yulan He

An Empirical Study on Prompt Compression for Large Language Models

Prompt engineering enables Large Language Models (LLMs) to perform a variety of tasks. However, lengthy prompts significantly increase computational complexity and economic costs. To address this issue, we study six prompt compression…

Computation and Language · Computer Science 2025-05-02 Zheng Zhang, Jinyi Li, Yihuai Lan, Xiang Wang, Hao Wang

SemanticZip: A Pilot Framework for Lossy Text Compression with LLMs as Semantic Decompressors

Text compression for large language model (LLM) systems is usually framed as token deletion, retrieval, summarization, or exact reconstruction. We study a more aggressive but explicitly lossy setting: compress text into compact codes that…

Machine Learning · Computer Science 2026-05-26 Natalia Trukhina, Vadim Vashkelis

Compressing Lengthy Context With UltraGist

Compressing lengthy context is a critical but technically challenging problem. In this paper, we propose a new method called UltraGist, which is distinguished for its high-quality compression of lengthy context due to the innovative design…

Computation and Language · Computer Science 2024-10-14 Peitian Zhang, Zheng Liu, Shitao Xiao, Ninglu Shao, Qiwei Ye, Zhicheng Dou