Related papers: Semantic Compression With Large Language Models

Extending Context Window of Large Language Models via Semantic Compression

Transformer-based Large Language Models (LLMs) often impose limitations on the length of the text input to ensure the generation of fluent and relevant responses. This constraint restricts their applicability in scenarios involving long…

Computation and Language · Computer Science 2023-12-18 Weizhi Fei, Xueyan Niu, Pingyi Zhou, Lu Hou, Bo Bai, Lei Deng, Wei Han

TCRA-LLM: Token Compression Retrieval Augmented Large Language Model for Inference Cost Reduction

Since ChatGPT released its API for public use, the number of applications built on top of commercial large language models (LLMs) has increased exponentially. One popular usage of such models is leveraging their in-context learning ability and…

Computation and Language · Computer Science 2023-10-26 Junyi Liu, Liangzhi Li, Tong Xiang, Bowen Wang, Yiming Qian

A Survey on Model Compression for Large Language Models

Large Language Models (LLMs) have transformed natural language processing tasks successfully. Yet, their large size and high computational needs pose challenges for practical use, especially in resource-limited settings. Model compression…

Computation and Language · Computer Science 2024-07-31 Xunyu Zhu, Jian Li, Yong Liu, Can Ma, Weiping Wang

PromptOptMe: Error-Aware Prompt Compression for LLM-based MT Evaluation Metrics

Evaluating the quality of machine-generated natural language content is a challenging task in Natural Language Processing (NLP). Recently, large language models (LLMs) like GPT-4 have been employed for this purpose, but they are…

Computation and Language · Computer Science 2024-12-23 Daniil Larionov, Steffen Eger

Enhancing Large Language Model Efficiency via Symbolic Compression: A Formal Approach Towards Interpretability

Large language models (LLMs) face significant token efficiency bottlenecks in code generation and logical reasoning tasks, a challenge that directly impacts inference cost and model interpretability. This paper proposes a formal framework…

Artificial Intelligence · Computer Science 2025-02-03 Lumen AI, Tengzhou No. 1 Middle School, Shihao Ji, Zihui Song, Fucheng Zhong, Jisen Jia, Zhaobo Wu, Zheyi Cao, Tianhao Xu

Scaling Up Efficient Small Language Models Serving and Deployment for Semantic Job Search

Large Language Models (LLMs) have demonstrated impressive quality when applied to predictive tasks such as relevance ranking and semantic search. However, deployment of such LLMs remains prohibitively expensive for industry applications…

Information Retrieval · Computer Science 2025-10-28 Kayhan Behdin, Qingquan Song, Sriram Vasudevan, Jian Sheng, Xiaojing Ma, Z Zhou, Chuanrui Zhu, Guoyao Li, Chanh Nguyen, Sayan Ghosh, Hejian Sang, Ata Fatahi Baarzi, Sundara Raman Ramachandran, Xiaoqing Wang, Qing Lan, Vinay Y S, Qi Guo, Caleb Johnson, Zhipeng Wang, Fedor Borisyuk

In Context Learning and Reasoning for Symbolic Regression with Large Language Models

Large Language Models (LLMs) are transformer-based machine learning models that have shown remarkable performance in tasks for which they were not explicitly trained. Here, we explore the potential of LLMs to perform symbolic regression --…

Computation and Language · Computer Science 2026-04-17 Samiha Sharlin, Tyler R. Josephson

SCOPE: A Generative Approach for LLM Prompt Compression

Prompt compression methods enhance the efficiency of Large Language Models (LLMs) and minimize the cost by reducing the length of input context. The goal of prompt compression is to shorten the LLM prompt while maintaining a high generation…

Computation and Language · Computer Science 2025-08-25 Tinghui Zhang, Yifan Wang, Daisy Zhe Wang

Compressing LLMs: The Truth is Rarely Pure and Never Simple

Despite their remarkable achievements, modern Large Language Models (LLMs) face exorbitant computational and memory footprints. Recently, several works have shown significant success in training-free and data-free compression (pruning and…

Computation and Language · Computer Science 2024-03-19 Ajay Jaiswal, Zhe Gan, Xianzhi Du, Bowen Zhang, Zhangyang Wang, Yinfei Yang

An Empirical Study on Prompt Compression for Large Language Models

Prompt engineering enables Large Language Models (LLMs) to perform a variety of tasks. However, lengthy prompts significantly increase computational complexity and economic costs. To address this issue, we study six prompt compression…

Computation and Language · Computer Science 2025-05-02 Zheng Zhang, Jinyi Li, Yihuai Lan, Xiang Wang, Hao Wang

LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models

Large language models (LLMs) have been applied in various applications due to their astonishing capabilities. With advancements in technologies such as chain-of-thought (CoT) prompting and in-context learning (ICL), the prompts fed to LLMs…

Computation and Language · Computer Science 2023-12-07 Huiqiang Jiang, Qianhui Wu, Chin-Yew Lin, Yuqing Yang, Lili Qiu

Large Language Models as Data Preprocessors

Large Language Models (LLMs), typified by OpenAI's GPT, have marked a significant advancement in artificial intelligence. Trained on vast amounts of text data, LLMs are capable of understanding and generating human-like text across a…

Artificial Intelligence · Computer Science 2024-10-29 Haochen Zhang, Yuyang Dong, Chuan Xiao, Masafumi Oyamada

Compressing Large Language Models with Automated Sub-Network Search

Large Language Models (LLMs) demonstrate exceptional reasoning abilities, enabling strong generalization across diverse tasks such as commonsense reasoning and instruction following. However, as LLMs scale, inference costs become…

Computation and Language · Computer Science 2025-02-06 Rhea Sanjay Sukthanker, Benedikt Staffler, Frank Hutter, Aaron Klein

Revisiting Data Compression with Language Modeling

In this report, we investigate the potential use of large language models (LLMs) in the task of data compression. Previous works have demonstrated promising results in applying LLMs towards compressing not only text, but also a wide range…

Computation and Language · Computer Science 2026-01-07 Chen-Han Tsai

LLM-Pruner: On the Structural Pruning of Large Language Models

Large language models (LLMs) have shown remarkable capabilities in language understanding and generation. However, such impressive capability typically comes with a substantial model size, which presents significant challenges in both the…

Computation and Language · Computer Science 2023-09-29 Xinyin Ma, Gongfan Fang, Xinchao Wang

Evaluating Large Language Models on Graphs: Performance Insights and Comparative Analysis

Large Language Models (LLMs) have garnered considerable interest within both academia and industry. Yet, the application of LLMs to graph data remains under-explored. In this study, we evaluate the capabilities of four LLMs in addressing…

Artificial Intelligence · Computer Science 2023-09-12 Chang Liu, Bo Wu

Lossless data compression by large models

Modern data compression methods are slowly reaching their limits after 80 years of research, millions of papers, and a wide range of applications. Yet, the extravagant 6G communication speed requirement raises a major open question for…

Information Theory · Computer Science 2025-05-01 Ziguang Li, Chao Huang, Xuliang Wang, Haibo Hu, Cole Wyeth, Dongbo Bu, Quan Yu, Wen Gao, Xingwu Liu, Ming Li

Can LLMs Augment Low-Resource Reading Comprehension Datasets? Opportunities and Challenges

Large Language Models (LLMs) have demonstrated impressive zero-shot performance on a wide range of NLP tasks, demonstrating the ability to reason and apply commonsense. A relevant application is to use them for creating high quality…

Computation and Language · Computer Science 2024-07-11 Vinay Samuel, Houda Aynaou, Arijit Ghosh Chowdhury, Karthik Venkat Ramanan, Aman Chadha

CMT: A Memory Compression Method for Continual Knowledge Learning of Large Language Models

Large Language Models (LLMs) need to adapt to the continuous changes in data, tasks, and user preferences. Due to their massive size and the high costs associated with training, LLMs are not suitable for frequent retraining. However,…

Computation and Language · Computer Science 2024-12-11 Dongfang Li, Zetian Sun, Xinshuo Hu, Baotian Hu, Min Zhang

Ranking LLMs by compression

We conceptualize the process of understanding as information compression, and propose a method for ranking large language models (LLMs) based on lossless data compression. We demonstrate the equivalence of compression length under…

Artificial Intelligence · Computer Science 2024-06-21 Peijia Guo, Ziguang Li, Haibo Hu, Chao Huang, Ming Li, Rui Zhang