Related papers: Basis Sharing: Cross-Layer Parameter Sharing for L…

SVD-LLM V2: Optimizing Singular Value Truncation for Large Language Model Compression

Despite significant advancements, the practical deployment of Large Language Models (LLMs) is often hampered by their immense sizes, highlighting the need for effective compression techniques. Singular Value Decomposition (SVD) is a…

Computation and Language · Computer Science 2025-03-18 Xin Wang, Samiul Alam, Zhongwei Wan, Hui Shen, Mi Zhang

Layer-wise dynamic rank for compressing large language models

Large language models (LLMs) have rapidly scaled in size, bringing severe memory and computational challenges that hinder their deployment. Singular Value Decomposition (SVD)-based compression has emerged as an appealing post-training…

Machine Learning · Computer Science 2025-10-07 Zhendong Mi, Bian Sun, Grace Li Zhang, Shaoyi Huang

Rethinking Compression: Reduced Order Modelling of Latent Features in Large Language Models

Due to the substantial scale of Large Language Models (LLMs), the direct application of conventional compression methodologies proves impractical. The computational demands associated with even minimal gradient updates present challenges,…

Machine Learning · Computer Science 2023-12-13 Arnav Chavan, Nahush Lele, Deepak Gupta

CommonKV: Compressing KV Cache with Cross-layer Parameter Sharing

Large Language Models (LLMs) confront significant memory challenges due to the escalating KV cache with increasing sequence length. As a crucial technique, existing cross-layer KV cache sharing methods either necessitate modified model…

Machine Learning · Computer Science 2025-08-25 Yixuan Wang, Haoyu Qiao, Lujun Li, Qingfu Zhu, Wanxiang Che

Large Language Model Compression via the Nested Activation-Aware Decomposition

In this paper, we tackle the critical challenge of compressing large language models (LLMs) to facilitate their practical deployment and broader adoption. We introduce a novel post-training compression paradigm that focuses on low-rank…

Machine Learning · Computer Science 2025-03-24 Jun Lu, Tianyi Xu, Bill Ding, David Li, Yu Kang

ERC-SVD: Error-Controlled SVD for Large Language Model Compression

Large language models (LLMs) have demonstrated impressive capabilities in a wide range of downstream natural language processing tasks. Nevertheless, their considerable sizes and memory demands hinder practical deployment, underscoring the…

Computation and Language · Computer Science 2026-03-17 Haolei Bai, Siyong Jian, Tuo Liang, Yu Yin, Huan Wang

SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression

The advancements in Large Language Models (LLMs) have been hindered by their substantial sizes, which necessitates LLM compression methods for practical deployment. Singular Value Decomposition (SVD) offers a promising solution for LLM…

Computation and Language · Computer Science 2025-03-18 Xin Wang, Yu Zheng, Zhongwei Wan, Mi Zhang

Head-wise Shareable Attention for Large Language Models

Large Language Models (LLMs) suffer from a huge number of parameters, which restricts their deployment on edge devices. Weight sharing is one promising solution that encourages weight reuse, effectively reducing memory usage with less…

Computation and Language · Computer Science 2024-10-25 Zouying Cao, Yifei Yang, Hai Zhao

ASVD: Activation-aware Singular Value Decomposition for Compressing Large Language Models

In this paper, we introduce a new post-training compression paradigm for Large Language Models (LLMs) to facilitate their wider adoption. We delve into LLM weight low-rank decomposition, and find that the challenges of this task stem from…

Computation and Language · Computer Science 2025-08-29 Zhihang Yuan, Yuzhang Shang, Yue Song, Dawei Yang, Qiang Wu, Yan Yan, Guangyu Sun

Numerical Optimizations for Weighted Low-rank Estimation on Language Model

Singular value decomposition (SVD) is one of the most popular compression methods that approximate a target matrix with smaller matrices. However, standard SVD treats the parameters within the matrix with equal importance, which is a simple…

Computation and Language · Computer Science 2022-12-19 Ting Hua, Yen-Chang Hsu, Felicity Wang, Qian Lou, Yilin Shen, Hongxia Jin

Beyond Uniform SVD: Dual-Level Optimization across Columns and Modules for LLM Compression

Low-rank decomposition, particularly Singular Value Decomposition (SVD), is a pivotal technique for mitigating the storage and computational demands of Large Language Models (LLMs). However, prevalent SVD-based approaches overlook the…

Machine Learning · Computer Science 2026-01-15 Lin Xv, Xian Gao, Ting Li, Yuzhuo Fu

Optimizing Singular Spectrum for Large Language Model Compression

Large language models (LLMs) have demonstrated remarkable capabilities, yet prohibitive parameter complexity often hinders their deployment. Existing singular value decomposition (SVD) based compression methods simply deem singular values…

Computation and Language · Computer Science 2025-02-24 Dengjie Li, Tiancheng Shen, Yao Zhou, Baisong Yang, Zhongying Liu, Masheng Yang, Bernard Ghanem, Yibo Yang, Yujie Zhong, Ming-Hsuan Yang

CALR: Corrective Adaptive Low-Rank Decomposition for Efficient Large Language Model Layer Compression

Large Language Models (LLMs) present significant deployment challenges due to their immense size and computational requirements. Model compression techniques are essential for making these models practical for resource-constrained…

Machine Learning · Computer Science 2025-08-27 Muchammad Daniyal Kautsar, Afra Majida Hariono, Widyawan, Syukron Abu Ishaq Alfarozi, Kuntpong Woraratpanya

Basis Selection: Low-Rank Decomposition of Pretrained Large Language Models for Target Applications

Large language models (LLMs) significantly enhance the performance of various applications, but they are computationally intensive and energy-demanding. This makes it challenging to deploy them on devices with limited resources, such as…

Machine Learning · Computer Science 2025-12-22 Yang Li, Daniel Agyei Asante, Changsheng Zhao, Ernie Chang, Yangyang Shi, Vikas Chandra

Language model compression with weighted low-rank factorization

Factorizing a large matrix into small matrices is a popular strategy for model compression. Singular value decomposition (SVD) plays a vital role in this compression strategy, approximating a learned matrix with fewer parameters. However,…

Machine Learning · Computer Science 2022-07-04 Yen-Chang Hsu, Ting Hua, Sungen Chang, Qian Lou, Yilin Shen, Hongxia Jin

DipSVD: Dual-importance Protected SVD for Efficient LLM Compression

The ever-increasing computational demands and deployment costs of large language models (LLMs) have spurred numerous compressing methods. Compared to quantization and unstructured pruning, SVD compression offers superior hardware…

Machine Learning · Computer Science 2025-06-26 Xuan Ding, Rui Sun, Yunjian Zhang, Xiu Yan, Yueqi Zhou, Kaihao Huang, Suzhong Fu, Chuanlong Xie, Yao Zhu

SWSC: Shared Weight for Similar Channel in LLM

Large language models (LLMs) have spurred development in multiple industries. However, the growing number of their parameters brings substantial storage and computing burdens, making it essential to explore model compression techniques for…

Machine Learning · Computer Science 2025-01-16 Binrui Zeng, Yongtao Tang, Xiaodong Liu, Xiaopeng Li

Globally optimized SVD compression of LLMs via Fermi-function-based rank selection and gauge fixing

Large Language Models (LLMs) are very demanding in terms of their computational resources. Low-rank decomposition of LLM weights, e.g. via Singular Value Decomposition (SVD), is a promising approach for LLM compression, but presents…

Machine Learning · Computer Science 2025-12-04 Roman Rausch, David Jansen, Sukhbinder Singh, Román Orús

ARA: Adaptive Rank Allocation for Efficient Large Language Model SVD Compression

In the field of large language model (LLM) compression, singular value decomposition (SVD) is a widely studied and adopted low-rank decomposition technique. Since SVD operates exclusively on linear modules, and these modules in LLMs are…

Machine Learning · Computer Science 2025-10-23 Lin Xv, Jingsheng Gao, Xian Gao, Ting Liu, Yuzhuo Fu

Why Lift so Heavy? Slimming Large Language Models by Cutting Off the Layers

Large Language Models (LLMs) possess outstanding capabilities in addressing various natural language processing (NLP) tasks. However, the sheer size of these models poses challenges in terms of storage, training and inference due to the…

Computation and Language · Computer Science 2025-04-18 Shuzhou Yuan, Ercong Nie, Bolei Ma, Michael Färber